Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
On Fri, Jun 8, 2018 at 12:37 PM, hiren panchasara < hi...@strugglingcoder.info> wrote: > ps: I know we are not killing the default stack as of yet and just > stopping active maintenance of it but just wanted to raise these > (probably obvious) points. We absolutely are not stopping active maintenance of the "default" (soon to be renamed "base") TCP stack. RACK is an alternative which might be useful for some people. But, there is no suggestion that we should in any way change the priority we give to maintaining the base TCP stack anytime in the near future. Jonathan ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
On 06/07/18 at 08:07P, Matthew Macy wrote: > > > > Okay. I believe there might be situations where we may want to still > > keep the 'default' stack alive. I know Windows doesn't yet use RACK when > > rtt is lesser than 10ms (or something like that), as an example. > > > > Is there any reason RACK wouldn't work for tracking 10us RTTs? If we > know we know the peer doesn't do delack or have enough data in flight > and the other stack doesn't have broken LRO, could we just use this in > lieu of high resolution timestamps? I believe the issue is both ends having fine-grained timers for RACK to be able to do the right thing. If timer resolution ends up being coarser than rtt, just depending on rack may be problematic. I know 10ms is doable on most systems but just using this as an example that we probably want non-rack ('default') stack to be around a little longer and possibly with the enhancements that we can easily extract out to be shared among all the stacks. Also RACK needing pacing which requires but more CPU so question for us could be that do we want to keep the 'default' stack around for machines that can't take that extra CPU hit. SACK is inefficient in default stack and PRR could be super useful as "psuedo" pacing mechanism and could help recover faster but at the end of the day it all depends on someone with time/energy/motivation to maintain the 'default' stack with all shiny things that appear in non-default stacks. cheers, Hiren ps: I know we are not killing the default stack as of yet and just stopping active maintenance of it but just wanted to raise these (probably obvious) points. pgpd1Qw8EoOpa.pgp Description: PGP signature
Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
> > Okay. I believe there might be situations where we may want to still > keep the 'default' stack alive. I know Windows doesn't yet use RACK when > rtt is lesser than 10ms (or something like that), as an example. > Is there any reason RACK wouldn't work for tracking 10us RTTs? If we know we know the peer doesn't do delack or have enough data in flight and the other stack doesn't have broken LRO, could we just use this in lieu of high resolution timestamps? -M ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
On 06/07/18 at 08:58P, Randall Stewart wrote: > > > > On Jun 7, 2018, at 6:01 PM, hiren panchasara > > wrote: > > > > On 06/07/18 at 06:18P, Randall Stewart wrote: > >> Author: rrs > >> Date: Thu Jun 7 18:18:13 2018 > >> New Revision: 334804 > >> URL: https://svnweb.freebsd.org/changeset/base/334804 > >> > >> Log: > >> This commit brings in a new refactored TCP stack called Rack. > >> Rack includes the following features: > >> - A different SACK processing scheme (the old sack structures are not > >> used). > >> - RACK (Recent acknowledgment) where counting dup-acks is no longer done > >> instead time is used to knwo when to retransmit. (see the I-D) > >> - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt > >> to try not to take a retransmit time-out. (see the I-D) > >> - Burst mitigation using TCPHTPS > >> - PRR (partial rate reduction) see the RFC. > >> > >> Once built into your kernel, you can select this stack by either > >> socket option with the name of the stack is "rack" or by setting > >> the global sysctl so the default is rack. > >> > >> Note that any connection that does not support SACK will be kicked > >> back to the "default" base FreeBSD stack (currently known as "default"). > >> > >> To build this into your kernel you will need to enable in your > >> kernel: > >> makeoptions WITH_EXTRA_TCP_STACKS=1 > >> options TCPHPTS > >> > >> Sponsored by: Netflix Inc. > >> Differential Revision:https://reviews.freebsd.org/D15525 > >> > >> Added: > >> head/sys/modules/tcp/rack/ > >> head/sys/modules/tcp/rack/Makefile (contents, props changed) > >> head/sys/netinet/tcp_stacks/rack.c (contents, props changed) > >> head/sys/netinet/tcp_stacks/rack_bbr_common.h (contents, props changed) > >> head/sys/netinet/tcp_stacks/sack_filter.c (contents, props changed) > >> head/sys/netinet/tcp_stacks/sack_filter.h (contents, props changed) > >> head/sys/netinet/tcp_stacks/tcp_rack.h (contents, props changed) > >> Modified: > >> head/sys/kern/uipc_sockbuf.c > >> head/sys/modules/tcp/Makefile > >> head/sys/netinet/tcp.h > >> head/sys/netinet/tcp_log_buf.h > >> head/sys/netinet/tcp_output.c > >> head/sys/netinet/tcp_stacks/fastpath.c > >> head/sys/netinet/tcp_timer.c > >> head/sys/netinet/tcp_timer.h > >> head/sys/netinet/tcp_var.h > >> head/sys/sys/mbuf.h > >> head/sys/sys/queue.h > >> head/sys/sys/sockbuf.h > >> head/sys/sys/time.h > > > > I thought we'd have more time to review/test this. Looks like BSDCan > > commit-spree in effect. :-) > > The Phabricator review has been up since May 22nd. Thats over 2.5 weeks, > this was also discussed on the Thursday conference calls. Fair enough. (I am out of touch a little so shouldn't really complain ;-)) > > > > A few questions: > > 1) Does RACK work reliably without HPTS? If yes, has that config been > > tested? > > > No it requires the pacer. > > > 2) It looks like PRR is tied to RACK. Why did we go that route? > > Shouldn't it be easily used with the 'default' stack also? > > > > It is what I developed.. and I had no desire to work with the default stack. > That > is a fifth rail that no one wants touched. > > > 3) Can new SACK be used with the traditional stack? > > Well if you want to rework the base stack you might be able to do that :) > > It would be quite some effort.. I think Robert wants eventually the old > stack to be de-composed and then slowly work at getting more common > code between them until eventually you can have a diff and somehow > figure out how to integrate the two. Okay. I believe there might be situations where we may want to still keep the 'default' stack alive. I know Windows doesn't yet use RACK when rtt is lesser than 10ms (or something like that), as an example. Such optimizations (PRR, better SACK) should be made available to that also if we see non-RACK having a viable future. > > > > > 4) Where should manpage like info for RACK go? a new man-page or > > extending tcp(4)? Info like how to enable system-wide or per socket > > should go here. > > > > The enable/disable or per-socket I think is in with the pluggable stack > stuff. We might want a Rack man page.. have to think about it. > > > 5) Any perf numbers to go along with this commit? Synthetic or > > production numbers showing improvements in transfer speed or any other > > impact on CPU usage (specially with HPTS) that you can share? > > > > CPU will be more but we see close to a drop in rebuffers by about 12% I am > told. Great. Thanks. > > > 6) In your testing, have you found cases where RACK does poorly compared > > to the 'default' stack? Any recommendations on when should RACK be > > enabled? (Something like this could go in the manpage.) > > Nope. Okay. I assuming the 2 places this got tested at (NFLX and LLNW) would have provided enough variety in terms of traffic types. Thanks again, Hiren pgp1Ak9Vqu90k.pgp Description:
Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
> On Jun 7, 2018, at 6:01 PM, hiren panchasara > wrote: > > On 06/07/18 at 06:18P, Randall Stewart wrote: >> Author: rrs >> Date: Thu Jun 7 18:18:13 2018 >> New Revision: 334804 >> URL: https://svnweb.freebsd.org/changeset/base/334804 >> >> Log: >> This commit brings in a new refactored TCP stack called Rack. >> Rack includes the following features: >> - A different SACK processing scheme (the old sack structures are not >> used). >> - RACK (Recent acknowledgment) where counting dup-acks is no longer done >> instead time is used to knwo when to retransmit. (see the I-D) >> - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt >> to try not to take a retransmit time-out. (see the I-D) >> - Burst mitigation using TCPHTPS >> - PRR (partial rate reduction) see the RFC. >> >> Once built into your kernel, you can select this stack by either >> socket option with the name of the stack is "rack" or by setting >> the global sysctl so the default is rack. >> >> Note that any connection that does not support SACK will be kicked >> back to the "default" base FreeBSD stack (currently known as "default"). >> >> To build this into your kernel you will need to enable in your >> kernel: >> makeoptions WITH_EXTRA_TCP_STACKS=1 >> options TCPHPTS >> >> Sponsored by: Netflix Inc. >> Differential Revision: https://reviews.freebsd.org/D15525 >> >> Added: >> head/sys/modules/tcp/rack/ >> head/sys/modules/tcp/rack/Makefile (contents, props changed) >> head/sys/netinet/tcp_stacks/rack.c (contents, props changed) >> head/sys/netinet/tcp_stacks/rack_bbr_common.h (contents, props changed) >> head/sys/netinet/tcp_stacks/sack_filter.c (contents, props changed) >> head/sys/netinet/tcp_stacks/sack_filter.h (contents, props changed) >> head/sys/netinet/tcp_stacks/tcp_rack.h (contents, props changed) >> Modified: >> head/sys/kern/uipc_sockbuf.c >> head/sys/modules/tcp/Makefile >> head/sys/netinet/tcp.h >> head/sys/netinet/tcp_log_buf.h >> head/sys/netinet/tcp_output.c >> head/sys/netinet/tcp_stacks/fastpath.c >> head/sys/netinet/tcp_timer.c >> head/sys/netinet/tcp_timer.h >> head/sys/netinet/tcp_var.h >> head/sys/sys/mbuf.h >> head/sys/sys/queue.h >> head/sys/sys/sockbuf.h >> head/sys/sys/time.h > > I thought we'd have more time to review/test this. Looks like BSDCan > commit-spree in effect. :-) The Phabricator review has been up since May 22nd. Thats over 2.5 weeks, this was also discussed on the Thursday conference calls. > > A few questions: > 1) Does RACK work reliably without HPTS? If yes, has that config been > tested? > No it requires the pacer. > 2) It looks like PRR is tied to RACK. Why did we go that route? > Shouldn't it be easily used with the 'default' stack also? > It is what I developed.. and I had no desire to work with the default stack. That is a fifth rail that no one wants touched. > 3) Can new SACK be used with the traditional stack? Well if you want to rework the base stack you might be able to do that :) It would be quite some effort.. I think Robert wants eventually the old stack to be de-composed and then slowly work at getting more common code between them until eventually you can have a diff and somehow figure out how to integrate the two. > > 4) Where should manpage like info for RACK go? a new man-page or > extending tcp(4)? Info like how to enable system-wide or per socket > should go here. > The enable/disable or per-socket I think is in with the pluggable stack stuff. We might want a Rack man page.. have to think about it. > 5) Any perf numbers to go along with this commit? Synthetic or > production numbers showing improvements in transfer speed or any other > impact on CPU usage (specially with HPTS) that you can share? > CPU will be more but we see close to a drop in rebuffers by about 12% I am told. > 6) In your testing, have you found cases where RACK does poorly compared > to the 'default' stack? Any recommendations on when should RACK be > enabled? (Something like this could go in the manpage.) Nope. R > > Glad to finally see this in -head! > > Cheers, > Hiren Randall Stewart r...@netflix.com 803-317-4952 ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
On 06/07/18 at 06:18P, Randall Stewart wrote: > Author: rrs > Date: Thu Jun 7 18:18:13 2018 > New Revision: 334804 > URL: https://svnweb.freebsd.org/changeset/base/334804 > > Log: > This commit brings in a new refactored TCP stack called Rack. > Rack includes the following features: >- A different SACK processing scheme (the old sack structures are not > used). >- RACK (Recent acknowledgment) where counting dup-acks is no longer done > instead time is used to knwo when to retransmit. (see the I-D) >- TLP (Tail Loss Probe) where we will probe for tail-losses to attempt > to try not to take a retransmit time-out. (see the I-D) >- Burst mitigation using TCPHTPS >- PRR (partial rate reduction) see the RFC. > > Once built into your kernel, you can select this stack by either > socket option with the name of the stack is "rack" or by setting > the global sysctl so the default is rack. > > Note that any connection that does not support SACK will be kicked > back to the "default" base FreeBSD stack (currently known as "default"). > > To build this into your kernel you will need to enable in your > kernel: > makeoptions WITH_EXTRA_TCP_STACKS=1 > options TCPHPTS > > Sponsored by: Netflix Inc. > Differential Revision: https://reviews.freebsd.org/D15525 > > Added: > head/sys/modules/tcp/rack/ > head/sys/modules/tcp/rack/Makefile (contents, props changed) > head/sys/netinet/tcp_stacks/rack.c (contents, props changed) > head/sys/netinet/tcp_stacks/rack_bbr_common.h (contents, props changed) > head/sys/netinet/tcp_stacks/sack_filter.c (contents, props changed) > head/sys/netinet/tcp_stacks/sack_filter.h (contents, props changed) > head/sys/netinet/tcp_stacks/tcp_rack.h (contents, props changed) > Modified: > head/sys/kern/uipc_sockbuf.c > head/sys/modules/tcp/Makefile > head/sys/netinet/tcp.h > head/sys/netinet/tcp_log_buf.h > head/sys/netinet/tcp_output.c > head/sys/netinet/tcp_stacks/fastpath.c > head/sys/netinet/tcp_timer.c > head/sys/netinet/tcp_timer.h > head/sys/netinet/tcp_var.h > head/sys/sys/mbuf.h > head/sys/sys/queue.h > head/sys/sys/sockbuf.h > head/sys/sys/time.h I thought we'd have more time to review/test this. Looks like BSDCan commit-spree in effect. :-) A few questions: 1) Does RACK work reliably without HPTS? If yes, has that config been tested? 2) It looks like PRR is tied to RACK. Why did we go that route? Shouldn't it be easily used with the 'default' stack also? 3) Can new SACK be used with the traditional stack? 4) Where should manpage like info for RACK go? a new man-page or extending tcp(4)? Info like how to enable system-wide or per socket should go here. 5) Any perf numbers to go along with this commit? Synthetic or production numbers showing improvements in transfer speed or any other impact on CPU usage (specially with HPTS) that you can share? 6) In your testing, have you found cases where RACK does poorly compared to the 'default' stack? Any recommendations on when should RACK be enabled? (Something like this could go in the manpage.) Glad to finally see this in -head! Cheers, Hiren pgpeZ4e_GHrRL.pgp Description: PGP signature
svn commit: r334804 - in head/sys: kern modules/tcp modules/tcp/rack netinet netinet/tcp_stacks sys
Author: rrs Date: Thu Jun 7 18:18:13 2018 New Revision: 334804 URL: https://svnweb.freebsd.org/changeset/base/334804 Log: This commit brings in a new refactored TCP stack called Rack. Rack includes the following features: - A different SACK processing scheme (the old sack structures are not used). - RACK (Recent acknowledgment) where counting dup-acks is no longer done instead time is used to knwo when to retransmit. (see the I-D) - TLP (Tail Loss Probe) where we will probe for tail-losses to attempt to try not to take a retransmit time-out. (see the I-D) - Burst mitigation using TCPHTPS - PRR (partial rate reduction) see the RFC. Once built into your kernel, you can select this stack by either socket option with the name of the stack is "rack" or by setting the global sysctl so the default is rack. Note that any connection that does not support SACK will be kicked back to the "default" base FreeBSD stack (currently known as "default"). To build this into your kernel you will need to enable in your kernel: makeoptions WITH_EXTRA_TCP_STACKS=1 options TCPHPTS Sponsored by: Netflix Inc. Differential Revision:https://reviews.freebsd.org/D15525 Added: head/sys/modules/tcp/rack/ head/sys/modules/tcp/rack/Makefile (contents, props changed) head/sys/netinet/tcp_stacks/rack.c (contents, props changed) head/sys/netinet/tcp_stacks/rack_bbr_common.h (contents, props changed) head/sys/netinet/tcp_stacks/sack_filter.c (contents, props changed) head/sys/netinet/tcp_stacks/sack_filter.h (contents, props changed) head/sys/netinet/tcp_stacks/tcp_rack.h (contents, props changed) Modified: head/sys/kern/uipc_sockbuf.c head/sys/modules/tcp/Makefile head/sys/netinet/tcp.h head/sys/netinet/tcp_log_buf.h head/sys/netinet/tcp_output.c head/sys/netinet/tcp_stacks/fastpath.c head/sys/netinet/tcp_timer.c head/sys/netinet/tcp_timer.h head/sys/netinet/tcp_var.h head/sys/sys/mbuf.h head/sys/sys/queue.h head/sys/sys/sockbuf.h head/sys/sys/time.h Modified: head/sys/kern/uipc_sockbuf.c == --- head/sys/kern/uipc_sockbuf.cThu Jun 7 18:06:01 2018 (r334803) +++ head/sys/kern/uipc_sockbuf.cThu Jun 7 18:18:13 2018 (r334804) @@ -1283,6 +1283,55 @@ sbsndptr(struct sockbuf *sb, u_int off, u_int len, u_i return (ret); } +struct mbuf * +sbsndptr_noadv(struct sockbuf *sb, uint32_t off, uint32_t *moff) +{ + struct mbuf *m; + + KASSERT(sb->sb_mb != NULL, ("%s: sb_mb is NULL", __func__)); + if (sb->sb_sndptr == NULL || sb->sb_sndptroff > off) { + *moff = off; + if (sb->sb_sndptr == NULL) { + sb->sb_sndptr = sb->sb_mb; + sb->sb_sndptroff = 0; + } + return (sb->sb_mb); + } else { + m = sb->sb_sndptr; + off -= sb->sb_sndptroff; + } + *moff = off; + return (m); +} + +void +sbsndptr_adv(struct sockbuf *sb, struct mbuf *mb, uint32_t len) +{ + /* +* A small copy was done, advance forward the sb_sbsndptr to cover +* it. +*/ + struct mbuf *m; + + if (mb != sb->sb_sndptr) { + /* Did not copyout at the same mbuf */ + return; + } + m = mb; + while (m && (len > 0)) { + if (len >= m->m_len) { + len -= m->m_len; + if (m->m_next) { + sb->sb_sndptroff += m->m_len; + sb->sb_sndptr = m->m_next; + } + m = m->m_next; + } else { + len = 0; + } + } +} + /* * Return the first mbuf and the mbuf data offset for the provided * send offset without changing the "sb_sndptroff" field. Modified: head/sys/modules/tcp/Makefile == --- head/sys/modules/tcp/Makefile Thu Jun 7 18:06:01 2018 (r334803) +++ head/sys/modules/tcp/Makefile Thu Jun 7 18:18:13 2018 (r334804) @@ -7,10 +7,12 @@ SYSDIR?=${SRCTOP}/sys SUBDIR=\ ${_tcp_fastpath} \ +${_tcp_rack} \ ${_tcpmd5} \ .if ${MK_EXTRA_TCP_STACKS} != "no" || defined(ALL_MODULES) _tcp_fastpath= fastpath +_tcp_rack= rack .endif .if (${MK_INET_SUPPORT} != "no" || ${MK_INET6_SUPPORT} != "no") || \ Added: head/sys/modules/tcp/rack/Makefile == --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ head/sys/modules/tcp/rack/Makefile Thu Jun 7 18:18:13 2018 (r334804) @@ -0,0 +1,24 @@ +# +# $FreeBSD$ +# + +.PATH: ${.CURDIR}/../../../netinet/tcp_stacks + +STACKNAME=