Re: Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-23 Thread Job Snijders
On Sun, Oct 22, 2017 at 04:04:30PM +0200, Mike Belopuhov wrote:
> > If this is as expected, OK job@
> 
> It's setting the option in my build here:
> 
> 15:55:20.336682 fe:e1:bb:d1:a2:f0 fe:e1:ba:d0:55:1e 0800 78: \
>   10.50.50.34.17078 > 10.50.50.1.80: S [tcp sum ok] 1313610867:1313610867(0) \
>   win 16384  3624645505 0> \
>   (DF) (ttl 64, id 25292, len 64)

I went back and tested again, but with a different packetsniffer. You
were right. I now question the tool I used to observe this traffic on a
middle box is giving me correct output.

I can confirm there is no difference observable with tcpdump between
miniroot ftp(1) and normal ftp(1). SACK is set in both contexts.

All good! :)

Kind regards,

Job



Re: Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-22 Thread Mike Belopuhov
On Sun, Oct 22, 2017 at 11:23 +0200, Job Snijders wrote:
> On Thu, Oct 19, 2017 at 06:55:05PM +0200, Mike Belopuhov wrote:
> > SACK has been enabled in GENERIC kernels for over a decade and it's
> > time to make it an official part of the TCP stack. 
> 
> I tested your diff by doing an amd64 release build and testing both the
> newly created /bsd and /bsd.rd, I observed no problems and SACK was
> available in both boot scenarios.
> 
> One thing that stood out to me is that the miniroot's "SMALL" ftp(1)
> didn't sent SACK as permitted tcp option. However, after chrooting into
> my normal environment and using the real /usr/bin/ftp, I observed that
> SACK was available and used.
> 
> If this is as expected, OK job@
>

It's setting the option in my build here:

15:55:20.336682 fe:e1:bb:d1:a2:f0 fe:e1:ba:d0:55:1e 0800 78: \
  10.50.50.34.17078 > 10.50.50.1.80: S [tcp sum ok] 1313610867:1313610867(0) \
  win 16384  \
  (DF) (ttl 64, id 25292, len 64)

> > This grows bsd.rd on amd64 by 8k but Theo said it's within reasonable.
> > OK?
> 
> $ ls -latr /bsd.rd /tmp/bsd.rd
> -rw---  1 root  wheel  9787542 Oct 22 08:36 /bsd.rd
> -rw---  1 job   wheel  9782763 Oct 19 12:48 /old/bsd.rd
> 
> > diff --git sys/conf/GENERIC sys/conf/GENERIC
> > -option TCP_SACK# Selective Acknowledgements for TCP
> 
> I think the below patch may be an appropriate companion for removal of
> the option.
>

Yes, jmc@ has already notified me that I didn't include the
manpage diff, but please be my guest and check your diff in.

> Kind regards,
> 
> Job
> 
> diff --git share/man/man4/options.4 share/man/man4/options.4
> index 3e15d4c8c4f..3945611607e 100644
> --- share/man/man4/options.4
> +++ share/man/man4/options.4
> @@ -454,20 +454,6 @@ Turns on forward acknowledgements allowing a more 
> precise estimate of
>  outstanding data during the fast recovery phase by using
>  .Em SACK
>  information.
> -This option can only be used together with
> -.Em TCP_SACK .
> -.It Cd option TCP_SACK
> -Turns on selective acknowledgements.
> -Additional information about
> -segments already received can be transmitted back to the sender,
> -thus indicating segments that have been lost and allowing for
> -a swifter recovery.
> -Both communication endpoints need to support
> -.Em SACK .
> -The fallback behaviour is NewReno fast recovery phase, which allows
> -one lost segment to be recovered per round trip time.
> -When more than one segment has been dropped per window, the transmission can
> -continue without waiting for a retransmission timeout.
>  .It Cd option TCP_SIGNATURE
>  Turns on support for the TCP MD5 Signature option (RFC 2385).
>  This is used by
> 



Re: Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-22 Thread Job Snijders
On Thu, Oct 19, 2017 at 06:55:05PM +0200, Mike Belopuhov wrote:
> SACK has been enabled in GENERIC kernels for over a decade and it's
> time to make it an official part of the TCP stack. 

I tested your diff by doing an amd64 release build and testing both the
newly created /bsd and /bsd.rd, I observed no problems and SACK was
available in both boot scenarios.

One thing that stood out to me is that the miniroot's "SMALL" ftp(1)
didn't sent SACK as permitted tcp option. However, after chrooting into
my normal environment and using the real /usr/bin/ftp, I observed that
SACK was available and used.

If this is as expected, OK job@

> This grows bsd.rd on amd64 by 8k but Theo said it's within reasonable.
> OK?

$ ls -latr /bsd.rd /tmp/bsd.rd
-rw---  1 root  wheel  9787542 Oct 22 08:36 /bsd.rd
-rw---  1 job   wheel  9782763 Oct 19 12:48 /old/bsd.rd

> diff --git sys/conf/GENERIC sys/conf/GENERIC
> -option   TCP_SACK# Selective Acknowledgements for TCP

I think the below patch may be an appropriate companion for removal of
the option.

Kind regards,

Job

diff --git share/man/man4/options.4 share/man/man4/options.4
index 3e15d4c8c4f..3945611607e 100644
--- share/man/man4/options.4
+++ share/man/man4/options.4
@@ -454,20 +454,6 @@ Turns on forward acknowledgements allowing a more precise 
estimate of
 outstanding data during the fast recovery phase by using
 .Em SACK
 information.
-This option can only be used together with
-.Em TCP_SACK .
-.It Cd option TCP_SACK
-Turns on selective acknowledgements.
-Additional information about
-segments already received can be transmitted back to the sender,
-thus indicating segments that have been lost and allowing for
-a swifter recovery.
-Both communication endpoints need to support
-.Em SACK .
-The fallback behaviour is NewReno fast recovery phase, which allows
-one lost segment to be recovered per round trip time.
-When more than one segment has been dropped per window, the transmission can
-continue without waiting for a retransmission timeout.
 .It Cd option TCP_SIGNATURE
 Turns on support for the TCP MD5 Signature option (RFC 2385).
 This is used by



Enable TCP selective acknowledgements (SACK) on all kernels

2017-10-19 Thread Mike Belopuhov
SACK has been enabled in GENERIC kernels for over a decade and it's
time to make it an official part of the TCP stack.  This grows bsd.rd
on amd64 by 8k but Theo said it's within reasonable.  OK?

diff --git sys/conf/GENERIC sys/conf/GENERIC
index 87dd069f514..cd68ae9e651 100644
--- sys/conf/GENERIC
+++ sys/conf/GENERIC
@@ -43,11 +43,10 @@ option  MSDOSFS # MS-DOS file system
 option FIFO# FIFOs; RECOMMENDED
 #optionTMPFS   # efficient memory file system
 option FUSE# FUSE
 
 option SOCKET_SPLICE   # Socket Splicing for TCP and UDP
-option TCP_SACK# Selective Acknowledgements for TCP
 option TCP_ECN # Explicit Congestion Notification for TCP
 option TCP_SIGNATURE   # TCP MD5 Signatures, for BGP routing sessions
 #optionTCP_FACK# Forward Acknowledgements for TCP
 
 option INET6   # IPv6
diff --git sys/netinet/tcp_input.c sys/netinet/tcp_input.c
index 52c206f0bf5..9951923bbdb 100644
--- sys/netinet/tcp_input.c
+++ sys/netinet/tcp_input.c
@@ -852,14 +852,12 @@ findpcb:
 */
tp->t_rcvtime = tcp_now;
if (TCPS_HAVEESTABLISHED(tp->t_state))
TCP_TIMER_ARM(tp, TCPT_KEEP, tcp_keepidle);
 
-#ifdef TCP_SACK
if (tp->sack_enable)
tcp_del_sackholes(tp, th); /* Delete stale SACK holes */
-#endif /* TCP_SACK */
 
/*
 * Process options.
 */
 #ifdef TCP_SIGNATURE
@@ -962,25 +960,23 @@ findpcb:
 */
if (tp->t_pmtud_mss_acked < acked)
tp->t_pmtud_mss_acked = acked;
 
tp->snd_una = th->th_ack;
-#if defined(TCP_SACK) || defined(TCP_ECN)
/*
 * We want snd_last to track snd_una so
 * as to avoid sequence wraparound problems
 * for very large transfers.
 */
 #ifdef TCP_ECN
if (SEQ_GT(tp->snd_una, tp->snd_last))
 #endif
tp->snd_last = tp->snd_una;
-#endif /* TCP_SACK */
-#if defined(TCP_SACK) && defined(TCP_FACK)
+#ifdef TCP_FACK
tp->snd_fack = tp->snd_una;
tp->retran_data = 0;
-#endif /* TCP_FACK */
+#endif
m_freem(m);
 
/*
 * If all outstanding data are acked, stop
 * retransmit timer, otherwise restart timer
@@ -1012,15 +1008,13 @@ findpcb:
/*
 * This is a pure, in-sequence data packet
 * with nothing on the reassembly queue and
 * we have enough buffer space to take it.
 */
-#ifdef TCP_SACK
/* Clean receiver SACK report if present */
if (tp->sack_enable && tp->rcv_numsacks)
tcp_clean_sackreport(tp);
-#endif /* TCP_SACK */
tcpstat_inc(tcps_preddat);
tp->rcv_nxt += tlen;
tcpstat_pkt(tcps_rcvpack, tcps_rcvbyte, tlen);
ND6_HINT(tp);
 
@@ -1137,19 +1131,17 @@ findpcb:
/* Reset initial window to 1 segment for retransmit */
if (tp->t_rxtshift > 0)
tp->snd_cwnd = tp->t_maxseg;
tcp_rcvseqinit(tp);
tp->t_flags |= TF_ACKNOW;
-#ifdef TCP_SACK
 /*
  * If we've sent a SACK_PERMITTED option, and the peer
  * also replied with one, then TF_SACK_PERMIT should have
  * been set in tcp_dooptions().  If it was not, disable SACKs.
  */
if (tp->sack_enable)
tp->sack_enable = tp->t_flags & TF_SACK_PERMIT;
-#endif
 #ifdef TCP_ECN
/*
 * if ECE is set but CWR is not set for SYN-ACK, or
 * both ECE and CWR are set for simultaneous open,
 * peer is ECN capable.
@@ -1569,11 +1561,11 @@ trimthenstep6:
 * to keep a constant cwnd packets in the
 * network.
 */
if (TCP_TIMER_ISARMED(tp, TCPT_REXMT) == 0)
tp->t_dupacks = 0;
-#if defined(TCP_SACK) && defined(TCP_FACK)
+#ifdef TCP_FACK
/*
 * In FACK, can enter fast rec. if the receiver
 * reports a reass. queue longer than 3 segs.
 */
else