Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection

On Mon, 2017-05-08 at 15:49 +0800, Shan Wei wrote:
> Indeed, HZ=250. What is the problem with autotuning at HZ=250?

You might look at recent patches.

commit 645f4c6f2ebd040688cc2a5f626ffc909e66ccf2
("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps")

Thanks.
Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection

On 2017/5/5 21:52, Eric Dumazet wrote:
> On Fri, 2017-05-05 at 18:22 +0800, Shan Wei wrote:
>> 2017-05-02 23:05 GMT+08:00 Eric Dumazet:
>>> Hi Shan
>>>
>>> 1) Your patch never reached netdev, because it was sent in HTML
>>> format.
>>>
>>> 2) During the Linus merge window, net-next is closed.
>>>
>>> I am not really convinced that we need this with TCP autotuning
>>> anyway. The initial value of sk_sndbuf and sk_rcvbuf is really a
>>> hint.
>>>
>>> How often do you really tweak /proc/sys/net/ipv4 files in
>>> production?
>
> Again you posted your reply in HTML, so netdev did not get it.

Now fixed; I am sending with Thunderbird instead of the Gmail web
client.

>> Historically, most products have not adjusted any wmem/rmem
>> parameters. In our experience, adjusting these parameters gives a
>> better, faster user experience.
>>
>> Now more and more products want to change this default value, but
>> for a given product, once the parameters are tuned they are rarely
>> changed again.
>>
>> We hope that this patch can provide the ability to adjust the
>> parameters online and have them take effect quickly. Without this
>> patch, adjusting the parameters is time consuming for us because we
>> need to migrate flows.
>>
>>> Please provide more information, like what actual values you change
>>> back and forth.
>>
>> There is no single fixed value here. A big value takes up more
>> memory, but we are willing to trade memory for performance. We
>> select the parameter value based on the characteristics of the
>> product.
>>
>> For example, for 1 MB files we set the buffer size to
>> 1611646 = (1 MB / 1460) * 2244, with MSS = 1460 and
>> SKB_TRUESIZE = 2244.
>>
>> With TCP autotuning, the buffer only reaches this size once cwnd has
>> grown from 10 to 359.
>
> Looks like a problem with autotuning because you might have HZ=100 or
> HZ=250 maybe ?

Indeed, HZ=250. What is the problem with autotuning at HZ=250?
Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection

On Fri, 2017-05-05 at 18:22 +0800, Shan Wei wrote:
> 2017-05-02 23:05 GMT+08:00 Eric Dumazet:
>> Hi Shan
>>
>> 1) Your patch never reached netdev, because it was sent in HTML
>> format.
>>
>> 2) During the Linus merge window, net-next is closed.
>>
>> I am not really convinced that we need this with TCP autotuning
>> anyway. The initial value of sk_sndbuf and sk_rcvbuf is really a
>> hint.
>>
>> How often do you really tweak /proc/sys/net/ipv4 files in
>> production?

Again you posted your reply in HTML, so netdev did not get it.

> Historically, most products have not adjusted any wmem/rmem
> parameters. In our experience, adjusting these parameters gives a
> better, faster user experience.
>
> Now more and more products want to change this default value, but for
> a given product, once the parameters are tuned they are rarely
> changed again.
>
> We hope that this patch can provide the ability to adjust the
> parameters online and have them take effect quickly. Without this
> patch, adjusting the parameters is time consuming for us because we
> need to migrate flows.
>
>> Please provide more information, like what actual values you change
>> back and forth.
>
> There is no single fixed value here. A big value takes up more
> memory, but we are willing to trade memory for performance. We select
> the parameter value based on the characteristics of the product.
>
> For example, for 1 MB files we set the buffer size to
> 1611646 = (1 MB / 1460) * 2244, with MSS = 1460 and
> SKB_TRUESIZE = 2244.
>
> With TCP autotuning, the buffer only reaches this size once cwnd has
> grown from 10 to 359.

Looks like a problem with autotuning, because you might have HZ=100 or
HZ=250 maybe ?

We want to make autotuning better, so that tweaks like that are no
longer needed.
Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection

On Tue, 2017-05-02 at 12:57 +0800, Shan Wei wrote:
> Currently, the sndbuf/rcvbuf values of a new passive connection are
> inherited from the listen socket. After that, tcp_init_buffer_space()
> initializes them based on init_cwnd, and tcp_should_expand_sndbuf()
> adjusts them according to the new cwnd.
>
> However, Operation & Maintenance engineers cannot control
> sndbuf/rcvbuf online through the tcp_wmem/tcp_rmem sysctl parameters.
> They need to migrate flows out, restart the services with larger
> sndbuf/rcvbuf, and then migrate the flows back in.
>
> This patch is useful for online services: just set
> tcp_inherit_buffsize to 0. By default, the behaviour is the same as
> before.
>
> Signed-off-by: Shan Wei
> ---
>  Documentation/networking/ip-sysctl.txt | 8
>  include/net/netns/ipv4.h               | 1 +
>  net/ipv4/sysctl_net_ipv4.c             | 7 +++
>  net/ipv4/tcp_ipv4.c                    | 1 +
>  net/ipv4/tcp_minisocks.c               | 6 ++
>  5 files changed, 23 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 974ab47..3292bbf 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -308,6 +308,14 @@ tcp_frto - INTEGER
>
>  	By default it's enabled with a non-zero value. 0 disables F-RTO.
>
> +tcp_inherit_buffsize - BOOLEAN
> +	For a new passive TCP connection, the current tcp_wmem/tcp_rmem
> +	parameters can be used to set the initial snd/rcv buffer size.
> +	This is useful for online services that must not be restarted:
> +	just set it to 0. By default, a new passive connection inherits
> +	its snd/rcv buffer size from the listen socket.
> +
> +	Default: 1
> +
>  tcp_invalid_ratelimit - INTEGER
>  	Limit the maximal rate for sending duplicate acknowledgments
>  	in response to incoming TCP packets that are for an existing
> diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
> index cd686c4..1fc85bd 100644
> --- a/include/net/netns/ipv4.h
> +++ b/include/net/netns/ipv4.h
> @@ -124,6 +124,7 @@ struct netns_ipv4 {
>  	int sysctl_tcp_tw_reuse;
>  	struct inet_timewait_death_row tcp_death_row;
>  	int sysctl_max_syn_backlog;
> +	int sysctl_tcp_inherit_buffsize;
>
>  #ifdef CONFIG_NET_L3_MASTER_DEV
>  	int sysctl_udp_l3mdev_accept;
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index 86957e9..58a8bec 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -1078,6 +1078,13 @@ static int proc_tfo_blackhole_detect_timeout(struct ctl_table *table,
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec
>  	},
> +	{
> +		.procname	= "tcp_inherit_buffsize",
> +		.data		= &init_net.ipv4.sysctl_tcp_inherit_buffsize,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec
> +	},
>  #ifdef CONFIG_IP_ROUTE_MULTIPATH
>  	{
>  		.procname	= "fib_multipath_use_neigh",
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index cbbafe5..7a8a2bb 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2457,6 +2457,7 @@ static int __net_init tcp_sk_init(struct net *net)
>  	net->ipv4.tcp_death_row.hashinfo = &tcp_hashinfo;
>
>  	net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256);
> +	net->ipv4.sysctl_tcp_inherit_buffsize = 1;
>
>  	return 0;
>  fail:
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 8f6373b..6090d7e 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -474,6 +474,12 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>  	tcp_init_xmit_timers(newsk);
>  	newtp->write_seq = newtp->pushed_seq = treq->snt_isn + 1;
>
> +	if (!(sock_net(newsk)->ipv4.sysctl_tcp_inherit_buffsize) &&
> +	    !(sk->sk_userlocks & SOCK_SNDBUF_LOCK)) {
> +		newsk->sk_sndbuf = sock_net(sk)->ipv4.sysctl_tcp_wmem[1];
> +		newsk->sk_rcvbuf = sock_net(sk)->ipv4.sysctl_tcp_rmem[1];
> +	}
> +

Hi Shan

1) Your patch never reached netdev, because it was sent in HTML format.

2) During the Linus merge window, net-next is closed.

I am not really convinced that we need this with TCP autotuning anyway.
The initial value of sk_sndbuf and sk_rcvbuf is really a hint.

How often do you really tweak /proc/sys/net/ipv4 files in production?

Please provide more information, like what actual values you change
back and forth.

Thanks.