Hi,

On Wed, Aug 26, 2020 at 12:47:43AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Aaron, thanks for your comments. Actually, the final value depends on
> /proc/sys/net/core/rmem_max and /proc/sys/net/core/wmem_max, so it is
> still configurable. setsockopt(...) will set it to the minimum of
> 1073741823 and rmem_max/wmem_max.
> 
> -----Original Message-----
> From: dev [mailto:[email protected]] On Behalf Of Aaron Conole
> Sent: August 25, 2020 23:26
> To: [email protected]
> Cc: [email protected]; [email protected]; [email protected]
> Subject: Re: [ovs-dev] [PATCH] userspace: fix bad UDP performance issue of veth
> 
> [email protected] writes:
> 
> > From: Yi Yang <[email protected]>
> >
> > iperf3 UDP performance in the veth-to-veth case is very bad because of
> > heavy packet loss. The root cause is that rmem_default and wmem_default
> > are just 212992, while the iperf3 UDP test uses an 8K UDP payload, which
> > is fragmented when the MTU is 1500: one 8K UDP send enqueues 6 fragments
> > to the socket receive queue, and the small default socket buffer can't
> > hold that many packets, so many of them are dropped.
> >
> > This commit fixes the packet loss by setting the socket receive and send
> > buffers to the maximum possible value, so packets are no longer dropped.
> > It also helps TCP performance, because retransmits are avoided.
> >
> > By the way, a big socket buffer doesn't mean a big buffer is allocated
> > when the socket is created; no extra memory is allocated compared to
> > the default socket buffer size. It just means more skbuffs can be
> > enqueued to the socket receive and send queues, so packets are not
> > dropped.
> >
> > The results below are for reference.
> >
> > The result before applying this commit
> > ======================================
> > $ ip netns exec ns02 iperf3 -t 5 -i 1 -u -b 100M -c 10.15.2.6 --get-server-output -A 5
> > Connecting to host 10.15.2.6, port 5201
> > [  4] local 10.15.2.2 port 59053 connected to 10.15.2.6 port 5201
> > [ ID] Interval           Transfer     Bandwidth       Total Datagrams
> > [  4]   0.00-1.00  sec  10.8 MBytes  90.3 Mbits/sec  1378
> > [  4]   1.00-2.00  sec  11.9 MBytes   100 Mbits/sec  1526
> > [  4]   2.00-3.00  sec  11.9 MBytes   100 Mbits/sec  1526
> > [  4]   3.00-4.00  sec  11.9 MBytes   100 Mbits/sec  1526
> > [  4]   4.00-5.00  sec  11.9 MBytes   100 Mbits/sec  1526
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
> > [  4]   0.00-5.00  sec  58.5 MBytes  98.1 Mbits/sec  0.047 ms  357/531 (67%)
> > [  4] Sent 531 datagrams
> >
> > Server output:
> > -----------------------------------------------------------
> > Accepted connection from 10.15.2.2, port 60314
> > [  5] local 10.15.2.6 port 5201 connected to 10.15.2.2 port 59053
> > [ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
> > [  5]   0.00-1.00  sec  1.36 MBytes  11.4 Mbits/sec  0.047 ms  357/531 (67%)
> > [  5]   1.00-2.00  sec  0.00 Bytes   0.00 bits/sec   0.047 ms  0/0 (-nan%)
> > [  5]   2.00-3.00  sec  0.00 Bytes   0.00 bits/sec   0.047 ms  0/0 (-nan%)
> > [  5]   3.00-4.00  sec  0.00 Bytes   0.00 bits/sec   0.047 ms  0/0 (-nan%)
> > [  5]   4.00-5.00  sec  0.00 Bytes   0.00 bits/sec   0.047 ms  0/0 (-nan%)
> >
> > iperf Done.
> >
> > The result after applying this commit
> > =====================================
> > $ sudo ip netns exec ns02 iperf3 -t 5 -i 1 -u -b 4G -c 10.15.2.6 --get-server-output -A 5
> > Connecting to host 10.15.2.6, port 5201
> > [  4] local 10.15.2.2 port 48547 connected to 10.15.2.6 port 5201
> > [ ID] Interval           Transfer     Bandwidth       Total Datagrams
> > [  4]   0.00-1.00  sec   440 MBytes  3.69 Gbits/sec  56276
> > [  4]   1.00-2.00  sec   481 MBytes  4.04 Gbits/sec  61579
> > [  4]   2.00-3.00  sec   474 MBytes  3.98 Gbits/sec  60678
> > [  4]   3.00-4.00  sec   480 MBytes  4.03 Gbits/sec  61452
> > [  4]   4.00-5.00  sec   480 MBytes  4.03 Gbits/sec  61441
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
> > [  4]   0.00-5.00  sec  2.30 GBytes  3.95 Gbits/sec  0.024 ms  0/301426 (0%)
> > [  4] Sent 301426 datagrams
> >
> > Server output:
> > -----------------------------------------------------------
> > Accepted connection from 10.15.2.2, port 60320
> > [  5] local 10.15.2.6 port 5201 connected to 10.15.2.2 port 48547
> > [ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
> > [  5]   0.00-1.00  sec   209 MBytes  1.75 Gbits/sec  0.021 ms  0/26704 (0%)
> > [  5]   1.00-2.00  sec   258 MBytes  2.16 Gbits/sec  0.025 ms  0/32967 (0%)
> > [  5]   2.00-3.00  sec   258 MBytes  2.16 Gbits/sec  0.022 ms  0/32987 (0%)
> > [  5]   3.00-4.00  sec   257 MBytes  2.16 Gbits/sec  0.023 ms  0/32954 (0%)
> > [  5]   4.00-5.00  sec   257 MBytes  2.16 Gbits/sec  0.021 ms  0/32937 (0%)
> > [  5]   5.00-6.00  sec   255 MBytes  2.14 Gbits/sec  0.026 ms  0/32685 (0%)
> > [  5]   6.00-7.00  sec   254 MBytes  2.13 Gbits/sec  0.025 ms  0/32453 (0%)
> > [  5]   7.00-8.00  sec   255 MBytes  2.14 Gbits/sec  0.026 ms  0/32679 (0%)
> > [  5]   8.00-9.00  sec   255 MBytes  2.14 Gbits/sec  0.022 ms  0/32669 (0%)
> >
> > iperf Done.
> >
> > Signed-off-by: Yi Yang <[email protected]>
> > ---
> 
> I think we should make it configurable. Each RXQ will potentially allow a
> huge number of skbuffs to be enqueued after this. That might, ironically,
> lead to worse performance, since there could be some kind of buffer-bloat
> effect at higher rmem values, as documented at
> https://serverfault.com/questions/410230/higher-rmem-max-value-leading-to-more-packet-loss.
> 
> I think it should be a decision that the operator can take. Currently, they
> can modify it anyway via procfs, so we shouldn't break that. Instead, I
> think there could be a config knob (or maybe reuse 'n_{r,t}xq_desc'?) that
> would be used when set, with the default applied otherwise.

If my memory serves me right, calling setsockopt(SO_RCVBUF) disables the
kernel's receive-buffer auto-tuning, which very often hurts TCP performance,
especially under load.

-- 
fbl
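
A minimal standalone sketch of the two behaviors discussed above: the kernel
clamping an SO_RCVBUF request to net.core.rmem_max (so the operator keeps
control via procfs/sysctl), and the caveat that an explicit SO_RCVBUF locks
out auto-tuning on TCP sockets. This is illustrative only, not the OVS patch
code; the 1073741823 request is the value quoted in the thread.

/* sketch: observe how SO_RCVBUF requests are clamped on Linux */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* UDP socket, as in the veth iperf3 test above. */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    /* Request the value used by the patch.  Without CAP_NET_ADMIN
     * (SO_RCVBUFFORCE), the kernel clamps the request to
     * net.core.rmem_max, so rmem_max remains the effective ceiling. */
    int requested = 1073741823;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof requested) < 0) {
        perror("setsockopt(SO_RCVBUF)");
    }

    /* Linux doubles the accepted value to account for bookkeeping
     * overhead, so this prints roughly min(requested, rmem_max) * 2. */
    int effective;
    socklen_t len = sizeof effective;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) == 0) {
        printf("requested %d, effective %d\n", requested, effective);
    }

    /* Caveat from the reply above: on a TCP socket, any explicit
     * SO_RCVBUF call pins the buffer size and disables the kernel's
     * receive-buffer auto-tuning for that socket. */
    close(fd);
    return 0;
}

Raising or lowering net.core.rmem_max changes what this sketch prints, which
is why the final value stays operator-configurable even with the patch applied.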
