Hi Darrell: the flow dump results are below; please help to check them.
BEFORE:

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1b),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(dst=192.168.1.8/255.255.255.248,proto=6,frag=no),tcp_flags(psh|ack), packets:18934, bytes:26602222, used:0.000s, flags:P., actions:ct(zone=1),recirc(0x1c)
ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1c),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(src=192.168.1.10,dst=192.168.1.8,proto=6,frag=no), packets:5345996, bytes:7676256441, used:0.000s, flags:P., actions:5

AFTER:

ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x19),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(dst=192.168.1.8/255.255.255.248,proto=6,frag=no),tcp_flags(ack), packets:2473174, bytes:3551472384, used:0.136s, flags:., actions:meter(0),ct(zone=1),recirc(0x1a)
ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),recirc_id(0x1a),in_port(6),packet_type(ns=0,id=0),eth(src=fa:16:3e:12:d7:77,dst=fa:16:3e:33:02:d8),eth_type(0x0800),ipv4(src=192.168.1.10,dst=192.168.1.8,proto=6,frag=no), packets:5292889, bytes:7599875381, used:0.046s, flags:P., actions:5

The meter rate is 1 Gbps; the iperf result is around 800 Mbps:

[  5]  95.00-96.00   sec   104 MBytes   869 Mbits/sec
[  5]  96.00-97.00   sec  79.4 MBytes   666 Mbits/sec
[  5]  97.00-98.00   sec   107 MBytes   896 Mbits/sec
[  5]  98.00-99.00   sec  75.4 MBytes   632 Mbits/sec
[  5]  99.00-100.00  sec  98.3 MBytes   824 Mbits/sec
[  5] 100.00-100.04  sec  0.00 Bytes    0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval            Transfer     Bandwidth
[  5]   0.00-100.04  sec  0.00 Bytes   0.00 bits/sec   sender
[  5]   0.00-100.04  sec  9.29 GBytes  798 Mbits/sec   receiver

------------------------------------------------------------------
From: Darrell Ball <[email protected]>
Date: Wednesday, November 6, 2019, 02:46
To: txfh2007 <[email protected]>
Cc: Ben Pfaff <[email protected]>; ovs-discuss <[email protected]>
Subject: Re: [ovs-discuss] Re: Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack

Hi Timo

On Mon, Nov 4, 2019 at 11:29 PM txfh2007 <[email protected]> wrote:

> Hi Darrell:
>
> The meter rate limit is set to 1 Gbps, but the actual rate is around 500 Mbps.
> I have read the meter patch, but that patch only prevents delta_t from
> becoming 0; in my case, delta_t is around 35500 ms.

It might be good to just include all known related fixes anyway, including this other one:
https://github.com/openvswitch/ovs/commit/acc5df0e3cb036524d49891fdb9ba89b609dd26a

> For my case, the meter action is on OpenFlow table 46, the ct action is on
> table 44, and the output action is on table 65, so I guess the order is right?

Could you dump the 'relevant' datapath flows before adding the meter rule and after adding the meter rule?

ovs-appctl dpif/dump-flows <bridge>

> Thanks
> Timo
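For context on the delta_t discussion above: a userspace datapath meter band is essentially a token bucket, refilled according to the time elapsed since the band was last hit, and the fixes being referenced harden exactly that elapsed-time math. Below is a minimal C sketch of the idea; band_sketch, band_charge, and MAX_DELTA_MS are illustrative names and values, not the actual code in lib/dpif-netdev.c.

#include <stdbool.h>
#include <stdint.h>

#define MAX_DELTA_MS 60000   /* illustrative cap on one refill interval */

struct band_sketch {
    uint64_t rate_bps;       /* configured band rate, bits/s (1 Gbps here) */
    uint64_t burst_bits;     /* bucket depth, bits */
    uint64_t bucket_bits;    /* currently available credit, bits */
    uint64_t last_hit_ms;    /* when the bucket was last refilled */
};

/* Refill the bucket for the elapsed time, then charge one packet of
 * 'pkt_bits' bits; true means it conforms (forward), false means it
 * exceeds the band (drop). */
static bool
band_charge(struct band_sketch *b, uint64_t now_ms, uint64_t pkt_bits)
{
    uint64_t delta_t = now_ms - b->last_hit_ms;

    /* The referenced fixes guard this value: delta_t == 0 adds no
     * credit at all, while an enormous delta_t (first hit, clock jump)
     * would otherwise dump a huge credit burst into the bucket. */
    if (delta_t > MAX_DELTA_MS) {
        delta_t = MAX_DELTA_MS;
    }
    b->last_hit_ms = now_ms;

    b->bucket_bits += delta_t * b->rate_bps / 1000;  /* delta_t ms at rate_bps */
    if (b->bucket_bits > b->burst_bits) {
        b->bucket_bits = b->burst_bits;              /* cap at the burst size */
    }

    if (pkt_bits <= b->bucket_bits) {
        b->bucket_bits -= pkt_bits;
        return true;
    }
    return false;
}

Note that TCP backing off after meter drops also produces a sawtooth, so the 632-896 Mbit/s spread in the iperf intervals above does not by itself prove the credit math is wrong.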
------------------------------------------------------------------
From: Darrell Ball <[email protected]>
Date: Tuesday, November 5, 2019, 06:56
To: txfh2007 <[email protected]>
Cc: Ben Pfaff <[email protected]>; ovs-discuss <[email protected]>
Subject: Re: [ovs-discuss] Re: Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack

Hi Timo

On Sun, Nov 3, 2019 at 5:12 PM txfh2007 <[email protected]> wrote:

> Hi Darrell:
>
> Sorry for my late reply. Yes, the two VMs under test are on the same compute
> node, and packets are rx/tx via vhost-user type ports.

Got it.

> Firstly, if I don't configure the meter table, the iperf TCP bandwidth from
> VM1 to VM2 is around 5 Gbps; then I set the meter entry and constrain the
> rate, and the deviation is larger than I thought.

IIUC, pre-meter you get 5 Gbps, then post-meter 0.5 Gbps, which is less than you expected? What did you expect the metered rate to be? Note Ben pointed you to a meter-related bug fix on the alias before.

> I guess the recalculation of the L4 checksum during conntrack would impact
> the actual rate?

Are you applying the meter rule at the end of the complete pipeline?

> Thank you
> Timo

------------------------------------------------------------------
To: txfh2007 <[email protected]>
Cc: Ben Pfaff <[email protected]>; ovs-discuss <[email protected]>
Subject: Re: [ovs-discuss] Re: Re: [HELP] Question about icmp pkt marked Invalid by userspace conntrack

Hi Timo

I read through this thread to get more context on what you are doing; you have a base OVS-DPDK use case and are measuring VM-to-VM performance across 2 compute nodes. You are probably using vhost-user-client ports? Please correct me if I am wrong. In this case, "per direction" you have one rx virtual interface to handle in OVS; there will be a tradeoff between checksum-validation security and performance.

Just to be clear, in terms of your measurements, how did you arrive at the 5 Gbps - instrumented code or otherwise? (I can verify that later when I have a setup.)

Darrell

On Thu, Oct 31, 2019 at 9:23 AM Darrell Ball <[email protected]> wrote:

On Thu, Oct 31, 2019 at 3:04 AM txfh2007 via discuss <[email protected]> wrote:

> Hi Ben && Darrell:
>
> This patch works, but after merging it I have found the iperf throughput
> decrease from 5 Gbps+ to 500 Mbps.

What is the 5 Gbps number? Is that the number with marking all packets as invalid in the initial sanity checks?

Typically one wants to offload checksum checks. The code checks whether that has been done and skips doing it in software; can you verify that you have the capability and are using it? Skipping checksum checks reduces security, of course, but it can be added if there is a common case of not being able to offload checksumming.

> I guess maybe we should add a switch to turn off layer-4 checksum validation
> when doing userspace conntrack? I have found that kernel conntrack has a
> related knob named "nf_conntrack_checksum". Any advice?
>
> Thank you!
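The "skip the software check when the NIC already verified it" logic, combined with the nf_conntrack_checksum-style switch proposed above, would amount to roughly the following sketch. The struct, the RX_L4_CKSUM_GOOD flag, and the function names here are illustrative stand-ins (DPDK and OVS have their own equivalents), not the exact API of either project.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_L4_CKSUM_GOOD (1ULL << 3)   /* illustrative "NIC verified L4" flag */

struct pkt_sketch {
    uint64_t ol_flags;   /* rx offload flags set by the NIC/driver */
    const uint8_t *l4;   /* start of the L4 header */
    size_t l4_len;       /* length of the L4 portion */
};

/* Ones'-complement sum over the L4 bytes; returns 0 when the embedded
 * checksum is valid. (A real TCP/UDP check would also fold in the
 * pseudo-header.) */
static uint16_t
sw_l4_csum(const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    while (n > 1) {
        sum += (uint32_t) ((b[0] << 8) | b[1]);  /* byte loads: alignment-safe */
        b += 2;
        n -= 2;
    }
    if (n) {
        sum += (uint32_t) b[0] << 8;             /* pad a trailing odd byte */
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);      /* fold carries back in */
    }
    return (uint16_t) ~sum;
}

/* 'validate' plays the role of the proposed nf_conntrack_checksum-like
 * switch: false skips L4 validation entirely. */
static bool
l4_csum_ok(const struct pkt_sketch *p, bool validate)
{
    if (!validate) {
        return true;                             /* knob off: accept unchecked */
    }
    if (p->ol_flags & RX_L4_CKSUM_GOOD) {
        return true;                             /* NIC verified; skip software sum */
    }
    return sw_l4_csum(p->l4, p->l4_len) == 0;    /* software fallback */
}

With 'validate' false the whole sum is skipped, which is exactly the security-versus-performance tradeoff Darrell mentions: a corrupted segment would then be admitted into conntrack state.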
------------------------------------------------------------------
To: Ben Pfaff <[email protected]>
Cc: ovs-discuss <[email protected]>
Subject: Re: Re: [ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack

Hi Ben && Darrell:

Thanks, this patch works! Now the issue seems fixed.

Timo

Subject: Re: Re: [ovs-discuss] [HELP] Question about icmp pkt marked Invalid by userspace conntrack

I see. It sounds like Darrell pointed out the solution, but please let me know if it did not help.

On Fri, Oct 11, 2019 at 08:57:58AM +0800, txfh2007 wrote:
> Hi Ben:
>
> I just found the GCC_UNALIGNED_ACCESSORS error during the gdb trace and am
> not sure whether this is a misalignment error or something else. What I can
> confirm is that during "extract_l4" of this icmp reply packet, when we do
> "check_l4_icmp", the unaligned error emits and "extract_l4" returns false,
> so this packet is marked as ct_state=invalid.
>
> Thank you for your help.
>
> Timo
>
> Topic: Re: [ovs-discuss] [HELP] Question about icmp pkt marked Invalid by
> userspace conntrack
>
> It's very surprising. Are you using a RISC architecture that insists on
> aligned accesses? On the other hand, if you are using x86-64 or some other
> architecture that ordinarily does not care, are you sure that this is about
> a misaligned access (it is more likely to simply be a bad pointer)?
>
> On Thu, Oct 10, 2019 at 10:50:33PM +0800, txfh2007 via discuss wrote:
> >
> > Hi all:
> >
> > I was using OVS-DPDK (version 2.10-1), and I found that pinging between two
> > VMs on different compute nodes failed. I checked my environment and found
> > that one node's NIC cannot strip the CRC of a frame, while the other node's
> > NIC is normal (I mean it can strip the CRC). The reason for the ping failure
> > is that the icmp reply pkt (from the node whose NIC cannot strip the CRC) is
> > marked as invalid. So the icmp request from Node A is 64 bytes, but the icmp
> > reply from Node B is 68 bytes (with a 4-byte CRC). And when doing
> > "check_l4_icmp" we call the csum code (in lib/csum.c), and the error is
> > raised in the GCC_UNALIGNED_ACCESSORS code. The backtrace is as below:
> >
> > I just want to confirm whether this phenomenon is reasonable?
> >
> > Many thanks
> >
> > Timo
> >
> > get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > 89              GCC_UNALIGNED_ACCESSORS(ovs_be16, be16);
> > (gdb) bt
> > #0  get_unaligned_be16 (p=0x7f2ad0b1ed5c) at lib/unaligned.h:89
> > #1  0x000000000075a584 in csum_continue (partial=0, data_=0x7f2ad0b1ed5c, n=68) at lib/csum.c:46
> > #2  0x000000000075a552 in csum (data=0x7f2ad0b1ed5c, n=68) at lib/csum.c:33
> > #3  0x00000000008ddf18 in check_l4_icmp (data=0x7f2ad0b1ed5c, size=68, validate_checksum=true) at lib/conntrack.c:1638
> > #4  0x00000000008de650 in extract_l4 (key=0x7f32a20df120, data=0x7f2ad0b1ed5c, size=68, related=0x7f32a20df15d, l3=0x7f2ad0b1ed48, validate_checksum=true) at lib/conntrack.c:1888
> > #5  0x00000000008de90d in conn_key_extract (ct=0x7f32b42a2d98, pkt=0x7f2ad0b1e9c0, dl_type=8, ctx=0x7f32a20df120, zone=4) at lib/conntrack.c:1973
> > #6  0x00000000008dd49c in conntrack_execute (ct=0x7f32b42a2d98, pkt_batch=0x7f32a20e08b0, dl_type=8, force=false, commit=false, zone=4, setmark=0x0, setlabel=0x0, tp_src=0, tp_dst=0, helper=0x0, nat_action_info=0x0, now=5395897849) at lib/conntrack.c:1318
> > #7  0x000000000076d651 in dp_execute_cb (aux_=0x7f32a20dfb00, packets_=0x7f32a20e08b0, a=0x7f32a20e0ac8, should_steal=false) at lib/dpif-netdev.c:6711
> > #8  0x00000000007b2d49 in odp_execute_actions (dp=0x7f32a20dfb00, batch=0x7f32a20e08b0, steal=true, actions=0x7f32a20e0ac8, actions_len=20, dp_execute_action=0x76ca60 <dp_execute_cb>) at lib/odp-execute.c:726
> > #9  0x000000000076d71b in dp_netdev_execute_actions (pmd=0x7f2a6e1ce010, packets=0x7f32a20e08b0, should_steal=true, flow=0x7f32a20dfb60, actions=0x7f32a20e0ac8, actions_len=20) at lib/dpif-netdev.c:6754
> > #10 0x000000000076b900 in handle_packet_upcall (pmd=0x7f2a6e1ce010, packet=0x7f2ad0b1e9c0, key=0x7f32a20e1100, actions=0x7f32a20e0a40, put_actions=0x7f32a20e0a80) at lib/dpif-netdev.c:6056
> > #11 0x000000000076bdf0 in fast_path_processing (pmd=0x7f2a6e1ce010, packets_=0x7f32a20e2b60, keys=0x7f32a20e10c0, batches=0x7f32a20e0f90, n_batches=0x7f32a20e13c0, in_port=15) at lib/dpif-netdev.c:6153
> > #12 0x000000000076c3df in dp_netdev_input__ (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60, md_is_valid=true, port_no=0) at lib/dpif-netdev.c:6230
> > #13 0x000000000076c4d4 in dp_netdev_recirculate (pmd=0x7f2a6e1ce010, packets=0x7f32a20e2b60) at lib/dpif-netdev.c:6265
> > #14 0x000000000076ceae in dp_execute_cb (aux_=0x7f32a20e1db0, packets_=0x7f32a20e2b60, a=0x7f32a20e2d78, should_steal=true)
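Two notes on the backtrace, for anyone hitting this later. First, on the alignment question: accessors like lib/unaligned.h's get_unaligned_be16() exist precisely so that misaligned packet fields can be read safely. The two standard safe patterns look like this; a sketch of the general technique, not OVS's exact macro expansion, and the function names are made up here.

#include <arpa/inet.h>   /* ntohs */
#include <stdint.h>
#include <string.h>

/* Copy the raw bytes first; memcpy carries no alignment requirement, so
 * the compiler emits whatever load sequence is legal on the target. */
static inline uint16_t
get_be16_memcpy(const void *p)
{
    uint16_t x;
    memcpy(&x, p, sizeof x);
    return ntohs(x);             /* network byte order -> host order */
}

/* Or assemble the value from individual bytes; single-byte loads can
 * never fault on alignment, whatever the architecture. */
static inline uint16_t
get_be16_bytes(const void *p)
{
    const uint8_t *b = p;
    return (uint16_t) ((b[0] << 8) | b[1]);
}

As Ben points out, x86-64 tolerates plain misaligned scalar loads anyway, so a fault at get_unaligned_be16() on that architecture is more consistent with a bad pointer than with alignment.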
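Second, on the 64-versus-68-byte point: even with perfectly safe reads, validating an ICMP checksum over a frame that still carries its 4-byte Ethernet FCS has to fail, because the four extra bytes are folded into the sum. A self-contained demo of that arithmetic (standard RFC 1071 ones'-complement sum; the packet contents are made up for illustration):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* RFC 1071 ones'-complement checksum; returns 0 for a buffer whose
 * embedded checksum field is correct. */
static uint16_t
icmp_csum(const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    while (n > 1) {
        sum += (uint32_t) ((b[0] << 8) | b[1]);  /* byte loads: alignment-safe */
        b += 2;
        n -= 2;
    }
    if (n) {
        sum += (uint32_t) b[0] << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}

int
main(void)
{
    uint8_t pkt[68];

    /* A 64-byte ICMP echo reply with arbitrary payload; the checksum
     * field (bytes 2-3) starts as zero and is then filled in. */
    memset(pkt, 0xa5, 64);
    pkt[0] = 0;                  /* type: echo reply */
    pkt[1] = 0;                  /* code */
    pkt[2] = pkt[3] = 0;         /* checksum placeholder */
    uint16_t c = icmp_csum(pkt, 64);
    pkt[2] = (uint8_t) (c >> 8);
    pkt[3] = (uint8_t) (c & 0xff);

    /* Four trailing bytes standing in for the un-stripped Ethernet FCS. */
    memset(pkt + 64, 0x5a, 4);

    printf("verify over 64 bytes: 0x%04x (0 = valid)\n", icmp_csum(pkt, 64));
    printf("verify over 68 bytes: 0x%04x (non-zero = invalid)\n",
           icmp_csum(pkt, 68));
    return 0;
}

So once the NIC that cannot strip the CRC hands up a 68-byte reply, conntrack's checksum validation over size=68 (as in the backtrace) would mark the packet invalid even when nothing is misaligned.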
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
