On 15 Mar 2023, at 11:07, Chris Mi wrote:
> On 3/15/2023 5:28 PM, Eelco Chaudron wrote: >> >> On 15 Mar 2023, at 4:40, Chris Mi wrote: >> >>> On 3/10/2023 10:02 PM, Eelco Chaudron wrote: >>>> On 10 Mar 2023, at 13:11, Eelco Chaudron wrote: >>>> >>>>> On 1 Mar 2023, at 8:22, Chris Mi wrote: >>>>> >>>>>> This patch set adds offload support for sFlow. >>>>>> >>>>>> Psample is a genetlink channel for packet sampling. TC action act_sample >>>>>> uses psample to send sampled packets to userspace. >>>>>> >>>>>> When offloading sample action to TC, userspace creates a unique ID to >>>>>> map sFlow action and tunnel info and passes this ID to kernel instead >>>>>> of the sFlow info. psample will send this ID and sampled packet to >>>>>> userspace. Using the ID, userspace can recover the sFlow info and send >>>>>> sampled packet to the right sFlow monitoring host. >>>>> I started to review, but the test cases are failing in my setups. Maybe >>>>> you need to re-base your build as some stuff has changed related to flow >>>>> statistic updates. >>>>> >>>>> On RHEL9/Fedora 37: >>>>> >>>>> 3: sflow offloads with sampling=1 - ping between two ports - offloads >>>>> enabled FAILED (system-offloads-traffic.at:133) >>>>> 4: sflow offloads with sampling=2 - ping between two ports - offloads >>>>> enabled FAILED (system-offloads-traffic.at:191) >>>>> >>>>> >>>>> 3: >>>>> >>>>> ./system-offloads-traffic.at:133: ovs-appctl dpctl/dump-flows >>>>> type=tc,offloaded | grep "in_port(2)" | grep "eth_type(0x0800)" | grep >>>>> "actions:userspace" | grep "sFlow" | sed -e >>>>> "s/used:[0-9].[0-9]*s/used:0.001s/;s/eth(src=[a-z0-9:]*,dst= >>>>> [a-z0-9:]*)/eth(macs)/;s/pid=[0-9]*/pid=1/;s/output=$P1_IFINDEX/output=1/" >>>>> --- - 2023-03-10 12:59:37.008456662 +0100 >>>>> +++ >>>>> /root/Documents/reviews/ovs_cmi_tc_sflow/OVS_master_DPDK_v22.11/ovs_github/tests/system-offloads-testsuite.dir/at-groups/3/stdout >>>>> 2023-03-10 12:59:37.007186662 +0100 >>>>> @@ -1,2 +1,2 @@ >>>>> -recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), >>>>> packets:10, bytes:840, used:0.001s, >>>>> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3 >>>>> +recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), >>>>> packets:19, bytes:1596, used:0.001s, >>>>> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3 >>>>> >>>>> >>>>> 4: >>>>> >>>>> ./system-offloads-traffic.at:191: check_logs >>>>> --- /dev/null 2023-03-08 13:28:57.628323765 +0100 >>>>> +++ >>>>> /root/Documents/reviews/ovs_cmi_tc_sflow/OVS_master_DPDK_v22.11/ovs_github/tests/system-offloads-testsuite.dir/at-groups/4/stdout >>>>> 2023-03-10 13:04:47.167727980 +0100 >>>>> @@ -0,0 +1 @@ >>>>> +2023-03-10T12:04:35.938Z|00001|dpif(handler1)|WARN|system@ovs-system: >>>>> recv failed (No such file or directory) >>>>> >>>>> >>>>> On Fedora 37: >>>>> >>>>> 3: sflow offloads with sampling=1 - ping between two ports - offloads >>>>> enabled FAILED (system-offloads-traffic.at:133) >>>>> 4: sflow offloads with sampling=2 - ping between two ports - offloads >>>>> enabled FAILED (system-offloads-traffic.at:191) >>>> More tests are failing with your patch applied in offloads, it looks like >>>> due to the same error message as above: >>>> >>>> 1: offloads - ping between two ports - offloads disabled FAILED >>>> (system-offloads-traffic.at:60) >>>> 5: offloads - set ingress_policing_rate and ingress_policing_burst - >>>> offloads disabled FAILED (system-offloads-traffic.at:217) >>>> 7: offloads - set ingress_policing_kpkts_rate and >>>> ingress_policing_kpkts_burst - offloads disabled FAILED >>>> (system-offloads-traffic.at:265) >>>> 9: offloads - check interface meter offloading - offloads disabled >>>> FAILED (system-offloads-traffic.at:339) >>>> 11: offloads - check_pkt_len action - offloads disabled FAILED >>>> (system-offloads-traffic.at:439) >>>> >>>> >>>> Even worse (almost) all kernel tests (make check-kernel) are failing due >>>> to this error, which is not a problem with out your patches :( >>>> >>>> system-kmod-testsuite: 1 2 3 4 5 6 7 10 11 12 13 14 15 16 17 18 19 20 22 >>>> 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 >>>> 48 49 50 51 52 53 54 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 >>>> 74 75 76 77 78 79 80 81 82 83 84 87 88 89 90 91 92 93 94 95 96 97 98 99 >>>> 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 >>>> 118 119 120 121 122 123 124 125 126 133 134 135 136 137 138 139 140 141 >>>> 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 >>>> 160 161 162 163 164 165 167 168 169 170 171 failed >>>> >>>> Can you please run all unit tests before submitting a new revision? Let me >>>> know what the change will be to fix all of this, so I can incorporate it >>>> in the review. >>>> >>>> //Eelco >>> I thought github actions would run all the tests. So I didn't run 'make >>> check-offloads' manually. >> No, the datapath tests are not run. So before you submit a patch like this >> you need to run make check/check-kerne/check-offloads/check-system-userspace. > OK. >> >>> I reproduced the "recv failed" error. The following diff can fix it: >>> >>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c >>> index 510ba922d..70ba45f99 100644 >>> --- a/lib/netdev-offload-tc.c >>> +++ b/lib/netdev-offload-tc.c >>> @@ -3451,7 +3451,7 @@ offload_recv(struct dpif_upcall *upcall, struct >>> ofpbuf *buf, >>> } >>> >>> if (!psample_sock) { >>> - return ENOENT; >>> + return EAGAIN; >>> } >>> >>> for (;;) { >>> >>> Test 3 "sflow offloads with sampling=1" also failed. I changed the test >>> based on test 4. >>> Now it sends 1000 packets instead of 10. After this change, the test result >>> is stable. >> Can you send me the diff also, I can test/review it as part of v24. > Sure. > > diff --git a/tests/system-offloads-traffic.at > b/tests/system-offloads-traffic.at > index e99ea7720..ba9b5d076 100644 > --- a/tests/system-offloads-traffic.at > +++ b/tests/system-offloads-traffic.at > @@ -120,30 +120,25 @@ ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") > ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") > AT_CHECK([ovs-appctl dpctl/dump-flows], [0], [ignore]) > > -NS_CHECK_EXEC([at_ns0], [ping -c 10 -w 12 10.1.1.2 | FORMAT_PING], [0], [dnl > -10 packets transmitted, 10 received, 0% packet loss, time 0ms > -]) > AT_CHECK([ovs-vsctl -- --id=@sflow create sflow agent=lo > target=\"127.0.0.1:$SFLOW_PORT\" header=128 sampling=1 polling=100 -- set > bridge br0 sflow=@sflow], [0], [ignore]) > -NS_CHECK_EXEC([at_ns0], [ping -c 10 -w 12 10.1.1.2 | FORMAT_PING], [0], [dnl > -10 packets transmitted, 10 received, 0% packet loss, time 0ms > -]) > +NS_CHECK_EXEC([at_ns0], [ping -c 1000 -i 0.01 -w 12 10.1.1.2], [0], [ignore]) This chance should not be needed, if you send 10 pings, or 1000 you should get 0% packet loss or else something is wrong. In other test cases these pings work just info. > > -m4_define([DUMP_SFLOW], [sed -e > "s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P1_IFINDEX/output=1/"]) > P1_IFINDEX=$(cat /sys/class/net/ovs-p1/ifindex) > +m4_define([DUMP_SFLOW], [sed -e > "s/packets:[[0-9]]*/packets:1/;s/bytes:[[0-9]]*/bytes:1/;s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P1_IFINDEX/output=1/"]) > AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep "in_port(2)" > | grep "eth_type(0x0800)" | grep "actions:userspace" | grep "sFlow" | > DUMP_SFLOW], [0], [dnl > -recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), > packets:10, bytes:840, used:0.001s, > actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3 > +recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:1, > bytes:1, used:0.001s, > actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3 Same here we should keep packet and byte counts in the results. > ]) > > -m4_define([DUMP_SFLOW], [sed -e > "s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P0_IFINDEX/output=1/"]) > P0_IFINDEX=$(cat /sys/class/net/ovs-p0/ifindex) > +m4_define([DUMP_SFLOW], [sed -e > "s/packets:[[0-9]]*/packets:1/;s/bytes:[[0-9]]*/bytes:1/;s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P0_IFINDEX/output=1/"]) > AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep "in_port(3)" > | grep "eth_type(0x0800)" | grep "actions:userspace" | grep "sFlow" | > DUMP_SFLOW], [0], [dnl > -recirc_id(0),in_port(3),eth(macs),eth_type(0x0800),ipv4(frag=no), > packets:10, bytes:840, used:0.001s, > actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),2 > +recirc_id(0),in_port(3),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:1, > bytes:1, used:0.001s, > actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),2 > ]) > > OVS_TRAFFIC_VSWITCHD_STOP > OVS_APP_EXIT_AND_WAIT([test-sflow]) > count=`cat sflow.log | wc -l` > -AT_CHECK([[[[ $count -le 22 && $count -ge 18 ]]]]) > +AT_CHECK([[[[ $count -le 2100 && $count -ge 1900 ]]]]) > AT_CLEANUP > > AT_SETUP([sflow offloads with sampling=2 - ping between two ports - offloads > enabled]) >> >>> Do you want me to submit a new version to continue the review? >> Hi Chris, >> >> I’m halfway through v24, so let’s wait so you can incorporate my other >> comments. I’ll apply these patches and reply to this email later with any >> comments. >> >> //Eelco >> > And there 2 other changes in file lib/netdev-offload-tc.c. > The first change is to avoid an assertion: "nla->nla_len >= NLA_HDRLEN + > size") at lib/util.c:89". > The second one is for Ilya's latest comment. > > --- a/lib/netdev-offload-tc.c > +++ b/lib/netdev-offload-tc.c > @@ -3425,8 +3655,10 @@ psample_parse_packet(struct offload_psample *psample, > return ENOENT; > } > > - upcall->key = NULL; > upcall->key_len = 0; > + upcall->key = NULL; > + upcall->mru = NULL; > + upcall->hash = NULL; > upcall->ufid = sflow->ufid; > upcall->userdata = sflow->userdata; > upcall->actions = sflow->actions; > > @@ -3475,11 +3707,8 @@ offload_recv(struct dpif_upcall *upcall, struct ofpbuf > *buf, > } > > error = psample_from_ofpbuf(&psample, buf); > - if (!error) { > - return psample_parse_packet(&psample, upcall); > - } else if (error) { > - return error; > - } > + > + return error ? error : psample_parse_packet(&psample, upcall); > } > > return EAGAIN; _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev