On 15 Mar 2023, at 11:07, Chris Mi wrote:

> On 3/15/2023 5:28 PM, Eelco Chaudron wrote:
>>
>> On 15 Mar 2023, at 4:40, Chris Mi wrote:
>>
>>> On 3/10/2023 10:02 PM, Eelco Chaudron wrote:
>>>> On 10 Mar 2023, at 13:11, Eelco Chaudron wrote:
>>>>
>>>>> On 1 Mar 2023, at 8:22, Chris Mi wrote:
>>>>>
>>>>>> This patch set adds offload support for sFlow.
>>>>>>
>>>>>> Psample is a genetlink channel for packet sampling. TC action act_sample
>>>>>> uses psample to send sampled packets to userspace.
>>>>>>
>>>>>> When offloading sample action to TC, userspace creates a unique ID to
>>>>>> map sFlow action and tunnel info and passes this ID to kernel instead
>>>>>> of the sFlow info. psample will send this ID and sampled packet to
>>>>>> userspace. Using the ID, userspace can recover the sFlow info and send
>>>>>> sampled packet to the right sFlow monitoring host.
>>>>> I started to review, but the test cases are failing in my setups. Maybe 
>>>>> you need to re-base your build as some stuff has changed related to flow 
>>>>> statistic updates.
>>>>>
>>>>> On RHEL9/Fedora 37:
>>>>>
>>>>>     3: sflow offloads with sampling=1 - ping between two ports - offloads 
>>>>> enabled FAILED (system-offloads-traffic.at:133)
>>>>>     4: sflow offloads with sampling=2 - ping between two ports - offloads 
>>>>> enabled FAILED (system-offloads-traffic.at:191)
>>>>>
>>>>>
>>>>> 3:
>>>>>
>>>>> ./system-offloads-traffic.at:133: ovs-appctl dpctl/dump-flows 
>>>>> type=tc,offloaded | grep "in_port(2)" | grep "eth_type(0x0800)" | grep 
>>>>> "actions:userspace" | grep "sFlow" | sed -e 
>>>>> "s/used:[0-9].[0-9]*s/used:0.001s/;s/eth(src=[a-z0-9:]*,dst=
>>>>> [a-z0-9:]*)/eth(macs)/;s/pid=[0-9]*/pid=1/;s/output=$P1_IFINDEX/output=1/"
>>>>> --- -     2023-03-10 12:59:37.008456662 +0100
>>>>> +++ 
>>>>> /root/Documents/reviews/ovs_cmi_tc_sflow/OVS_master_DPDK_v22.11/ovs_github/tests/system-offloads-testsuite.dir/at-groups/3/stdout
>>>>>      2023-03-10 12:59:37.007186662 +0100
>>>>> @@ -1,2 +1,2 @@
>>>>> -recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), 
>>>>> packets:10, bytes:840, used:0.001s, 
>>>>> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3
>>>>> +recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), 
>>>>> packets:19, bytes:1596, used:0.001s, 
>>>>> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3
>>>>>
>>>>>
>>>>> 4:
>>>>>
>>>>> ./system-offloads-traffic.at:191: check_logs
>>>>> --- /dev/null     2023-03-08 13:28:57.628323765 +0100
>>>>> +++ 
>>>>> /root/Documents/reviews/ovs_cmi_tc_sflow/OVS_master_DPDK_v22.11/ovs_github/tests/system-offloads-testsuite.dir/at-groups/4/stdout
>>>>>      2023-03-10 13:04:47.167727980 +0100
>>>>> @@ -0,0 +1 @@
>>>>> +2023-03-10T12:04:35.938Z|00001|dpif(handler1)|WARN|system@ovs-system: 
>>>>> recv failed (No such file or directory)
>>>>>
>>>>>
>>>>> On Fedora 37:
>>>>>
>>>>>     3: sflow offloads with sampling=1 - ping between two ports - offloads 
>>>>> enabled FAILED (system-offloads-traffic.at:133)
>>>>>     4: sflow offloads with sampling=2 - ping between two ports - offloads 
>>>>> enabled FAILED (system-offloads-traffic.at:191)
>>>> More tests are failing with your patch applied in offloads, it looks like 
>>>> due to the same error message as above:
>>>>
>>>>     1: offloads - ping between two ports - offloads disabled FAILED 
>>>> (system-offloads-traffic.at:60)
>>>>     5: offloads - set ingress_policing_rate and ingress_policing_burst - 
>>>> offloads disabled FAILED (system-offloads-traffic.at:217)
>>>>     7: offloads - set ingress_policing_kpkts_rate and 
>>>> ingress_policing_kpkts_burst - offloads disabled FAILED 
>>>> (system-offloads-traffic.at:265)
>>>>     9: offloads - check interface meter offloading -  offloads disabled 
>>>> FAILED (system-offloads-traffic.at:339)
>>>>    11: offloads - check_pkt_len action - offloads disabled FAILED 
>>>> (system-offloads-traffic.at:439)
>>>>
>>>>
>>>> Even worse (almost) all kernel tests (make check-kernel) are failing due 
>>>> to this error, which is not a problem with out your patches :(
>>>>
>>>> system-kmod-testsuite: 1 2 3 4 5 6 7 10 11 12 13 14 15 16 17 18 19 20 22 
>>>> 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
>>>> 48 49 50 51 52 53 54 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 
>>>> 74 75 76 77 78 79 80 81 82 83 84 87 88 89 90 91 92 93 94 95 96 97 98 99 
>>>> 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 
>>>> 118 119 120 121 122 123 124 125 126 133 134 135 136 137 138 139 140 141 
>>>> 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 
>>>> 160 161 162 163 164 165 167 168 169 170 171 failed
>>>>
>>>> Can you please run all unit tests before submitting a new revision? Let me 
>>>> know what the change will be to fix all of this, so I can incorporate it 
>>>> in the review.
>>>>
>>>> //Eelco
>>> I thought github actions would run all the tests. So I didn't run 'make 
>>> check-offloads' manually.
>> No, the datapath tests are not run. So before you submit a patch like this 
>> you need to run make check/check-kerne/check-offloads/check-system-userspace.
> OK.
>>
>>> I reproduced the "recv failed" error. The following diff can fix it:
>>>
>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>>> index 510ba922d..70ba45f99 100644
>>> --- a/lib/netdev-offload-tc.c
>>> +++ b/lib/netdev-offload-tc.c
>>> @@ -3451,7 +3451,7 @@ offload_recv(struct dpif_upcall *upcall, struct 
>>> ofpbuf *buf,
>>>       }
>>>
>>>       if (!psample_sock) {
>>> -        return ENOENT;
>>> +        return EAGAIN;
>>>       }
>>>
>>>       for (;;) {
>>>
>>> Test 3 "sflow offloads with sampling=1" also failed. I changed the test 
>>> based on test 4.
>>> Now it sends 1000 packets instead of 10. After this change, the test result 
>>> is stable.
>> Can you send me the diff also, I can test/review it as part of v24.
> Sure.
>
> diff --git a/tests/system-offloads-traffic.at 
> b/tests/system-offloads-traffic.at
> index e99ea7720..ba9b5d076 100644
> --- a/tests/system-offloads-traffic.at
> +++ b/tests/system-offloads-traffic.at
> @@ -120,30 +120,25 @@ ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
>  ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
>  AT_CHECK([ovs-appctl dpctl/dump-flows], [0], [ignore])
>
> -NS_CHECK_EXEC([at_ns0], [ping -c 10 -w 12 10.1.1.2 | FORMAT_PING], [0], [dnl
> -10 packets transmitted, 10 received, 0% packet loss, time 0ms
> -])
>  AT_CHECK([ovs-vsctl -- --id=@sflow create sflow agent=lo 
> target=\"127.0.0.1:$SFLOW_PORT\" header=128 sampling=1 polling=100 -- set 
> bridge br0 sflow=@sflow], [0], [ignore])
> -NS_CHECK_EXEC([at_ns0], [ping -c 10 -w 12 10.1.1.2 | FORMAT_PING], [0], [dnl
> -10 packets transmitted, 10 received, 0% packet loss, time 0ms
> -])
> +NS_CHECK_EXEC([at_ns0], [ping -c 1000 -i 0.01 -w 12 10.1.1.2], [0], [ignore])

This chance should not be needed, if you send 10 pings, or 1000 you should get 
0% packet loss or else something is wrong. In other test cases these pings work 
just info.

>
> -m4_define([DUMP_SFLOW], [sed -e 
> "s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P1_IFINDEX/output=1/"])
>  P1_IFINDEX=$(cat /sys/class/net/ovs-p1/ifindex)
> +m4_define([DUMP_SFLOW], [sed -e 
> "s/packets:[[0-9]]*/packets:1/;s/bytes:[[0-9]]*/bytes:1/;s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P1_IFINDEX/output=1/"])
>  AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep "in_port(2)" 
> | grep "eth_type(0x0800)" | grep "actions:userspace" | grep "sFlow" | 
> DUMP_SFLOW], [0], [dnl
> -recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), 
> packets:10, bytes:840, used:0.001s, 
> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3
> +recirc_id(0),in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:1, 
> bytes:1, used:0.001s, 
> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),3

Same here we should keep packet and byte counts in the results.

>  ])
>
> -m4_define([DUMP_SFLOW], [sed -e 
> "s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P0_IFINDEX/output=1/"])
>  P0_IFINDEX=$(cat /sys/class/net/ovs-p0/ifindex)
> +m4_define([DUMP_SFLOW], [sed -e 
> "s/packets:[[0-9]]*/packets:1/;s/bytes:[[0-9]]*/bytes:1/;s/used:[[0-9]].[[0-9]]*s/used:0.001s/;s/eth(src=[[a-z0-9:]]*,dst=[[a-z0-9:]]*)/eth(macs)/;s/pid=[[0-9]]*/pid=1/;s/output=$P0_IFINDEX/output=1/"])
>  AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep "in_port(3)" 
> | grep "eth_type(0x0800)" | grep "actions:userspace" | grep "sFlow" | 
> DUMP_SFLOW], [0], [dnl
> -recirc_id(0),in_port(3),eth(macs),eth_type(0x0800),ipv4(frag=no), 
> packets:10, bytes:840, used:0.001s, 
> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),2
> +recirc_id(0),in_port(3),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:1, 
> bytes:1, used:0.001s, 
> actions:userspace(pid=1,sFlow(vid=0,pcp=0,output=1),actions),2
>  ])
>
>  OVS_TRAFFIC_VSWITCHD_STOP
>  OVS_APP_EXIT_AND_WAIT([test-sflow])
>  count=`cat sflow.log | wc -l`
> -AT_CHECK([[[[ $count -le 22 && $count -ge 18 ]]]])
> +AT_CHECK([[[[ $count -le 2100 && $count -ge 1900 ]]]])
>  AT_CLEANUP
>
>  AT_SETUP([sflow offloads with sampling=2 - ping between two ports - offloads 
> enabled])
>>
>>> Do you want me to submit a new version to continue the review?
>> Hi Chris,
>>
>> I’m halfway through v24, so let’s wait so you can incorporate my other 
>> comments. I’ll apply these patches and reply to this email later with any 
>> comments.
>>
>> //Eelco
>>
> And there 2 other changes in file lib/netdev-offload-tc.c.
> The first change is to avoid an assertion: "nla->nla_len >= NLA_HDRLEN + 
> size") at lib/util.c:89".
> The second one is for Ilya's latest comment.
>
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -3425,8 +3655,10 @@ psample_parse_packet(struct offload_psample *psample,
>          return ENOENT;
>      }
>
> -    upcall->key = NULL;
>      upcall->key_len = 0;
> +    upcall->key = NULL;
> +    upcall->mru = NULL;
> +    upcall->hash = NULL;
>      upcall->ufid = sflow->ufid;
>      upcall->userdata = sflow->userdata;
>      upcall->actions = sflow->actions;
>
> @@ -3475,11 +3707,8 @@ offload_recv(struct dpif_upcall *upcall, struct ofpbuf 
> *buf,
>          }
>
>          error = psample_from_ofpbuf(&psample, buf);
> -        if (!error) {
> -            return psample_parse_packet(&psample, upcall);
> -        } else if (error) {
> -            return error;
> -        }
> +
> +        return error ? error : psample_parse_packet(&psample, upcall);
>      }
>
>      return EAGAIN;

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to