On 8 Sep 2021, at 13:52, Chris Mi wrote:
> On 9/6/2021 5:47 PM, Eelco Chaudron wrote:
>>
>> On 6 Sep 2021, at 11:14, Chris Mi wrote:
>>
>>> On 9/3/2021 8:54 PM, Eelco Chaudron wrote:
>>>>
>>>> On 3 Sep 2021, at 14:02, Eelco Chaudron wrote:
>>>>
>>>>     On 15 Jul 2021, at 8:01, Chris Mi wrote:
>>>>
>>>>         This patch set adds offload support for sFlow.
>>>>
>>>>         Psample is a genetlink channel for packet sampling. TC action
>>>>         act_sample uses psample to send sampled packets to userspace.
>>>>
>>>>         When offloading sample action to TC, userspace creates a
>>>>         unique ID to map sFlow action and tunnel info and passes this
>>>>         ID to kernel instead of the sFlow info. psample will send this
>>>>         ID and sampled packet to userspace. Using the ID, userspace
>>>>         can recover the sFlow info and send sampled packet to the
>>>>         right sFlow monitoring host.
>>>>
>>>>     Hi Chris,
>>>>
>>>>     One thing missing from this patchset is a test case. I think we
>>>>     should add it, as I’m going over this manually every patch
>>>>     iteration.
>>>>
>>>>     I would add the following:
>>>>
>>>>     * Set the sample rate to 1, send 100 packets and make sure you
>>>>       receive all of them
>>>>
>>>>       o Also, verify the output of “ovs-appctl dpctl/dump-flows
>>>>         system@ovs-system type=tc” is correct, i.e., matches the
>>>>         kernel output
>>>>
>>>>     * Set the sample rate to 10, send 100 packets and make sure you
>>>>       receive 10.
>>>>
>>>>       o Also, verify the output of “ovs-appctl dpctl/dump-flows
>>>>         system@ovs-system type=tc” is correct, i.e., matches the
>>>>         kernel output
>>>>
>>>>     Cheers,
>>>>
>>>>     Eelco
>>>>
>>>>     PS: I also had a problem where only one packet got sent to the
>>>>     collector, and then no more packets would arrive. Of course, when
>>>>     I added some debug code, it never happened, and when removing the
>>>>     debug code, it also worked fine :( Did you ever experience
>>>>     something like this? I will play a bit more when reviewing
>>>>     specific code, maybe it will happen again.
>>>>
>>>> One additional thing I’m seeing is the following:
>>>>
>>>> $ ovs-appctl dpctl/dump-flows system@ovs-system type=tc
>>>> recirc_id(0),in_port(3),eth(src=52:54:00:88:51:38,dst=04:f4:bc:28:57:01),eth_type(0x0800),ipv4(frag=no),
>>>> packets:7144, bytes:600096, used:7.430s,
>>>> actions:sample(sample=10.0%,actions(userspace(pid=3925927229,sFlow(vid=0,pcp=0,output=2147483650),actions))),2
>>>>
>>>> Sometimes I get a rather large sFlow(output=) value in the sFlow output.
>>>> Converting the above to hex gives 0x80000002, which I see mentioned in
>>>> fix_sflow_action() as meaning multiple output ports? This seems wrong
>>>> to me, as it should be 3 (as it always is in the non-hw-offload case).
>>>> This odd value is also reported in the sFlow samples.
>>>>
>>>> Unfortunately, I do not have a reproducer for this.
>>>>
>>> Actually, at the very beginning of the development, when I had a lot of
>>> debug code, I also observed such an odd port number sometimes.
>>> Such a rule disappears soon, so it is hard to test this corner case. And
>>> it doesn't affect the functionality.
>>> So the current code doesn't deal with this corner case.
>> I’m getting this almost every time I restart OVS and initiate a flow (using
>> a simple ping). This does not go away by itself; it stays until the flow
>> times out. So this should be investigated and fixed.
>>
>> My test setup is rather simple, just two ports: one is a VM and one is an
>> external host, which I ping from the VM. All default config, so using the
>> NORMAL rule. This is my sFlow config:
>>
>> ovs-vsctl -- --id=@sflow create sflow agent=lo \
>>     target="127.0.0.1" header=128 \
>>     sampling=10 polling=4 \
>>     -- set bridge ovs_pvp_br0 sflow=@sflow
>>
> When writing the test, I can reproduce it. Not sure what changed. But it
> happens even without offload. I don't know what needs to be fixed.
> OVS generates the DP rule and installs it.
> If offload is enabled, sFlow(vid=0,pcp=0,output=2147483650) (actually
> the whole action) is saved in dpif_sflow_attr->action when installing
> the DP rule. When receiving sampled packets, tc or the driver passes the
> gid to the ovs daemon, and the daemon finds the dpif_sflow_attr->action
> using the gid. TC doesn't change it. It is up to the OVS sFlow engine to
> process it.

Nice! I did not see it in the non-offload case, but if you do, it might
not be related to your patchset and we should not worry about it for now :)

I’m assuming that you do get the 2147483650 value from the nlmsg
attributes passed to the offload handler.
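
For reference, here is a minimal sketch of how the odd output value from the
dump above breaks down. It assumes the sFlow version 5 interface encoding,
where the top 2 bits of the 32-bit field select a format and the low 30 bits
carry the value. The function name decode_sflow_output() and the label
strings are made up for illustration; they are not OVS code.

```python
# Sketch only: decode an sFlow "output" interface value, assuming the
# sFlow v5 layout (top 2 bits = format, low 30 bits = value).
# decode_sflow_output() is a hypothetical helper, not part of OVS.

VALUE_BITS = 30
VALUE_MASK = (1 << VALUE_BITS) - 1

# Format codes as described in the sFlow v5 interface encoding.
FORMATS = {
    0: "single interface",      # value is the ifIndex
    1: "packet discarded",      # value is a reason code
    2: "multiple interfaces",   # value is the number of output interfaces
}


def decode_sflow_output(output):
    """Split a 32-bit sFlow output value into (format name, value)."""
    fmt = (output >> VALUE_BITS) & 0x3
    value = output & VALUE_MASK
    return FORMATS.get(fmt, "unknown"), value


if __name__ == "__main__":
    raw = 2147483650  # value taken from the dump-flows output above
    print(hex(raw), decode_sflow_output(raw))
```

Under that assumption, 2147483650 (0x80000002) reads as "multiple
interfaces" with a count of 2, while a plain ifIndex such as 3 would decode
as a single interface.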
