On 9/8/2021 9:57 PM, Eelco Chaudron wrote:
Yes, I got 2147483650. But give a second look, my rule is a little different:On 8 Sep 2021, at 13:52, Chris Mi wrote:On 9/6/2021 5:47 PM, Eelco Chaudron wrote:On 6 Sep 2021, at 11:14, Chris Mi wrote:On 9/3/2021 8:54 PM, Eelco Chaudron wrote:On 3 Sep 2021, at 14:02, Eelco Chaudron wrote: On 15 Jul 2021, at 8:01, Chris Mi wrote: This patch set adds offload support for sFlow. Psample is a genetlink channel for packet sampling. TC action act_sample uses psample to send sampled packets to userspace. When offloading sample action to TC, userspace creates a unique ID to map sFlow action and tunnel info and passes this ID to kernel instead of the sFlow info. psample will send this ID and sampled packet to userspace. Using the ID, userspace can recover the sFlow info and send sampled packet to the right sFlow monitoring host. Hi Chris, One thing missing from this patchset is a test case. I think we should add it, as I’m going over this manually every patch iteration. I would add the following: * Set the sample rate to 1, send 100 packets and make sure you receive all of them o Also, verify the output of “ovs-appctl dpctl/dump-flows system@ovs-system type=tc” is correct, i.e., matches the kernel output * Set the sample rate to 10, send 100 packets and make sure you receive 10. o Also, verify the output of “ovs-appctl dpctl/dump-flows system@ovs-system type=tc” is correct, i.e., matches the kernel output Cheers, Eelco PS: I also had a problem where only one packet got sent to the collector, and then no more packets would arrive. Of course, when I added some debug code, it never happened, and when removing the debug code, it also worked fine :( Did you ever experience something like this? I will play a bit more when reviewing specific code, maybe it will happen again. One additional thing I’m seeing is the following: $ ovs-appctl dpctl/dump-flows system@ovs-system type=tc recirc_id(0),in_port(3),eth(src=52:54:00:88:51:38,dst=04:f4:bc:28:57:01),eth_type(0x0800),ipv4(frag=no), packets:7144, bytes:600096, used:7.430s, actions:sample(sample=10.0%,actions(userspace(pid=3925927229,sFlow(vid=0,pcp=0,output=2147483650),actions))),2 Sometimes I get a rather large sFlow(output=) value in the sFlow output. Converting the above to hex, 0x80000002, where I see this in fix_sflow_action() as being mentioned multiple output ports? This seems wrong to me as it should be 3 (as it always is in the none hw offload case). This odd value is also reported to the sFlow samples. Unfortunately, I do not have a reproducer for this.Actually, in the very beginning of the development when I have a lot of debug code, I also observed such odd port number sometimes. Such rule will disappear soon. So it is hard to test such corner case. And it doesn't affect the functionality. So current code didn't deal with such corner case.I’m getting this almost every time I restart OVS and initiate a flow (using a simple ping). This does not go away by itself, it stays until the flow times out. So this should be investigated and fixed. My test setup is rather simple, just two ports, one is a VM one is a external host, which I ping from the VM. All default config, so using the NORMAL rule. This is my sFlow config: ovs-vsctl -- --id=@sflow create sflow agent=lo \ target="127.0.0.1" header=128 \ sampling=10 polling=4 \ -- set bridge ovs_pvp_br0 sflow=@sflowWhen writing the test, I can reproduce it. Not sure what changed. But it happens even without offload. I don't know what need to be fixed. OVS generates the DP rule and installs it. If offload is enabled, sFlow(vid=0,pcp=0,output=2147483650) (actually the whole action) is saved in dpif_sflow_attr->action when installing the DP rule. When receiving sampled packets, tc or driver passes the gid to ovs daemon and the daemon will find the dpif_sflow_attr->action using the gid. TC didn't change it. It is up to OVS sflow engine to process it.Nice! I did not see it in the none-offload case, but if you do, it might not be related to your patchset and we should not worry about it for now :) I’m assuming that you do get the 2147483650 value from the nlmsg attributes passed to the offload handler.
recirc_id(0),in_port(3),eth(src=26:f2:ed:17:7d:eb,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(frag=no), packets:0, bytes:0, used:0.000s, actions:userspace(pid=4294967295,sFlow(vid=0,pcp=0,output=2147483650),actions),1,2
orrecirc_id(0),in_port(enp8s0f0_1),eth(src=02:25:d0:10:01:02,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(frag=no), packets:2, bytes:140, used:1.666s, actions:userspace(pid=4294967295,sFlow(vid=0,pcp=0,output=2147483650),actions),br,enp8s0f0_2
It seems the above rule is expected because the output is multiple ports. But your rule only outputs to 1 port. So to be accurate, I haven't reproduced the issue you hit. The suspicious code is dpif_sflow_attr_equal(). In the early version, I compared the whole dpif_sflow_attr, but now I only compare the ufid. Maybe the ufid is duplicate. But it should not happen. Anyway, I will keep an eye on this issue. Thanks for your carefulness.
-Chris _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
