On 9/8/2021 9:57 PM, Eelco Chaudron wrote:

On 8 Sep 2021, at 13:52, Chris Mi wrote:

On 9/6/2021 5:47 PM, Eelco Chaudron wrote:
On 6 Sep 2021, at 11:14, Chris Mi wrote:

On 9/3/2021 8:54 PM, Eelco Chaudron wrote:
On 3 Sep 2021, at 14:02, Eelco Chaudron wrote:

      On 15 Jul 2021, at 8:01, Chris Mi wrote:

          This patch set adds offload support for sFlow.

          Psample is a genetlink channel for packet sampling. TC action
          act_sample
          uses psample to send sampled packets to userspace.

          When offloading sample action to TC, userspace creates a
          unique ID to
          map sFlow action and tunnel info and passes this ID to kernel
          instead
          of the sFlow info. psample will send this ID and sampled packet to
          userspace. Using the ID, userspace can recover the sFlow info
          and send
          sampled packet to the right sFlow monitoring host.

      Hi Chris,

      One thing missing from this patchset is a test case. I think we
      should add it, as I’m going over this manually every patch iteration.

      I would add the following:

       *

          Set the sample rate to 1, send 100 packets and make sure you
          receive all of them

            o Also, verify the output of “ovs-appctl dpctl/dump-flows
              system@ovs-system type=tc” is correct, i.e., matches the
              kernel output
       *

          Set the sample rate to 10, send 100 packets and make sure you
          receive 10.

            o Also, verify the output of “ovs-appctl dpctl/dump-flows
              system@ovs-system type=tc” is correct, i.e., matches the
              kernel output

      Cheers,

      Eelco

      PS: I also had a problem where only one packet got sent to the
      collector, and then no more packets would arrive. Of course, when
      I added some debug code, it never happened, and when removing the
      debug code, it also worked fine :( Did you ever experience
      something like this? I will play a bit more when reviewing
      specific code, maybe it will happen again.


One additional thing I’m seeing is the following:

$ ovs-appctl dpctl/dump-flows system@ovs-system type=tc
recirc_id(0),in_port(3),eth(src=52:54:00:88:51:38,dst=04:f4:bc:28:57:01),eth_type(0x0800),ipv4(frag=no),
 packets:7144, bytes:600096, used:7.430s, 
actions:sample(sample=10.0%,actions(userspace(pid=3925927229,sFlow(vid=0,pcp=0,output=2147483650),actions))),2

Sometimes I get a rather large sFlow(output=) value in the sFlow output. 
Converting the above to hex, 0x80000002, where I see this in fix_sflow_action() 
as being mentioned multiple output ports? This seems wrong to me as it should 
be 3 (as it always is in the none hw offload case). This odd value is also 
reported to the sFlow samples.

Unfortunately, I do not have a reproducer for this.

Actually, in the very beginning of the development when I have a lot of debug 
code, I also observed such odd port number sometimes.
Such rule will disappear soon. So it is hard to test such corner case. And it 
doesn't affect the functionality.
So current code didn't deal with such corner case.
I’m getting this almost every time I restart OVS and initiate a flow (using a 
simple ping). This does not go away by itself, it stays until the flow times 
out. So this should be investigated and fixed.

My test setup is rather simple, just two ports, one is a VM one is a external 
host, which I ping from the VM. All default config, so using the NORMAL rule. 
This is my sFlow config:

   ovs-vsctl -- --id=@sflow create sflow agent=lo \
        target="127.0.0.1" header=128 \
        sampling=10 polling=4 \
        -- set bridge ovs_pvp_br0 sflow=@sflow

When writing the test, I can reproduce it. Not sure what changed. But it 
happens even without offload. I don't know what need to be fixed.
OVS generates the DP rule and installs it. If offload is enabled, 
sFlow(vid=0,pcp=0,output=2147483650) (actually the whole action) is saved in
dpif_sflow_attr->action when installing the DP rule. When receiving sampled 
packets, tc or driver passes the gid to ovs daemon and
the daemon will find the dpif_sflow_attr->action using the gid. TC didn't 
change it. It is up to OVS sflow engine to process it.
Nice! I did not see it in the none-offload case, but if you do, it might not be 
related to your patchset and we should not worry about it for now :)
I’m assuming that you do get the 2147483650 value from the nlmsg attributes 
passed to the offload handler.

Yes, I got 2147483650. But give a second look, my rule is a little different:

recirc_id(0),in_port(3),eth(src=26:f2:ed:17:7d:eb,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(frag=no), packets:0, bytes:0, used:0.000s, actions:userspace(pid=4294967295,sFlow(vid=0,pcp=0,output=2147483650),actions),1,2

or

recirc_id(0),in_port(enp8s0f0_1),eth(src=02:25:d0:10:01:02,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(frag=no), packets:2, bytes:140, used:1.666s, actions:userspace(pid=4294967295,sFlow(vid=0,pcp=0,output=2147483650),actions),br,enp8s0f0_2

It seems the above rule is expected because the output is multiple ports. But your rule only outputs to 1 port. So to be accurate, I haven't reproduced the issue you hit. The suspicious code is dpif_sflow_attr_equal(). In the early version, I compared the whole dpif_sflow_attr, but now I only compare the ufid. Maybe the ufid is duplicate. But it should not happen. Anyway, I will keep an eye on this issue. Thanks for your carefulness.

-Chris
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to