On Wed, Sep 26, 2018 at 11:07 PM Justin Pettit <[email protected]> wrote:
>
> Hi, Han.  I'm still trying to come up with a mechanism I like, but in the
meantime, can you try applying this patch and re-running your trace?  This
should provide a better indication of what's causing that field to be
un-wildcarded.
>
> Thanks,
>
> --Justin
>
>
Thanks for helping, Justin. I haven't used the patch yet because it causes
ovs-vswitchd to crash in the test case "ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR".

However, I believe I found the cause of the problem: enabling BFD (for GW
HA) is what causes the un-wildcarding of the UDP flows. BFD uses UDP
(destination port 3784).

E.g.
recirc_id(0),tunnel(tun_id=0x0,src=10.169.108.123,dst=10.169.98.204,flags(-df+csum+key)),in_port(1),eth(),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:476, bytes:31416, used:0.068s,
actions:userspace(pid=3497583927,slow_path(bfd))

There are no userspace flows for UDP because BFD is handled implicitly,
without specific OpenFlow rules, but it still affects megaflow wildcarding.
I verified this in my local test environment by enabling BFD and then running
hping3 -2 ... to generate UDP packets destined to the same IP but different
UDP ports. I then saw the un-wildcarded flows in the datapath, exactly like
the example I provided before. The problem disappears if BFD is not enabled
(i.e. when there is only one GW without HA).
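
For anyone who wants to check for the same symptom, here is a rough sketch
of how the datapath dump could be inspected. It is not part of OVS: the
function name is made up, and it assumes one flow per line in the format
shown above, with the match ending at ", packets:" and an exact
udp(dst=N) field. It groups flows whose matches are identical apart from
the UDP dst port, so a group with many ports points at un-wildcarding.

```python
import re
from collections import defaultdict

# Exact UDP destination-port match as printed in the example above,
# e.g. "udp(dst=3784)". Masked forms such as dst=N/0x... are not handled.
UDP_DST_RE = re.compile(r"udp\(dst=(\d+)\)")


def group_by_match_without_udp_dst(lines):
    """Group datapath flows by their match with the UDP dst port masked out.

    Returns a dict mapping the masked match string to the sorted list of
    distinct UDP dst ports seen with that match.
    """
    groups = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the match is everything before the statistics fields.
        match_part = line.split(", packets:", 1)[0]
        found = UDP_DST_RE.search(match_part)
        if found is None:
            continue  # this flow does not match on an exact UDP dst port
        masked = UDP_DST_RE.sub("udp(dst=*)", match_part)
        groups[masked].add(int(found.group(1)))
    return {match: sorted(ports) for match, ports in groups.items()}


# Hypothetical, shortened sample lines (ports 1000 and 1001 are invented).
sample = [
    "in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=1000), packets:1, bytes:60, used:0.1s, actions:2",
    "in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=1001), packets:1, bytes:60, used:0.1s, actions:2",
]
for match, ports in group_by_match_without_udp_dst(sample).items():
    print(len(ports), "distinct UDP dst ports for", match)
```

In practice the lines would come from a saved ovs-dpctl dump-flows output
rather than from the invented sample.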

Justin, do you have any thoughts on how to solve this problem? (I am not
familiar with this part of OVS and need to study it more.)

As for the patch that traces the un-wildcarding, I think it will still be
very useful for troubleshooting such problems in the future. Please let
me know when you fix it, and I can test it with my current use case.

Thanks,
Han
>
>
>
> > On Sep 6, 2018, at 7:07 PM, Han Zhou <[email protected]> wrote:
> >
> > As mentioned in today's OVN meeting, I observed a problem regarding
megaflow effectiveness. Now I have more details but really need help from
folks here. Originally I thought it was due to the way OVN programs
the flows, but now I even wonder whether it could be a problem in OVS (or my
misunderstanding of OVS).
> >
> > In this OVN setup I observed an unreasonably high flow-miss rate on
chassis nodes, especially the gateway nodes. Then with ovs-dpctl dump-flows
I saw many UDP flows installed with the same match conditions except the UDP
dst port, i.e. only the udp(dst=xxxxx) part is different. This adds latency
to UDP sessions, since in this scenario most of those UDP packets belong to
short-lived flows and are all processed in the slow path. However, I didn't
see such a problem for TCP.
> >
> > At first I suspected that the DHCP port related flows in the port-security
stage could cause the megaflow to always have the UDP port in the match
condition, but it turns out this is not the case because the DHCP flows match
the specific broadcast IP and 0.0.0.0 only. I confirmed this by debugging on
the gateway node: there are no UDP related flows installed at all in
gateway node userspace because port-security is not relevant for the gateway
node, and there is no flow with tp_dst in the match condition either. I
verified this with ovs-ofctl dump-flows br-int | grep udp, i.e. the commands
below return no result:
> >
> > # ovs-ofctl dump-flows br-int | grep udp
> > # ovs-ofctl dump-flows br-int | grep tp_dst
> > # ovs-ofctl dump-flows br-int | grep tp_src
> >
> > So could anyone explain what could cause so many megaflows to be installed
in the datapath, each with a specific UDP dst port?
> >
> > Please find the example flows and trace, ovn-detrace output here:
> > https://gist.github.com/hzhou8/763e18c209aab40e7f3d6bd0690cea6f
> >
> > Thanks,
> > Han
> >
> > _______________________________________________
> > discuss mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
