On Tue, Mar 14, 2017 at 12:57 AM, Russell Bryant <[email protected]> wrote:
> On Fri, Mar 10, 2017 at 4:48 PM, Russell Bryant <[email protected]> wrote: > > On Fri, Mar 10, 2017 at 2:35 PM, Russell Bryant <[email protected]> wrote: > >> On Thu, Mar 9, 2017 at 11:52 PM, Numan Siddique <[email protected]> > wrote: > >> I don't think it's a Neutron issue. > >> > >> I see the conntrack entry remaining in the UNREPLIED state, even in > >> the working case where there's not an ACL dropping the reply. > >> > >> icmp 1 29 src=10.0.0.10 dst=10.0.0.1 type=8 code=0 id=25857 > >> [UNREPLIED] src=10.0.0.1 dst=10.0.0.10 type=0 code=0 id=25857 mark=0 > >> zone=8 use=1 > >> > >> If I ping a different address (something past the logical router), I > >> see a proper conntrack entry that's not in the UNREPLIED state. > >> > >> I wonder if there's something about how we are generating the ICMP > >> response from the logical router that's making conntrack not properly > >> associate it with the request? > > > > I checked into this and there's no meaningful difference in how we > > form the ICMP reply. > > > > I'm guessing this has to do with the request and reply both going > > through conntrack as a part of processing the same packet in the OVS > > data path. That's not behaving how we would expect. I'll keep > > looking next week to try to identify the root cause here, but I would > > appreciate any insight others may have about the behavior we should > > expect in this scenario. > > I'm able to reproduce this outside of OpenStack. > > https://gist.github.com/russellb/4ab0a9641f12f8ac66fdd6822ee7789e > > This creates an OVN setup with two logical switches connected by a > logical router. > > switch d47412f9-e64a-4734-be26-80ee71ded805 (sw1) > port sw1-port1 > addresses: ["50:54:00:00:00:03 11.0.0.2"] > switch a1730d73-ccab-4f00-b748-3cafb683e9b8 (sw0) > port sw0-port2 > addresses: ["50:54:00:00:00:02 192.168.0.3"] > port sw0-port1 > addresses: ["50:54:00:00:00:01 192.168.0.2"] > router 193f68cd-4a93-4a04-ad3b-3ddf7b5c66f3 (lr0) > > The ping commands at the end demonstrate the problem. My expectation > is that the very last ping command should be successful, but is not > due to the issue we're seeing here. > > This issue seems to be fixed with the below code diff. From my observation, I could see that when the router datapath generates the icmp response and when the reply packet hits the ovs_ct_execute function in datapath - the "sw_flow_key - key" param has old conntrack values - in the function "__ovs_ct_lookup", skb_nfct_cached(net, key, info, skb) returns false. - and the state is not set with the flag- OVS_CS_F_ESTABLISHED. I want to check, if this is the right fix and get your feedback before sending the upstream patch to net-next. >************************************************< diff --git a/datapath/flow.c b/datapath/flow.c index 2bc1ad0..2c59acd 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -845,6 +845,8 @@ int ovs_flow_key_extract_userspace(struct net *net, const struct nlattr *attr, if (err) return err; + ovs_ct_fill_key(skb, key); + /* Check that we have conntrack original direction tuple metadata only * for packets for which it makes sense. Otherwise the key may be * corrupted due to overlapping key fields. >************************************************< Thanks Numan > -- > Russell Bryant > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
