On 2017/1/3 12:59, Numan Siddique wrote:


On Tue, Jan 3, 2017 at 2:06 AM, Mickey Spiegel wrote:


    On Mon, Jan 2, 2017 at 3:46 AM, Numan Siddique wrote:



        On Mon, Jan 2, 2017 at 2:07 AM, Mickey Spiegel wrote:


            On Sun, Jan 1, 2017 at 10:31 AM, Numan Siddique wrote:



                On Sun, Jan 1, 2017 at 6:39 AM, Mickey Spiegel wrote:


                    On Sat, Dec 31, 2016 at 1:19 AM, Mickey Spiegel wrote:


                        On Fri, Dec 30, 2016 at 11:37 AM, Mickey
                        Spiegel wrote:


                            On Fri, Dec 30, 2016 at 7:46 AM, Numan
                            Siddique wrote:

                                On Fri, Dec 30, 2016 at 5:36 PM, Dong
                                Jun wrote:


                            <snip>

                                ​
                                Hi Dong Jun, I am also facing the same
                                issue on my setup.

                                These are the findings of my
                                investigation so far:

                                It looks like this issue appeared with
                                the commit
                                https://github.com/openvswitch/ovs/commit/f1a8bd06d58f2c5312622fbaeacbc6ce7576e347
                                which removes the usage of patch ports
                                and uses the clone action instead.

                                I reverted to the commit just before
                                it and SNAT/DNAT is working as
                                expected.

                                In my case, the gateway router is
                                hosted on node 1 and I am trying to
                                reach a VM (192.168.0.5) hosted on
                                node 2 using the external IP
                                (10.2.7.105) associated with it. I
                                could see that node 1 is sending
                                the packet to node 2 through the
                                geneve tunnel, but it is dropped by
                                the flows on node 2.

                                Below is the tcpdump of the packet

                                **************************
                                19:39:44.709907 IP 182.16.0.16.60069 >
                                182.16.0.15.geneve: Geneve, Flags
                                [none], vni 0x1: IP
                                nusiddiq.blr.redhat.com > 192.168.0.5:
                                ICMP echo request, id 13240, seq 1,
                                length 64
                                **************************

                                Below is the tcpdump of the packet
                                with the ovn-controller (without the
                                above commit) in the working case

                                **************************
                                19:41:56.783570 IP 182.16.0.12.29778 >
                                182.16.0.15.geneve: Geneve, Flags
                                [C], vni 0x1, options [8 bytes]: IP
                                nusiddiq.blr.redhat.com > 192.168.0.5:
                                ICMP echo request, id 13308, seq 1,
                                length 64
                                19:41:56.784270 IP 182.16.0.15.14539 >
                                182.16.0.12.geneve: Geneve, Flags
                                [C], vni 0xf, options [8 bytes]: IP
                                192.168.0.5 > nusiddiq.blr.redhat.com:
                                ICMP echo reply, id 13308, seq 1,
                                length 64
                                **************************

                                The options data is 00030005.

                                From the capture, I could see that the
                                packet from node 1 is missing the
                                geneve option fields, which carry the
                                inport and outport keys.


                            I am facing the same issue running my
                            distributed NAT patch set.
                            Between UNSNAT recirc and output to
                            tunnel, a megaflow is installed that
                            is missing the geneve option fields.

                            I verified that the table=32 OpenFlow rule
                            has the geneve option fields.
                            ofproto/trace shows geneve in the
                            "Datapath actions" at the end, so there is
                            no problem with whatever ofproto/trace is using.


                        Throwing some logs in, I see that
                        flow->metadata.present.map is 0 rather
                        than 1 coming into
                        tun_metadata_to_geneve_nlattr() in
                        lib/tun-metadata.c,
                        when the problem occurs. That is why the
                        geneve option fields are missing.

                        I have not yet figured out why
                        flow->metadata.present.map is 0. It should
                        be modified when tun_metadata_write() is
                        called due to actions setting
                        tunnel metadata values. I have not checked
                        that yet.
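
A toy model of the behaviour described above, where options are only
serialized for entries whose bit is set in a presence map. This is a
simplified sketch, not the actual lib/tun-metadata.c logic: the struct
layout and every toy_ name are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_OPTS 64

/* Hypothetical stand-in for tunnel metadata; not the real OVS struct. */
struct toy_tun_metadata {
    uint64_t present_map;          /* Bit i set => opts[i] holds a value. */
    uint32_t opts[TOY_MAX_OPTS];
};

/* Store a value and mark it present, as a write path is expected to do. */
static void
toy_metadata_write(struct toy_tun_metadata *md, unsigned int idx,
                   uint32_t value)
{
    assert(md && idx < TOY_MAX_OPTS);
    md->opts[idx] = value;
    md->present_map |= UINT64_C(1) << idx;
}

/* Copy only the options marked present into 'out'; return how many. */
static size_t
toy_metadata_emit(const struct toy_tun_metadata *md, uint32_t *out,
                  size_t out_len)
{
    size_t n = 0;

    for (unsigned int i = 0; i < TOY_MAX_OPTS && n < out_len; i++) {
        if (md->present_map & (UINT64_C(1) << i)) {
            out[n++] = md->opts[i];
        }
    }
    return n;
}
```

In this model, a value stored in opts[] without the matching bit in
present_map is never emitted, which mirrors the symptom of a tunnel
packet going out with no option fields.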


                    I just posted a fix. I did not try it with the
                    gateway router or with OpenStack,
                    but with this bug fix all distributed NAT manual
                    test cases are now passing.


                Thanks for the fix. I just tested it. It's working
                when I am trying to reach the VM using its floating
                IP, but not when trying to ping www.google.com
                from the VM (SNAT use case).


            With distributed NAT, most of my debugging and tests were
            using SNAT. The bug fix that I posted fixed the problem
            that was causing ICMP echo replies to be dropped. The
            openflow path for distributed SNAT is similar to that for
            SNAT on gateway routers, but there are still some
            differences, notably one router instead of two routers and
            no "join" switch. Also I did not try it with DNS.

            Are you able to debug further, to see whether a missing
            geneve options field is still the culprit?
            It is possible that removal of patch ports within br-int
            uncovered other issues.



        With some testing I could see that, on the node where the
        gateway is hosted:
         - The reply packet reaches the gateway router pipeline -> to
        the otls switch pipeline (via clone) -> to the router pipeline
        -> to the peer port of the switch.
        The packet gets dropped at table 22:

         table=22, n_packets=275, n_bytes=26686,
        priority=65535,ct_state=+inv+trk,metadata=0x1 actions=drop

        Not sure why it is happening. I will try to debug further.


    I added stateful ACLs, but I am unable to reproduce this. Nothing hits
    the invalid ct_state flow when trying switch -> router -> switch, across
    localnet at the end, with various distributed NAT flavors including
    DNAT and SNAT. The pings always succeed.

    As I suggested on IRC, I think that conntrack state should be cleared
    when crossing an OVN patch port. Specifically, in
    ovn/controller/physical.c,
    inside the clone, it should clear ct_state (MFF_CT_STATE, be16),
    ct_mark (MFF_CT_MARK, be32), and ct_label (MFF_CT_LABEL, be128).


Thanks for the suggestion. I couldn't clear the ct fields in the clone action, as these are not writable fields.
Instead, I tried the patch below and it worked.

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 44fe3d1..7a4b782 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4332,7 +4332,11 @@ static void
 compose_clone_action(struct xlate_ctx *ctx, const struct ofpact_nest *oc)
 {
     struct flow old_flow = ctx->xin->flow;
+    bool old_conntrack = ctx->conntracked;
+    ctx->conntracked = false;
+    clear_conntrack(&ctx->xin->flow);
     do_xlate_actions(oc->actions, ofpact_nest_get_action_len(oc), ctx);
+    ctx->conntracked = old_conntrack;
     ctx->xin->flow = old_flow;
 }
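
The save, clear, and restore pattern in the patch above can be modelled
in isolation. The sketch below uses toy stand-ins rather than the real
OVS types: struct toy_flow, struct toy_ctx and all toy_ functions are
hypothetical, and the ct field widths are only assumed from the earlier
suggestion in this thread (be16, be32, be128).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins, NOT the real OVS structures. */
struct toy_flow {
    uint16_t ct_state;
    uint32_t ct_mark;
    uint8_t ct_label[16];
    uint64_t metadata;
};

struct toy_ctx {
    struct toy_flow flow;
    bool conntracked;
};

typedef void toy_actions_fn(struct toy_ctx *ctx);

/* Zero the connection tracking fields of a flow. */
static void
toy_clear_conntrack(struct toy_flow *flow)
{
    flow->ct_state = 0;
    flow->ct_mark = 0;
    memset(flow->ct_label, 0, sizeof flow->ct_label);
}

/* Run nested actions on a copy of the state with conntrack cleared,
 * then restore the caller's flow and conntracked flag. */
static void
toy_compose_clone(struct toy_ctx *ctx, toy_actions_fn *actions)
{
    assert(ctx && actions);

    struct toy_flow old_flow = ctx->flow;
    bool old_conntracked = ctx->conntracked;

    ctx->conntracked = false;
    toy_clear_conntrack(&ctx->flow);
    actions(ctx);

    ctx->conntracked = old_conntracked;
    ctx->flow = old_flow;
}

/* Example nested actions: record what the clone body observes. */
static uint16_t toy_seen_ct_state;
static bool toy_seen_conntracked;

static void
toy_record_actions(struct toy_ctx *ctx)
{
    toy_seen_ct_state = ctx->flow.ct_state;
    toy_seen_conntracked = ctx->conntracked;
    ctx->flow.metadata = 0x2;   /* Discarded when the clone returns. */
}
```

Calling toy_compose_clone(&ctx, toy_record_actions) on a context with a
non-zero ct_state lets the nested actions see a clean, untracked flow,
while the caller's ct fields and conntracked flag are unchanged after
the call returns.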

Thanks
Numan

I tested this patch in my original case, and it worked well.



    Mickey


        Numan



            I primarily used ovs-dpctl dump-flows to see installed
            megaflows, ovs-appctl ofproto/trace (with recirc_id), and
            ovs-ofctl dump-flows for initial debugging. In particular
            I could see that the installed megaflows were lacking the
            geneve options field in the actions.

            Mickey


                Numan

                    Mickey


                        Mickey


                            Mickey




                                Thanks
                                Numan


                                >
                                _______________________________________________
                                > dev mailing list
                                > [email protected]
                                <mailto:[email protected]>
                                >
                                
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
                                
<https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
                                >










