On 21 June 2016 at 10:29, Flaviof <[email protected]> wrote:
> On Tue, Jun 21, 2016 at 10:46 AM, Guru Shetty <[email protected]> wrote:
>
> >
> >
> > On 20 June 2016 at 19:36, Flaviof <[email protected]> wrote:
> >
> >> On Mon, Jun 13, 2016 at 6:45 AM, Gurucharan Shetty <[email protected]> wrote:
> >>
> >> > For traffic from physical space to virtual space we need DNAT.
> >> > The DNAT happens in the gateway router and reaches the logical
> >> > port. The return traffic should be unDNATed.
> >> >
> >> > Traffic originating in virtual space heading to physical space
> >> > should be SNATed. The return traffic is unSNATted.
> >> >
> >> > East-west traffic with the public destination IP address needs
> >> > a DNAT. This traffic is punted to the l3 gateway where DNAT
> >> > takes place. This traffic is also SNATed and eventually loops back to
> >> > its destination. The SNAT is needed because we need the reverse traffic
> >> > to go back to the l3 gateway and not short-circuit directly to the
> >> > source.
> >> >
> >> > This commit introduces 4 new logical actions.
> >> > 1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
> >> > 2. ct_snat(IP): To SNAT to the provided IP address.
> >> > 3. ct_dnat: To send the packet through DNAT zone to unDNAT packets.
> >> > 4. ct_dnat(IP): To DNAT to the provided IP.
> >> >
> >> > This commit only provides the ability to do IP based NAT. This will
> >> > eventually be enhanced to do PORT based NAT too.
> >> >
> >> > Command hints:
> >> >
> >> > Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
> >> > with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
> >> > (192.168.2.2) connected to it. You connect "R1" to
> >> > a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.
> >> >
> >> > R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
> >> > external network).
> >> >
> >> > case1 : Add pure DNAT (north-south)
> >> >
> >> > Add a DNAT rule in R2:
> >> > ovn-nbctl -- --id=@nat create nat type="dnat" logical_ip=192.168.1.2 \
> >> > external_ip=30.0.0.2 -- add logical_router R2 nat @nat
> >> >
> >> > Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.
> >> >
> >> > case2 : Add pure SNAT (south-north)
> >> >
> >> > Add a SNAT rule in R2:
> >> >
> >> > ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
> >> > external_ip=30.0.0.1 -- add logical_router R2 nat @nat
> >> >
> >> > (You need a static route in R1 to send packets destined to outside
> >> > world to go through R2. The logical_ip can be a subnet.)
> >> >
> >> > When bar1 pings alice1, alice1 receives traffic from 30.0.0.1
> >> >
> >> > case3 : SNAT and DNAT (east-west traffic)
> >> >
> >> > When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
> >> > and loops back to foo1 with a source ip address of 30.0.0.1
> >> >
> >> >
> >> So, is the 30.0.0.0/x network an external network that R2 has a port on too?
> >>
> >
> > The example above does not have that. In the above example 30.0.0.0/x is
> > being treated as a virtual address. But in a real setup (non-simulated), you
> > are right. R2 will be connected to a 30.0.0.0/x network and will have a
> > port in it. It will also have a static route (0.0.0.0/0) or a
> > default_gateway to point to the physical router IP address as its next hop.
> > (I have not tested it as I do not have a real setup at hand, but based on
> > the simulation, it should ideally work.)
> >
> >
> >> What is the next hop that R2 would use to reach a destination beyond
> >> that subnet?
> >>
> > Answered above.
> >
>
> Ack!
>
>
> >
> >>
> >> I think this may be clear when a test is added to ovn.at, which uses foo,
> >> bar, join, alice
> >>
> > The unit tests do not have the ability to do conntrack NAT right now. I
> > think we should add one once Daniele introduces NAT to userspace conntrack.
> > But the unit test "ovn -- 2 HVs, 2 LRs connected via LS, gateway router"
> > does something very similar (it has foo - R1 - join - R2 - alice).
> >
>
> Right, I saw that test and it makes perfect sense. Adding the 'bar' logical
> switch, net 30.0.0.x and the nat rules are the few lines that it currently
> does not have.
>
>
> >
> >>
> >> Based on the code and my little test setup, there seems to be a high cost
> >> for DNAT entries in that an ARP response rule will be added per DNAT x all
> >> router ports.
> >
> > The intention was to add only on the router where the DNAT entry is defined
> > and not on all router ports of all routers. Is that not what you see? (If it
> > is not, this is a bug.) The for loop which adds this entry only looks at
> > that datapath's NAT entries.
> >
> > On the gateway router itself, there would typically be two DNAT entries:
> > one of them connected to the internal network (for east-west) and another
> > one at the external port (facing the physical router).
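
To make the cost discussed above concrete, here is a minimal sketch (not the
ovn-northd code) of how the ARP-reply flows multiply: one per router port per
DNAT external IP, on that one router only. The helper names dnat_arp_match and
dnat_arp_flow_count are hypothetical; the match format follows the xasprintf()
call in the patch quoted below, assuming the port name is emitted in quotes.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats the ARP-request match for one router port
 * and one DNAT external IP into 'buf'.  Returns the snprintf() result. */
static int
dnat_arp_match(char *buf, size_t size, const char *inport,
               const char *external_ip)
{
    return snprintf(buf, size,
                    "inport == \"%s\" && arp.tpa == %s && arp.op == 1",
                    inport, external_ip);
}

/* Hypothetical helper: number of ARP-reply flows on a single router,
 * one per (router port, DNAT entry) pair. */
static size_t
dnat_arp_flow_count(size_t n_ports, size_t n_dnat)
{
    return n_ports * n_dnat;
}
```

With the two R2 ports from the example (alice and R2_join) and a single DNAT
rule, this gives two flows, which matches what was observed in ingress table 1.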
> >
> >
> Understood.
>
>
> >
> >
> >> In the example used by the commit message, ingress table 1 of
> >> the logical router will have arp response entries for inports alice and
> >> R2_join.
> >>
> > Right. That is because as explained above, I need to do DNAT for both
> > east-west as well as north-south. (It is very possible that I did not
> > understand your concern)
> >
>
> Nah, you set me straight. If there were multiple internal subnets I imagine
> we will need a DNAT
> rule for each, since the response needs to be slightly different for each
> router port. Not an issue, just an observation.
>
>
> >
> >
> >>
> >>
> >> Table 3: do we really intend to apply the actions 'inport = ""; ct_dnat;'
> >> to all ip packets that do not have an explicit dnat mapping?
> >>
> > Yes. This is a little tricky. I have tried to explain the rationale in a
> > comment above. The general idea is that in a gateway router, there will be
> > at least one DNAT or SNAT entry. Otherwise, why have a gateway router? Also,
> > a re-circulation is considered to be very expensive. What we want is to
> > minimize re-circulations. With the code above, we have a minimum of
> > one re-circulation no matter what and a maximum of two re-circulations. I
> > have tried different ways to optimize it. There was a possibility of 3
> > re-circulations as a worst case if I did not force the minimum one
> > re-circulation. Probably there is a different way to optimize it (that I
> > haven't thought about).
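
As a rough illustration of the re-circulation accounting described above, here
is a minimal C sketch. It is not the ovn code: ct_nat_recirculates and
gateway_recirc_count are hypothetical names, and the counting assumes one
reading of the patch quoted below (the plain ct_snat form skips
re-circulation, every IP packet goes through ct_dnat or ct_dnat(IP) once in
the ingress DNAT table, and ct_snat(IP) in egress adds a second one).

```c
#include <stdbool.h>

/* Returns true if the action re-circulates the packet to the next table.
 * 'snat' selects ct_snat vs. ct_dnat; 'has_ip' is true for the
 * ct_snat(IP)/ct_dnat(IP) forms, which also commit the connection.
 * Mirrors the "if (!commit && snat)" check in parse_ct_nat(). */
static bool
ct_nat_recirculates(bool snat, bool has_ip)
{
    return !(snat && !has_ip);
}

/* Hypothetical count of re-circulations for one IP packet crossing a
 * gateway router: ingress UNSNAT (ct_snat) never re-circulates, ingress
 * DNAT always does, egress SNAT does only when an SNAT rule matches. */
static int
gateway_recirc_count(bool matches_snat_rule)
{
    int n = 0;

    n += ct_nat_recirculates(true, false);  /* Ingress UNSNAT: ct_snat. */
    n += ct_nat_recirculates(false, false); /* Ingress DNAT: ct_dnat. */
    if (matches_snat_rule) {
        n += ct_nat_recirculates(true, true); /* Egress: ct_snat(IP). */
    }
    return n;
}
```

Under these assumptions the result is 1 without a matching SNAT rule and 2
with one, which is the minimum and maximum mentioned above.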
> >
> >
> >
> Thanks for the clarification. I don't know enough about the implications of
> calling
> the ct_dnat action, but I imagine that is just noise and -- like you point
> out -- this is only in the
> gateway router and saves on recirculations.
>
>
>
> >
> >
> >>
> >> SNAT: do we need ARP reply rules for the SNAT addresses, similar to the
> >> ones added for DNAT?
> >>
> > I don't think we need ARP reply rules for SNAT entries. What is the use
> > case?
> >
>
> This is likely a moot point on my part. It is just that in my example, the
> gateway router did not have a port in the 30.0.0.x network. So it was not
> obvious to me that if it did, it would have the ARP response rule for its
> own address, which is masking the internal ips for foo and bar. Sorry for
> not understanding that before making the noise. :)
>
>
> >
> >>
> >> SNAT: looking at the openflow table I see no mention of the address added
> >> to support SNAT. Is that because that is all handled by the connection
> >> tracker and there is nothing to be done via openflow? Or maybe part of
> >> another patchset?
> >>
> >
> > We do add SNAT specific rules. Search for S_ROUTER_IN_UNSNAT
> > and S_ROUTER_OUT_SNAT.
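
For the egress SNAT stage, the flow priority in the patch quoted below is
derived from the logical_ip mask (count_1bits(ntohl(mask)) + 1), so a more
specific subnet wins over a broader one. A minimal sketch of that calculation
follows; mask_bits is a stand-in for count_1bits(), snat_priority is a
hypothetical name, and the mask is assumed to already be in host byte order.

```c
#include <stdint.h>

/* Counts the set bits in a host-order IPv4 netmask (stand-in for the
 * count_1bits() helper used by the patch). */
static unsigned int
mask_bits(uint32_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* Clear the lowest set bit. */
        n++;
    }
    return n;
}

/* Hypothetical helper: priority of the S_ROUTER_OUT_SNAT flow for a rule
 * whose logical_ip has the given mask.  Longer masks get higher priority,
 * and the "+ 1" keeps every rule above the priority-0 default flow. */
static unsigned int
snat_priority(uint32_t mask)
{
    return mask_bits(mask) + 1;
}
```

So a /32 logical_ip gets priority 33, a /24 gets 25, and a 0.0.0.0/0 rule
gets 1.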
> >
> >
>
> Ack, I missed that in the egress datapath. *facepalm*
>
>
>
> >
> >> Thanks,
> >>
> >> -- flaviof
> >>
> >>
> >>
> >>
> >> > Signed-off-by: Gurucharan Shetty <[email protected]>
> >>
> >
>
> Acked-by: Flavio Fernandes <[email protected]>
>
Thank you for taking a look. I applied this. We will fix issues that come
up in real world testing.
>
>
>
>
>
> >> > ---
> >> > ovn/lib/actions.c | 83 ++++++++++++++++++++
> >> > ovn/northd/ovn-northd.8.xml | 131 ++++++++++++++++++++++++++++---
> >> > ovn/northd/ovn-northd.c | 187
> >> > ++++++++++++++++++++++++++++++++++++++++++--
> >> > ovn/ovn-nb.ovsschema | 19 ++++-
> >> > ovn/ovn-nb.xml | 65 +++++++++++++--
> >> > ovn/ovn-sb.xml | 41 ++++++++++
> >> > ovn/utilities/ovn-nbctl.c | 5 ++
> >> > tests/ovn.at | 17 ++++
> >> > 8 files changed, 524 insertions(+), 24 deletions(-)
> >> >
> >> > diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
> >> > index 5f0bf19..4a486a0 100644
> >> > --- a/ovn/lib/actions.c
> >> > +++ b/ovn/lib/actions.c
> >> > @@ -442,6 +442,85 @@ emit_ct(struct action_context *ctx, bool
> >> recirc_next,
> >> > bool commit)
> >> > add_prerequisite(ctx, "ip");
> >> > }
> >> >
> >> > +static void
> >> > +parse_ct_nat(struct action_context *ctx, bool snat)
> >> > +{
> >> > + const size_t ct_offset = ctx->ofpacts->size;
> >> > + ofpbuf_pull(ctx->ofpacts, ct_offset);
> >> > +
> >> > + struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
> >> > +
> >> > + if (ctx->ap->cur_ltable < ctx->ap->n_tables) {
> >> > + ct->recirc_table = ctx->ap->first_ptable +
> ctx->ap->cur_ltable
> >> +
> >> > 1;
> >> > + } else {
> >> > + action_error(ctx,
> >> > + "\"ct_[sd]nat\" action not allowed in last
> >> table.");
> >> > + return;
> >> > + }
> >> > +
> >> > + if (snat) {
> >> > + ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
> >> > + } else {
> >> > + ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
> >> > + }
> >> > + ct->zone_src.ofs = 0;
> >> > + ct->zone_src.n_bits = 16;
> >> > + ct->flags = 0;
> >> > + ct->alg = 0;
> >> > +
> >> > + add_prerequisite(ctx, "ip");
> >> > +
> >> > + struct ofpact_nat *nat;
> >> > + size_t nat_offset;
> >> > + nat_offset = ctx->ofpacts->size;
> >> > + ofpbuf_pull(ctx->ofpacts, nat_offset);
> >> > +
> >> > + nat = ofpact_put_NAT(ctx->ofpacts);
> >> > + nat->flags = 0;
> >> > + nat->range_af = AF_UNSPEC;
> >> > +
> >> > + int commit = 0;
> >> > + if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
> >> > + ovs_be32 ip;
> >> > + if (ctx->lexer->token.type == LEX_T_INTEGER
> >> > + && ctx->lexer->token.format == LEX_F_IPV4) {
> >> > + ip = ctx->lexer->token.value.ipv4;
> >> > + } else {
> >> > + action_syntax_error(ctx, "invalid ip");
> >> > + return;
> >> > + }
> >> > +
> >> > + nat->range_af = AF_INET;
> >> > + nat->range.addr.ipv4.min = ip;
> >> > + if (snat) {
> >> > + nat->flags |= NX_NAT_F_SRC;
> >> > + } else {
> >> > + nat->flags |= NX_NAT_F_DST;
> >> > + }
> >> > + commit = NX_CT_F_COMMIT;
> >> > + lexer_get(ctx->lexer);
> >> > + if (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
> >> > + action_syntax_error(ctx, "expecting `)'");
> >> > + return;
> >> > + }
> >> > + }
> >> > +
> >> > + ctx->ofpacts->header = ofpbuf_push_uninit(ctx->ofpacts,
> >> nat_offset);
> >> > + ct = ctx->ofpacts->header;
> >> > + ct->flags |= commit;
> >> > +
> >> > + /* XXX: For performance reasons, we try to prevent additional
> >> > + * recirculations. So far, ct_snat which is used in a gateway
> >> router
> >> > + * does not need a recirculation. ct_snat(IP) does need a
> >> > recirculation.
> >> > + * Should we consider a method to let the actions specify
> whether a
> >> > action
> >> > + * needs recirculation if there more use cases?. */
> >> > + if (!commit && snat) {
> >> > + ct->recirc_table = NX_CT_RECIRC_NONE;
> >> > + }
> >> > + ofpact_finish(ctx->ofpacts, &ct->ofpact);
> >> > + ofpbuf_push_uninit(ctx->ofpacts, ct_offset);
> >> > +}
> >> > +
> >> > static bool
> >> > parse_action(struct action_context *ctx)
> >> > {
> >> > @@ -469,6 +548,10 @@ parse_action(struct action_context *ctx)
> >> > emit_ct(ctx, true, false);
> >> > } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
> >> > emit_ct(ctx, false, true);
> >> > + } else if (lexer_match_id(ctx->lexer, "ct_dnat")) {
> >> > + parse_ct_nat(ctx, false);
> >> > + } else if (lexer_match_id(ctx->lexer, "ct_snat")) {
> >> > + parse_ct_nat(ctx, true);
> >> > } else if (lexer_match_id(ctx->lexer, "arp")) {
> >> > parse_arp_action(ctx);
> >> > } else if (lexer_match_id(ctx->lexer, "get_arp")) {
> >> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> >> > index 1983812..c237604 100644
> >> > --- a/ovn/northd/ovn-northd.8.xml
> >> > +++ b/ovn/northd/ovn-northd.8.xml
> >> > @@ -517,11 +517,40 @@ next;
> >> >
> >> > <li>
> >> > <p>
> >> > - Reply to ARP requests. These flows reply to ARP requests
> for
> >> > the
> >> > - router's own IP address. For each router port <var>P</var>
> >> > that owns
> >> > - IP address <var>A</var> and Ethernet address <var>E</var>,
> a
> >> > - priority-90 flow matches <code>inport == <var>P</var>
> >> &&
> >> > - arp.op == 1 && arp.tpa == <var>A</var></code> (ARP
> >> > request)
> >> > + Reply to ARP requests.
> >> > + </p>
> >> > +
> >> > + <p>
> >> > + These flows reply to ARP requests for the router's own IP
> >> > address.
> >> > + For each router port <var>P</var> that owns IP address
> >> > <var>A</var>
> >> > + and Ethernet address <var>E</var>, a priority-90 flow
> matches
> >> > + <code>inport == <var>P</var> && arp.op == 1
> >> &&
> >> > + arp.tpa == <var>A</var></code> (ARP request) with the
> >> following
> >> > + actions:
> >> > + </p>
> >> > +
> >> > + <pre>
> >> > +eth.dst = eth.src;
> >> > +eth.src = <var>E</var>;
> >> > +arp.op = 2; /* ARP reply. */
> >> > +arp.tha = arp.sha;
> >> > +arp.sha = <var>E</var>;
> >> > +arp.tpa = arp.spa;
> >> > +arp.spa = <var>A</var>;
> >> > +outport = <var>P</var>;
> >> > +inport = ""; /* Allow sending out inport. */
> >> > +output;
> >> > + </pre>
> >> > + </li>
> >> > +
> >> > + <li>
> >> > + <p>
> >> > + These flows reply to ARP requests for the virtual IP
> >> addresses
> >> > + configured in the router for DNAT. For a configured DNAT IP
> >> > address
> >> > + <var>A</var>, for each router port <var>P</var> with
> Ethernet
> >> > + address <var>E</var>, a priority-90 flow matches
> >> > + <code>inport == <var>P</var> && arp.op == 1
> >> &&
> >> > + arp.tpa == <var>A</var></code> (ARP request)
> >> > with the following actions:
> >> > </p>
> >> >
> >> > @@ -663,7 +692,62 @@ icmp4 {
> >> > </li>
> >> > </ul>
> >> >
> >> > - <h3>Ingress Table 2: IP Routing</h3>
> >> > + <h3>Ingress Table 2: UNSNAT</h3>
> >> > +
> >> > + <p>
> >> > + This is for already established connections' reverse traffic.
> >> > + i.e., SNAT has already been done in egress pipeline and now the
> >> > + packet has entered the ingress pipeline as part of a reply. It
> >> is
> >> > + unSNATted here.
> >> > + </p>
> >> > +
> >> > + <ul>
> >> > + <li>
> >> > + <p>
> >> > + For each configuration in the OVN Northbound database, that
> >> asks
> >> > + to change the source IP address of a packet from
> >> <var>A</var> to
> >> > + <var>B</var>, a priority-100 flow matches <code>ip
> &&
> >> > + ip4.dst == <var>B</var></code> with an action
> >> > + <code>ct_snat; next;</code>.
> >> > + </p>
> >> > +
> >> > + <p>
> >> > + A priority-0 logical flow with match <code>1</code> has
> >> actions
> >> > + <code>next;</code>.
> >> > + </p>
> >> > + </li>
> >> > + </ul>
> >> > +
> >> > + <h3>Ingress Table 3: DNAT</h3>
> >> > +
> >> > + <p>
> >> > + Packets enter the pipeline with destination IP address that
> >> needs to
> >> > + be DNATted from a virtual IP address to a real IP address.
> >> Packets
> >> > + in the reverse direction needs to be unDNATed.
> >> > + </p>
> >> > + <ul>
> >> > + <li>
> >> > + <p>
> >> > + For each configuration in the OVN Northbound database, that
> >> asks
> >> > + to change the destination IP address of a packet from
> >> > <var>A</var> to
> >> > + <var>B</var>, a priority-100 flow matches <code>ip
> &&
> >> > + ip4.dst == <var>A</var></code> with an action <code>inport
> =
> >> "";
> >> > + ct_dnat(<var>B</var>);</code>.
> >> > + </p>
> >> > +
> >> > + <p>
> >> > + For all IP packets of a Gateway router, a priority-50 flow
> >> with
> >> > an
> >> > + action <code>inport = ""; ct_dnat;</code>.
> >> > + </p>
> >> > +
> >> > + <p>
> >> > + A priority-0 logical flow with match <code>1</code> has
> >> actions
> >> > + <code>next;</code>.
> >> > + </p>
> >> > + </li>
> >> > + </ul>
> >> > +
> >> > + <h3>Ingress Table 4: IP Routing</h3>
> >> >
> >> > <p>
> >> > A packet that arrives at this table is an IP packet that should
> >> be
> >> > routed
> >> > @@ -672,7 +756,7 @@ icmp4 {
> >> > <code>ip4.dst</code>, the packet's final destination,
> unchanged)
> >> and
> >> > advances to the next table for ARP resolution. It also sets
> >> > <code>reg1</code> to the IP address owned by the selected
> router
> >> > port
> >> > - (which is used later in table 4 as the IP source address for an
> >> ARP
> >> > + (which is used later in table 6 as the IP source address for an
> >> ARP
> >> > request, if needed).
> >> > </p>
> >> >
> >> > @@ -743,7 +827,7 @@ icmp4 {
> >> > </li>
> >> > </ul>
> >> >
> >> > - <h3>Ingress Table 3: ARP Resolution</h3>
> >> > + <h3>Ingress Table 5: ARP Resolution</h3>
> >> >
> >> > <p>
> >> > Any packet that reaches this table is an IP packet whose
> >> next-hop IP
> >> > @@ -798,7 +882,7 @@ icmp4 {
> >> > </li>
> >> > </ul>
> >> >
> >> > - <h3>Ingress Table 4: ARP Request</h3>
> >> > + <h3>Ingress Table 6: ARP Request</h3>
> >> >
> >> > <p>
> >> > In the common case where the Ethernet destination has been
> >> > resolved, this
> >> > @@ -823,7 +907,7 @@ arp {
> >> > </pre>
> >> >
> >> > <p>
> >> > - (Ingress table 2 initialized <code>reg1</code> with the IP
> >> > address
> >> > + (Ingress table 4 initialized <code>reg1</code> with the IP
> >> > address
> >> > owned by <code>outport</code>.)
> >> > </p>
> >> >
> >> > @@ -838,7 +922,32 @@ arp {
> >> > </li>
> >> > </ul>
> >> >
> >> > - <h3>Egress Table 0: Delivery</h3>
> >> > + <h3>Egress Table 0: SNAT</h3>
> >> > +
> >> > + <p>
> >> > + Packets that are configured to be SNATed get their source IP
> >> address
> >> > + changed based on the configuration in the OVN Northbound
> >> database.
> >> > + </p>
> >> > + <ul>
> >> > + <li>
> >> > + <p>
> >> > + For each configuration in the OVN Northbound database, that
> >> asks
> >> > + to change the source IP address of a packet from an IP
> >> address
> >> > of
> >> > + <var>A</var> or to change the source IP address of a packet
> >> that
> >> > + belongs to network <var>A</var> to <var>B</var>, a flow
> >> matches
> >> > + <code>ip && ip4.src == <var>A</var></code> with an
> >> > action
> >> > + <code>ct_snat(<var>B</var>);</code>. The priority of the
> >> flow
> >> > + is calculated based on the mask of <var>A</var>, with
> matches
> >> > + having larger masks getting higher priorities.
> >> > + </p>
> >> > + <p>
> >> > + A priority-0 logical flow with match <code>1</code> has
> >> actions
> >> > + <code>next;</code>.
> >> > + </p>
> >> > + </li>
> >> > + </ul>
> >> > +
> >> > + <h3>Egress Table 1: Delivery</h3>
> >> >
> >> > <p>
> >> > Packets that reach this table are ready for delivery. It
> >> contains
> >> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> >> > index cac0148..4683780 100644
> >> > --- a/ovn/northd/ovn-northd.c
> >> > +++ b/ovn/northd/ovn-northd.c
> >> > @@ -105,12 +105,15 @@ enum ovn_stage {
> >> > /* Logical router ingress stages. */
> \
> >> > PIPELINE_STAGE(ROUTER, IN, ADMISSION, 0, "lr_in_admission")
> \
> >> > PIPELINE_STAGE(ROUTER, IN, IP_INPUT, 1, "lr_in_ip_input")
> \
> >> > - PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 2, "lr_in_ip_routing")
> \
> >> > - PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 3,
> "lr_in_arp_resolve") \
> >> > - PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 4,
> "lr_in_arp_request") \
> >> > + PIPELINE_STAGE(ROUTER, IN, UNSNAT, 2, "lr_in_unsnat")
> \
> >> > + PIPELINE_STAGE(ROUTER, IN, DNAT, 3, "lr_in_dnat")
> \
> >> > + PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 4, "lr_in_ip_routing")
> \
> >> > + PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 5,
> "lr_in_arp_resolve") \
> >> > + PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 6,
> "lr_in_arp_request") \
> >> >
> \
> >> > /* Logical router egress stages. */
> \
> >> > - PIPELINE_STAGE(ROUTER, OUT, DELIVERY, 0, "lr_out_delivery")
> >> > + PIPELINE_STAGE(ROUTER, OUT, SNAT, 0, "lr_out_snat")
> \
> >> > + PIPELINE_STAGE(ROUTER, OUT, DELIVERY, 1, "lr_out_delivery")
> >> >
> >> > #define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME) \
> >> > S_##DP_TYPE##_##PIPELINE##_##STAGE \
> >> > @@ -1998,6 +2001,51 @@ build_lrouter_flows(struct hmap *datapaths,
> >> struct
> >> > hmap *ports,
> >> > free(match);
> >> > free(actions);
> >> >
> >> > + /* ARP handling for external IP addresses.
> >> > + *
> >> > + * DNAT IP addresses are external IP addresses that need ARP
> >> > + * handling. */
> >> > + for (int i = 0; i < op->od->nbr->n_nat; i++) {
> >> > + const struct nbrec_nat *nat;
> >> > +
> >> > + nat = op->od->nbr->nat[i];
> >> > +
> >> > + if(!strcmp(nat->type, "snat")) {
> >> > + continue;
> >> > + }
> >> > +
> >> > + ovs_be32 ip;
> >> > + if (!ip_parse(nat->external_ip, &ip) || !ip) {
> >> > + static struct vlog_rate_limit rl =
> >> > VLOG_RATE_LIMIT_INIT(5, 1);
> >> > + VLOG_WARN_RL(&rl, "bad ip address %s in dnat
> >> > configuration "
> >> > + "for router %s", nat->external_ip,
> >> op->key);
> >> > + continue;
> >> > + }
> >> > +
> >> > + match = xasprintf(
> >> > + "inport == %s && arp.tpa == "IP_FMT" && arp.op == 1",
> >> > + op->json_key, IP_ARGS(ip));
> >> > + actions = xasprintf(
> >> > + "eth.dst = eth.src; "
> >> > + "eth.src = "ETH_ADDR_FMT"; "
> >> > + "arp.op = 2; /* ARP reply */ "
> >> > + "arp.tha = arp.sha; "
> >> > + "arp.sha = "ETH_ADDR_FMT"; "
> >> > + "arp.tpa = arp.spa; "
> >> > + "arp.spa = "IP_FMT"; "
> >> > + "outport = %s; "
> >> > + "inport = \"\"; /* Allow sending out inport. */ "
> >> > + "output;",
> >> > + ETH_ADDR_ARGS(op->mac),
> >> > + ETH_ADDR_ARGS(op->mac),
> >> > + IP_ARGS(ip),
> >> > + op->json_key);
> >> > + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
> >> > + match, actions);
> >> > + free(match);
> >> > + free(actions);
> >> > + }
> >> > +
> >> > /* Drop IP traffic to this router. */
> >> > match = xasprintf("ip4.dst == "IP_FMT, IP_ARGS(op->ip));
> >> > ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 60,
> >> > @@ -2005,6 +2053,135 @@ build_lrouter_flows(struct hmap *datapaths,
> >> struct
> >> > hmap *ports,
> >> > free(match);
> >> > }
> >> >
> >> > + /* NAT in Gateway routers. */
> >> > + HMAP_FOR_EACH (od, key_node, datapaths) {
> >> > + if (!od->nbr) {
> >> > + continue;
> >> > + }
> >> > +
> >> > + /* Packets are allowed by default. */
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 0, "1",
> "next;");
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 0, "1",
> "next;");
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 0, "1", "next;");
> >> > +
> >> > + /* NAT rules are only valid on Gateway routers. */
> >> > + if (!smap_get(&od->nbr->options, "chassis")) {
> >> > + continue;
> >> > + }
> >> > +
> >> > + for (int i = 0; i < od->nbr->n_nat; i++) {
> >> > + const struct nbrec_nat *nat;
> >> > +
> >> > + nat = od->nbr->nat[i];
> >> > +
> >> > + ovs_be32 ip, mask;
> >> > +
> >> > + char *error = ip_parse_masked(nat->external_ip, &ip,
> >> &mask);
> >> > + if (error || mask != OVS_BE32_MAX) {
> >> > + static struct vlog_rate_limit rl =
> >> > VLOG_RATE_LIMIT_INIT(5, 1);
> >> > + VLOG_WARN_RL(&rl, "bad external ip %s for nat",
> >> > + nat->external_ip);
> >> > + free(error);
> >> > + continue;
> >> > + }
> >> > +
> >> > + /* Check the validity of nat->logical_ip. 'logical_ip'
> can
> >> > + * be a subnet when the type is "snat". */
> >> > + error = ip_parse_masked(nat->logical_ip, &ip, &mask);
> >> > + if (!strcmp(nat->type, "snat")) {
> >> > + if (error) {
> >> > + static struct vlog_rate_limit rl =
> >> > + VLOG_RATE_LIMIT_INIT(5, 1);
> >> > + VLOG_WARN_RL(&rl, "bad ip network or ip %s for
> >> snat "
> >> > + "in router "UUID_FMT"",
> >> > + nat->logical_ip,
> UUID_ARGS(&od->key));
> >> > + free(error);
> >> > + continue;
> >> > + }
> >> > + } else {
> >> > + if (error || mask != OVS_BE32_MAX) {
> >> > + static struct vlog_rate_limit rl =
> >> > + VLOG_RATE_LIMIT_INIT(5, 1);
> >> > + VLOG_WARN_RL(&rl, "bad ip %s for dnat in router "
> >> > + ""UUID_FMT"", nat->logical_ip,
> >> > UUID_ARGS(&od->key));
> >> > + free(error);
> >> > + continue;
> >> > + }
> >> > + }
> >> > +
> >> > +
> >> > + char *match, *actions;
> >> > +
> >> > + /* Ingress UNSNAT table: It is for already established
> >> > connections'
> >> > + * reverse traffic. i.e., SNAT has already been done in
> >> egress
> >> > + * pipeline and now the packet has entered the ingress
> >> > pipeline as
> >> > + * part of a reply. We undo the SNAT here.
> >> > + *
> >> > + * Undoing SNAT has to happen before DNAT processing.
> >> This is
> >> > + * because when the packet was DNATed in ingress
> pipeline,
> >> it
> >> > did
> >> > + * not know about the possibility of eventual additional
> >> SNAT
> >> > in
> >> > + * egress pipeline. */
> >> > + if (!strcmp(nat->type, "snat")
> >> > + || !strcmp(nat->type, "dnat_and_snat")) {
> >> > + match = xasprintf("ip && ip4.dst == %s",
> >> > nat->external_ip);
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_IN_UNSNAT, 100,
> >> > + match, "ct_snat; next;");
> >> > + free(match);
> >> > + }
> >> > +
> >> > + /* Ingress DNAT table: Packets enter the pipeline with
> >> > destination
> >> > + * IP address that needs to be DNATted from a external IP
> >> > address
> >> > + * to a logical IP address. */
> >> > + if (!strcmp(nat->type, "dnat")
> >> > + || !strcmp(nat->type, "dnat_and_snat")) {
> >> > + /* Packet when it goes from the initiator to
> >> destination.
> >> > + * We need to zero the inport because the router can
> >> > + * send the packet back through the same interface.
> */
> >> > + match = xasprintf("ip && ip4.dst == %s",
> >> > nat->external_ip);
> >> > + actions = xasprintf("inport = \"\"; ct_dnat(%s);",
> >> > + nat->logical_ip);
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
> >> > + match, actions);
> >> > + free(match);
> >> > + free(actions);
> >> > + }
> >> > +
> >> > + /* Egress SNAT table: Packets enter the egress pipeline
> >> with
> >> > + * source ip address that needs to be SNATted to a
> >> external ip
> >> > + * address. */
> >> > + if (!strcmp(nat->type, "snat")
> >> > + || !strcmp(nat->type, "dnat_and_snat")) {
> >> > + match = xasprintf("ip && ip4.src == %s",
> >> nat->logical_ip);
> >> > + actions = xasprintf("ct_snat(%s);",
> nat->external_ip);
> >> > +
> >> > + /* The priority here is calculated such that the
> >> > + * nat->logical_ip with the longest mask gets a
> higher
> >> > + * priority. */
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT,
> >> > + count_1bits(ntohl(mask)) + 1, match,
> >> > actions);
> >> > + free(match);
> >> > + free(actions);
> >> > + }
> >> > + }
> >> > +
> >> > + /* Re-circulate every packet through the DNAT zone.
> >> > + * This helps with two things.
> >> > + *
> >> > + * 1. Any packet that needs to be unDNATed in the reverse
> >> > + * direction gets unDNATed. Ideally this could be done in
> >> > + * the egress pipeline. But since the gateway router
> >> > + * does not have any feature that depends on the source
> >> > + * ip address being external IP address for IP routing,
> >> > + * we can do it here, saving a future re-circulation.
> >> > + *
> >> > + * 2. Any packet that was sent through SNAT zone in the
> >> > + * previous table automatically gets re-circulated to get
> >> > + * back the new destination IP address that is needed for
> >> > + * routing in the openflow pipeline. */
> >> > + ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
> >> > + "ip", "inport = \"\"; ct_dnat;");
> >> > + }
> >> > +
> >> > /* Logical router ingress table 2: IP Routing.
> >> > *
> >> > * A packet that arrives at this table is an IP packet that
> should
> >> be
> >> > @@ -2205,7 +2382,7 @@ build_lrouter_flows(struct hmap *datapaths,
> struct
> >> > hmap *ports,
> >> > ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1",
> >> > "output;");
> >> > }
> >> >
> >> > - /* Logical router egress table 0: Delivery (priority 100).
> >> > + /* Logical router egress table 1: Delivery (priority 100).
> >> > *
> >> > * Priority 100 rules deliver packets to enabled logical ports.
> */
> >> > HMAP_FOR_EACH (op, key_node, ports) {
> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
> >> > index fa21b30..ac6ca14 100644
> >> > --- a/ovn/ovn-nb.ovsschema
> >> > +++ b/ovn/ovn-nb.ovsschema
> >> > @@ -1,7 +1,7 @@
> >> > {
> >> > "name": "OVN_Northbound",
> >> > - "version": "2.1.2",
> >> > - "cksum": "429668869 5325",
> >> > + "version": "2.1.3",
> >> > + "cksum": "3631923697 6121",
> >> > "tables": {
> >> > "Logical_Switch": {
> >> > "columns": {
> >> > @@ -78,6 +78,11 @@
> >> > "max": "unlimited"}},
> >> > "default_gw": {"type": {"key": "string", "min": 0,
> >> "max":
> >> > 1}},
> >> > "enabled": {"type": {"key": "boolean", "min": 0,
> "max":
> >> > 1}},
> >> > + "nat": {"type": {"key": {"type": "uuid",
> >> > + "refTable": "NAT",
> >> > + "refType": "strong"},
> >> > + "min": 0,
> >> > + "max": "unlimited"}},
> >> > "options": {
> >> > "type": {"key": "string",
> >> > "value": "string",
> >> > @@ -104,6 +109,16 @@
> >> > "ip_prefix": {"type": "string"},
> >> > "nexthop": {"type": "string"},
> >> > "output_port": {"type": {"key": "string", "min": 0,
> >> > "max": 1}}},
> >> > + "isRoot": false},
> >> > + "NAT": {
> >> > + "columns": {
> >> > + "external_ip": {"type": "string"},
> >> > + "logical_ip": {"type": "string"},
> >> > + "type": {"type": {"key": {"type": "string",
> >> > + "enum": ["set", ["dnat",
> >> > + "snat",
> >> > +
> >> > "dnat_and_snat"
> >> > +
> ]]}}}},
> >> > "isRoot": false}
> >> > }
> >> > }
> >> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> >> > index 130b63b..36d1158 100644
> >> > --- a/ovn/ovn-nb.xml
> >> > +++ b/ovn/ovn-nb.xml
> >> > @@ -631,18 +631,31 @@
> >> > router has all ingress and egress traffic dropped.
> >> > </column>
> >> >
> >> > + <column name="nat">
> >> > + One or more NAT rules for the router. NAT rules only work on
> the
> >> > + Gateway routers.
> >> > + </column>
> >> > +
> >> > <group title="Options">
> >> > <p>
> >> > Additional options for the logical router.
> >> > </p>
> >> >
> >> > <column name="options" key="chassis">
> >> > - If set, indicates that the logical router in question is
> >> > - a Gateway router (which is centralized) and resides in the
> set
> >> > - chassis. The same value is also used by
> >> > <code>ovn-controller</code>
> >> > - to uniquely identify the chassis in the OVN deployment and
> >> > - comes from <code>external_ids:system-id</code> in the
> >> > - <code>Open_vSwitch</code> table of Open_vSwitch database.
> >> > + <p>
> >> > + If set, indicates that the logical router in question is a
> >> > Gateway
> >> > + router (which is centralized) and resides in the set
> chassis.
> >> > The
> >> > + same value is also used by <code>ovn-controller</code> to
> >> > + uniquely identify the chassis in the OVN deployment and
> >> > + comes from <code>external_ids:system-id</code> in the
> >> > + <code>Open_vSwitch</code> table of Open_vSwitch database.
> >> > + </p>
> >> > +
> >> > + <p>
> >> > + The Gateway router can only be connected to a distributed
> >> router
> >> > + via a switch if SNAT and DNAT are to be configured in the
> >> > Gateway
> >> > + router.
> >> > + </p>
> >> > </column>
> >> > </group>
> >> >
> >> > @@ -765,4 +778,44 @@
> >> > </column>
> >> > </table>
> >> >
> >> > + <table name="NAT" title="NAT rules for a Gateway router.">
> >> > + <p>
> >> > + Each record represents a NAT rule in a Gateway router.
> >> > + </p>
> >> > +
> >> > + <column name="type">
> >> > + <p>Type of the NAT rule.</p>
> >> > + <ul>
> >> > + <li>
> >> > + When <ref column="type"/> is <code>dnat</code>, the
> >> externally
> >> > + visible IP address <ref column="external_ip"/> is DNATted
> to
> >> > the IP
> >> > + address <ref column="logical_ip"/> in the logical space.
> >> > + </li>
> >> > + <li>
> >> > + When <ref column="type"/> is <code>snat</code>, IP packets
> >> > + whose source IP address either matches the IP address
> >> > + in <ref column="logical_ip"/> or is in the network provided by
> >> > + <ref column="logical_ip"/> are SNATed into the IP address in
> >> > + <ref column="external_ip"/>.
> >> > + </li>
> >> > + <li>
> >> > + When <ref column="type"/> is <code>dnat_and_snat</code>, the
> >> > + externally visible IP address <ref column="external_ip"/> is
> >> > + DNATed to the IP address <ref column="logical_ip"/> in the
> >> > + logical space. In addition, IP packets whose source IP
> >> > + address matches <ref column="logical_ip"/> are SNATed into
> >> > + the IP address in <ref column="external_ip"/>.
> >> > + </li>
> >> > + </ul>
> >> > + </column>
> >> > +
> >> > + <column name="external_ip">
> >> > + An IPv4 address.
> >> > + </column>
> >> > +
> >> > + <column name="logical_ip">
> >> > + An IPv4 network (e.g., 192.168.1.0/24) or an IPv4 address.
> >> > + </column>
> >> > + </table>
> >> > +
> >> > </database>
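
To make the three rule types in the NAT table above concrete, here is a rough Python sketch of how I read the column descriptions. It is only a model of the documented semantics, not how ovn-northd implements them. The names NatRule and apply_nat and the 30.0.0.x addresses are made up for illustration, and it assumes logical_ip is a single address whenever the rule does DNAT.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NatRule:
    """Hypothetical stand-in for one row of the NAT table."""
    type: str          # "dnat", "snat", or "dnat_and_snat"
    external_ip: str   # an IPv4 address
    logical_ip: str    # an IPv4 address, or a network for "snat"


def apply_nat(rule, src, dst):
    """Return the (src, dst) pair after applying one rule to a packet."""
    do_dnat = rule.type in ("dnat", "dnat_and_snat")
    do_snat = rule.type in ("snat", "dnat_and_snat")

    # DNAT: traffic addressed to the external IP is redirected to the
    # logical IP (assumed here to be a single address).
    if do_dnat and dst == rule.external_ip:
        return src, rule.logical_ip

    # SNAT: traffic whose source matches logical_ip, or falls inside the
    # network it names, leaves with the external IP as its source.
    logical = ipaddress.ip_network(rule.logical_ip, strict=False)
    if do_snat and ipaddress.ip_address(src) in logical:
        return rule.external_ip, dst

    # No rule matched; the packet is left unchanged.
    return src, dst
```

For example, with NatRule("snat", "30.0.0.1", "192.168.1.0/24"), a packet from 192.168.1.2 leaves with source 30.0.0.1, while a packet from any other subnet is untouched.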
> >> > diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> >> > index 1231b4e..5665871 100644
> >> > --- a/ovn/ovn-sb.xml
> >> > +++ b/ovn/ovn-sb.xml
> >> > @@ -951,6 +951,47 @@
> >> > </p>
> >> > </dd>
> >> >
> >> > + <dt><code>ct_dnat;</code></dt>
> >> > + <dt><code>ct_dnat(<var>IP</var>);</code></dt>
> >> > + <dd>
> >> > + <p>
> >> > + <code>ct_dnat</code> sends the packet through the DNAT zone in
> >> > + the connection tracking table to unDNAT any packet that was DNATed
> >> > + in the opposite direction. The packet is then automatically sent
> >> > + to the next tables as if followed by a <code>next;</code> action.
> >> > + The next tables will see the changes in the packet caused by
> >> > + the connection tracker.
> >> > + </p>
> >> > + <p>
> >> > + <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
> >> > + DNAT zone to change the destination IP address of the packet to
> >> > + the one provided inside the parentheses and commits the connection.
> >> > + The packet is then automatically sent to the next tables as if
> >> > + followed by a <code>next;</code> action. The next tables will see
> >> > + the changes in the packet caused by the connection tracker.
> >> > + </p>
> >> > + </dd>
> >> > +
> >> > + <dt><code>ct_snat;</code></dt>
> >> > + <dt><code>ct_snat(<var>IP</var>);</code></dt>
> >> > + <dd>
> >> > + <p>
> >> > + <code>ct_snat</code> sends the packet through the SNAT zone to
> >> > + unSNAT any packet that was SNATed in the opposite direction. If
> >> > + the packet needs to be sent to the next tables, then it should be
> >> > + followed by a <code>next;</code> action. The next tables will not
> >> > + see the changes in the packet caused by the connection tracker.
> >> > + </p>
> >> > + <p>
> >> > + <code>ct_snat(<var>IP</var>)</code> sends the packet through the
> >> > + SNAT zone to change the source IP address of the packet to
> >> > + the one provided inside the parentheses and commits the connection.
> >> > + The packet is then automatically sent to the next tables as if
> >> > + followed by a <code>next;</code> action. The next tables will see
> >> > + the changes in the packet caused by the connection tracker.
> >> > + </p>
> >> > + </dd>
> >> > +
> >> > <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
> >> > <dd>
> >> > <p>
> >> > diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
> >> > index 321040e..b821307 100644
> >> > --- a/ovn/utilities/ovn-nbctl.c
> >> > +++ b/ovn/utilities/ovn-nbctl.c
> >> > @@ -1449,6 +1449,11 @@ static const struct ctl_table_class tables[] = {
> >> > NULL},
> >> > {NULL, NULL, NULL}}},
> >> >
> >> > + {&nbrec_table_nat,
> >> > + {{&nbrec_table_nat, NULL,
> >> > + NULL},
> >> > + {NULL, NULL, NULL}}},
> >> > +
> >> > {NULL, {{NULL, NULL, NULL}, {NULL, NULL, NULL}}}
> >> > };
> >> >
> >> > diff --git a/tests/ovn.at b/tests/ovn.at
> >> > index 633cf35..19d5c73 100644
> >> > --- a/tests/ovn.at
> >> > +++ b/tests/ovn.at
> >> > @@ -507,6 +507,23 @@ ip.ttl => Syntax error at end of input expecting `--'.
> >> > ct_next; => actions=ct(table=27,zone=NXM_NX_REG5[0..15]), prereqs=ip
> >> > ct_commit; => actions=ct(commit,zone=NXM_NX_REG5[0..15]), prereqs=ip
> >> >
> >> > +# dnat
> >> > +ct_dnat; => actions=ct(table=27,zone=NXM_NX_REG3[0..15],nat), prereqs=ip
> >> > +ct_dnat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG3[0..15],nat(dst=192.168.1.2)), prereqs=ip
> >> > +ct_dnat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> >> > +ct_dnat(foo); => Syntax error at `foo' invalid ip.
> >> > +ct_dnat(foo, bar); => Syntax error at `foo' invalid ip.
> >> > +ct_dnat(); => Syntax error at `)' invalid ip.
> >> > +
> >> > +# snat
> >> > +ct_snat; => actions=ct(zone=NXM_NX_REG4[0..15],nat), prereqs=ip
> >> > +ct_snat(192.168.1.2); => actions=ct(commit,table=27,zone=NXM_NX_REG4[0..15],nat(src=192.168.1.2)), prereqs=ip
> >> > +ct_snat(192.168.1.2, 192.168.1.3); => Syntax error at `,' expecting `)'.
> >> > +ct_snat(foo); => Syntax error at `foo' invalid ip.
> >> > +ct_snat(foo, bar); => Syntax error at `foo' invalid ip.
> >> > +ct_snat(); => Syntax error at `)' invalid ip.
> >> > +
> >> > +
> >> > # arp
> >> > arp { eth.dst = ff:ff:ff:ff:ff:ff; output; }; => actions=controller(userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.40.00.00.00), prereqs=ip4
> >> >
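
The expected outputs in the ovn.at hunk above follow a small pattern, which the Python sketch below tries to capture: the bare forms only pass the packet through the NAT zone, while the forms with an IP also commit the connection and always continue to the next table. This is just a model of the test expectations, not the parser in the patch. The function name encode is made up, the next table is a parameter because 27 is only what the test happens to use, and the error messages here are Python's own rather than the "Syntax error" strings from the test. The "actions=" prefix and the "prereqs=ip" suffix are left out.

```python
import ipaddress
import re

# Registers holding the conntrack zone, as they appear in the test output.
_ZONE = {"dnat": "NXM_NX_REG3[0..15]", "snat": "NXM_NX_REG4[0..15]"}
# Which address the nat() argument rewrites.
_FIELD = {"dnat": "dst", "snat": "src"}
_ACTION = re.compile(r"ct_(dnat|snat)(?:\(([^)]*)\))?;")


def encode(action, next_table=27):
    """Map a ct_dnat/ct_snat logical action to an OpenFlow ct() string."""
    m = _ACTION.fullmatch(action.strip())
    if m is None:
        raise ValueError(f"not a ct_dnat/ct_snat action: {action!r}")
    kind, arg = m.groups()
    zone = _ZONE[kind]

    if arg is None:
        # Bare form: undo an earlier translation. Only ct_dnat continues
        # to the next table on its own; ct_snat needs an explicit next;.
        if kind == "dnat":
            return f"ct(table={next_table},zone={zone},nat)"
        return f"ct(zone={zone},nat)"

    # Form with an argument: exactly one IPv4 address is accepted.
    # IPv4Address raises a ValueError subclass for anything else,
    # including an empty argument or a comma-separated pair.
    ip = ipaddress.IPv4Address(arg.strip())
    return (f"ct(commit,table={next_table},zone={zone},"
            f"nat({_FIELD[kind]}={ip}))")
```

Calling encode("ct_snat(192.168.1.2);") gives ct(commit,table=27,zone=NXM_NX_REG4[0..15],nat(src=192.168.1.2)), which matches the action part of the corresponding test line.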
> >> > --
> >> > 1.9.1
> >> >
> >> >
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> http://openvswitch.org/mailman/listinfo/dev