On Fri, Jan 13, 2017 at 4:21 PM, Ben Pfaff <[email protected]> wrote:

> On Fri, Jan 13, 2017 at 02:19:21PM -0800, Mickey Spiegel wrote:
> > On Thu, Jan 12, 2017 at 5:12 PM, Mickey Spiegel <[email protected]> wrote:
> >
> > >
> > > On Sun, Jan 8, 2017 at 10:30 PM, Mickey Spiegel <[email protected]> wrote:
> > >
> > >>
> > >> On Fri, Jan 6, 2017 at 8:31 PM, Mickey Spiegel <[email protected]> wrote:
> > >>
> > >>>
> > >>> On Fri, Jan 6, 2017 at 4:21 PM, Mickey Spiegel <[email protected]> wrote:
> > >>>
> > >>>>
> > >>>> On Fri, Jan 6, 2017 at 4:11 PM, Ben Pfaff <[email protected]> wrote:
> > >>>>
> > >>>>> On Fri, Jan 06, 2017 at 03:47:03PM -0800, Mickey Spiegel wrote:
> > >>>>> > On Fri, Jan 6, 2017 at 3:20 PM, Ben Pfaff <[email protected]> wrote:
> > >>>>> >
> > >>>>> > > On Fri, Jan 06, 2017 at 12:00:30PM -0800, Mickey Spiegel wrote:
> > >>>>> > > > Currently OVN handles all logical router ports in a distributed manner,
> > >>>>> > > > creating instances on each chassis.  The logical router ingress and
> > >>>>> > > > egress pipelines are traversed locally on the source chassis.
> > >>>>> > > >
> > >>>>> > > > In order to support advanced features such as one-to-many NAT (aka IP
> > >>>>> > > > masquerading), where multiple private IP addresses spread across
> > >>>>> > > > multiple chassis are mapped to one public IP address, it will be
> > >>>>> > > > necessary to handle some of the logical router processing on a specific
> > >>>>> > > > chassis in a centralized manner.
> > >>>>> > > >
> > >>>>> > > > The goal of this patch is to develop abstractions that allow for a
> > >>>>> > > > subset of router gateway traffic to be handled in a centralized manner
> > >>>>> > > > (e.g. one-to-many NAT traffic), while allowing for other subsets of
> > >>>>> > > > router gateway traffic to be handled in a distributed manner (e.g.
> > >>>>> > > > floating IP traffic).
> > >>>>> > > >
> > >>>>> > > > This patch introduces a new type of SB port_binding called
> > >>>>> > > > "chassisredirect".  A "chassisredirect" port represents a particular
> > >>>>> > > > instance, bound to a specific chassis, of an otherwise distributed
> > >>>>> > > > port.  The ovn-controller on that chassis populates the "chassis"
> > >>>>> > > > column for this record as an indication for other ovn-controllers of
> > >>>>> > > > its physical location.  Other ovn-controllers do not treat this port
> > >>>>> > > > as a local port.
> > >>>>> > > >
> > >>>>> > > > A "chassisredirect" port should never be used as an "inport".  When an
> > >>>>> > > > ingress pipeline sets the "outport", it may set the value to a logical
> > >>>>> > > > port of type "chassisredirect".  This will cause the packet to be
> > >>>>> > > > directed to a specific chassis to carry out the egress logical router
> > >>>>> > > > pipeline, in the same way that a logical switch forwards egress traffic
> > >>>>> > > > to a VIF port residing on a specific chassis.  At the beginning of the
> > >>>>> > > > egress pipeline, the "outport" will be reset to the value of the
> > >>>>> > > > distributed port.
> > >>>>> > > >
> > >>>>> > > > For outbound traffic to be handled in a centralized manner, the
> > >>>>> > > > "outport" should be set to the "chassisredirect" port representing
> > >>>>> > > > centralized gateway functionality in the otherwise distributed router.
> > >>>>> > > > For outbound traffic to be handled in a distributed manner, locally on
> > >>>>> > > > the source chassis, the "outport" should be set to the existing "patch"
> > >>>>> > > > port representing distributed gateway functionality.
> > >>>>> > > >
> > >>>>> > > > Inbound traffic will be directed to the appropriate chassis by
> > >>>>> > > > restricting source MAC address usage and ARP responses to that chassis,
> > >>>>> > > > or by running dynamic routing protocols.
> > >>>>> > > >
> > >>>>> > > > Note that "chassisredirect" ports have no associated IP or MAC addresses.
> > >>>>> > > > Any pipeline stages that depend on port specific IP or MAC addresses
> > >>>>> > > > should be carried out in the context of the distributed port.
> > >>>>> > > >
> > >>>>> > > > Although the abstraction represented by the "chassisredirect" port
> > >>>>> > > > binding is generalized, in this patch the "chassisredirect" port binding
> > >>>>> > > > is only created for NB logical router ports that specify the new
> > >>>>> > > > "redirect-chassis" option.  There is no explicit notion of a
> > >>>>> > > > "chassisredirect" port in the NB database.  The expectation is when
> > >>>>> > > > capabilities are implemented that take advantage of "chassisredirect"
> > >>>>> > > > ports (e.g. NAT), the addition of flows specifying a "chassisredirect"
> > >>>>> > > > port as the outport will also be triggered by the presence of the
> > >>>>> > > > "redirect-chassis" option.  Such flows are added for NB logical router
> > >>>>> > > > ports that specify the "redirect-chassis" option.
> > >>>>> > > >
> > >>>>> > > > Signed-off-by: Mickey Spiegel <[email protected]>
> > >>>>> > >
> > >>>>> > > chassisredirect ports seem incredibly similar to vif ports.  Is the only
> > >>>>> > > difference that the output port is changed at the beginning of the
> > >>>>> > > egress pipeline?  That's something that could be implemented in the
> > >>>>> > > logical egress pipeline with 'outport = "...";'.  We do say that the
> > >>>>> > > outport isn't supposed to be modified in an egress pipeline, but nothing
> > >>>>> > > enforces that and if it's actually useful then we could just change the
> > >>>>> > > documentation.
> > >>>>> > >
> > >>>>> >
> > >>>>> > I don't get the similarity to vif ports.
> > >>>>> >
> > >>>>> > I need to create two different ports for each logical router port
> > >>>>> > specifying a "redirect-chassis". One represents the centralized
> > >>>>> > instance, for traffic that needs to be centralized. The other
> > >>>>> > represents the distributed instance, i.e. just take the local patch
> > >>>>> > port and go to/from the local logical router instance. I wanted the
> > >>>>> > egress pipeline processing to be the same regardless of whether
> > >>>>> > the packet arrived at the egress pipeline on the port representing
> > >>>>> > the centralized instance, or whether the packet arrived at the
> > >>>>> > egress pipeline on the port representing the distributed instance.
> > >>>>> >
> > >>>>> > There is no pipeline processing of the chassisredirect port,
> > >>>>> > except as the outport in the ingress pipeline. Everything else
> > >>>>> > happens in tables 32 and 33.
> > >>>>>
> > >>>>> OK, then I'm having trouble following the description.  For me, here's
> > >>>>> the key paragraphs that led me to my conclusions:
> > >>>>>
> > >>>>>     This patch introduces a new type of SB port_binding called
> > >>>>>     "chassisredirect".  A "chassisredirect" port represents a particular
> > >>>>>     instance, bound to a specific chassis, of an otherwise distributed
> > >>>>>     port.  The ovn-controller on that chassis populates the "chassis"
> > >>>>>     column for this record as an indication for other ovn-controllers of
> > >>>>>     its physical location.  Other ovn-controllers do not treat this port
> > >>>>>     as a local port.
> > >>>>>
> > >>>>>     A "chassisredirect" port should never be used as an "inport".  When
> > >>>>>     an ingress pipeline sets the "outport", it may set the value to a
> > >>>>>     logical port of type "chassisredirect".  This will cause the packet
> > >>>>>     to be directed to a specific chassis to carry out the egress logical
> > >>>>>     router pipeline, in the same way that a logical switch forwards
> > >>>>>     egress traffic to a VIF port residing on a specific chassis.  At the
> > >>>>>     beginning of the egress pipeline, the "outport" will be reset to the
> > >>>>>     value of the distributed port.
> > >>>>>
> > >>>>> The first paragraph appears to say that a chassisredirect port is a port
> > >>>>> on a particular chassis and that its chassis column says what chassis
> > >>>>> it's on.  OK, that's the same as a vif port, right?
> > >>>>>
> > >>>>
> > >>>> Yes, the same as vif, l2gateway, or l3gateway in the sense that this
> > >>>> port is bound to a chassis. No differences there.
> > >>>>
> > >>>>>
> > >>>>> The second paragraph appears to me to say, first, that packets would
> > >>>>> never originate from a chassisredirect port.  OK, fine, no problem.
> > >>>>> Second, it directly makes an analogy to vif ports, and then says that
> > >>>>> the outport changes.  No problem.
> > >>>>>
> > >>>>
> > >>>> Two main differences from vif:
> > >>>> 1. The outport changes. I want the ct_zone assignments in table 33
> > >>>>    and the loopback check in table 34 to be according to the new
> > >>>>    outport.
> > >>>>
> > >>>> 2. There is no pipeline processing of this port. This port has no
> > >>>>    addresses or other configuration. The purpose of the port is to
> > >>>>    tell table 32 to go to a particular chassis, and then tell table 33
> > >>>>    what the real outport should be.
> > >>>>
> > >>>> I got to this notion because a port is the way to tell table 32 to
> > >>>> go to a particular chassis. The first thought was two regular patch
> > >>>> ports, but the idea of two patch ports with the same addresses
> > >>>> is confusing and dangerous. By changing back to the real patch
> > >>>> port right away in the egress pipeline, it avoids those problems.
> > >>>>
> > >>>> Mickey
> > >>>>
> > >>>
> > >>> Let me go back to first principles. I need three sorts of chassis
> > >>> specific behaviors for distributed NAT:
> > >>> 1. Install some flows only on the chassis where a certain logical
> > >>>    port resides. That is is_chassis_resident which you already
> > >>>    reviewed and acked. The nat flows patch at the end of the
> > >>>    patch set uses this mechanism.
> > >>> 2. Install a different set of flows associated with the distributed
> > >>>    gateway port only on the redirect-chassis. There are several
> > >>>    such flows in this patch.
> > >>> 3. Direct some traffic with outport being the distributed gateway
> > >>>    port to the instance of the distributed gateway port on the
> > >>>    redirect-chassis. When this traffic hits table 32, it gets
> > >>>    sent through the normal tunnel to the redirect-chassis.
> > >>>
> > >>> I needed some handle that triggers 3. I decided to make that
> > >>> handle be a port, which I called a "chassisredirect" port. That
> > >>> also allows me to use is_chassis_resident(chassisredirect_port)
> > >>> to solve 2.
> > >>>
> > >>> It is possible to make that handle be something other than a
> > >>> port, as long as table 32 is modified to act on that. In that case,
> > >>> I will need another match "condition" (as I called it) based on
> > >>> that handle, similar to is_chassis_resident but based on
> > >>> whatever handle we decide on instead of port.
> > >>>
> > >>
> > >> I realized earlier tonight that there is a straightforward
> > >> alternative, though it does have one potentially confusing
> > >> aspect.
> > >>
> > >> For some reason, I had been assuming that a port_binding is
> > >> either exclusive to a chassis (in the previous implementation
> > >> with OVS patch ports, it had an ofport), or the port_binding
> > >> exists everywhere and does not have a chassis association
> > >> (is_remote in the previous implementation with OVS patch
> > >> ports).
> > >>
> > >> If this is relaxed and we allow logical patch ports to be
> > >> associated with a chassis, then all I need is a new
> > >> MLF_FORCE_CHASSIS_REDIRECT flag rather than
> > >> a second port_binding with a new "chassisredirect" type.
> > >>
> > >> The potentially confusing aspect is that even though the
> > >> mechanism for associating a logical patch port with a
> > >> chassis is identical to that for other port_binding types such
> > >> as "l3gateway", the association of a chassis with a logical
> > >> patch port has a different meaning than the association of a
> > >> chassis with a VIF, a type "l3gateway" port_binding, or a
> > >> type "l2gateway" port_binding.  For the latter, the association
> > >> is exclusive, i.e. the port only exists on that chassis.  For
> > >> logical patch ports, whether there is an association with a
> > >> chassis or not, the logical patch port exists everywhere
> > >> (subject to the constraints of conditional monitoring).
> > >>
> > >> The chassis association would only be used for a new
> > >> table 32 flow similar to other flows sending packets to
> > >> remote hypervisors for other port_binding types, but with
> > >> a different match condition:
> > >>     match_set_metadata(&match, htonll(dp_key));
> > >>     match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
> > >>     match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
> > >>                          1, MLF_FORCE_CHASSIS_REDIRECT);
> > >>
> > >> Depending on whether the
> > >> MLF_FORCE_CHASSIS_REDIRECT flag is set, the
> > >> packet would either be sent to the remote hypervisor,
> > >> or it would fall through to the table 32 priority 0 fallback
> > >> flow and be processed locally.
> > >>
> > >> The chassis association could also be used for
> > >> evaluation of is_chassis_resident("l3dgw_port") functions
> > >> in flow matches.
> > >>
> > >> If you agree that this approach is more promising than
> > >> type "chassisredirect" ports, I can code this up tomorrow.
> > >>
> > >
> > > I am having trouble making this approach work with the
> > > ARP request table. With the approach of replacing the
> > > logical outport, the ARP request goes to the controller
> > > with the new outport of type "chassisredirect". When the
> > > packet is reinjected, it does eventually end up at the
> > > redirect chassis.
> > >
> > > With the approach of using a flag, the packet is not
> > > hitting the table 32 entry matching the flag. I am not sure
> > > what happens to the packet after it goes up to the
> > > controller, and I am not sure how to debug it further or
> > > what to change to make it work.
> > >
> >
> > I found the bug. It was affecting all packets, not just arp, and
> > was a simple fix. I am still checking all scenarios, but I think
> > I have the approach with the flag instead of a new port type
> > working. I can move forward with either approach, a flag or
> > a new port type as originally proposed.
>
> Do you mind posting the version with the flag?  We'll do one or the
> other.
>

Should I post just that, or the patch set?
I have not integrated it with later patches yet.
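
For reference while comparing the two approaches, here is a minimal,
standalone sketch of the flag-based table 32 decision quoted above.  It
is a toy model in plain C, not OVS code.  The struct names, the enum,
table32_lookup(), and the bit value chosen for
MLF_FORCE_CHASSIS_REDIRECT are all assumptions made for illustration.
Only the three match fields (datapath key, logical outport, flag) come
from the snippet above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; the thread does not say which bit is used. */
#define MLF_FORCE_CHASSIS_REDIRECT (1u << 0)

/* Outcome of the modeled table 32 lookup (names are illustrative). */
enum table32_action {
    TABLE32_SEND_REMOTE,    /* Tunnel to the redirect chassis. */
    TABLE32_PROCESS_LOCAL   /* Priority 0 fallback: continue locally. */
};

/* The logical metadata a packet carries when it reaches table 32. */
struct logical_packet {
    uint64_t dp_key;    /* Logical datapath key. */
    uint32_t outport;   /* Logical output port key. */
    uint32_t flags;     /* Logical flags. */
};

/* The proposed flow for one logical patch port that has a chassis. */
struct redirect_flow {
    uint64_t dp_key;
    uint32_t port_key;
};

/* Models the match: datapath, outport, and the redirect flag must all
 * match for the packet to be sent to the remote chassis.  Anything else
 * falls through to local processing. */
static enum table32_action
table32_lookup(const struct redirect_flow *flow,
               const struct logical_packet *pkt)
{
    bool hit = pkt->dp_key == flow->dp_key
               && pkt->outport == flow->port_key
               && (pkt->flags & MLF_FORCE_CHASSIS_REDIRECT) != 0;

    return hit ? TABLE32_SEND_REMOTE : TABLE32_PROCESS_LOCAL;
}
```

In this model, the same outport yields TABLE32_SEND_REMOTE when the flag
is set and TABLE32_PROCESS_LOCAL when it is clear, which is the
difference from the port-type approach, where the outport value itself
selects the chassis.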

Mickey
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
