On Thu, Jan 12, 2017 at 5:12 PM, Mickey Spiegel <mickeys....@gmail.com> wrote:
>
> On Sun, Jan 8, 2017 at 10:30 PM, Mickey Spiegel <mickeys....@gmail.com>
> wrote:
>
>>
>> On Fri, Jan 6, 2017 at 8:31 PM, Mickey Spiegel <mickeys....@gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Jan 6, 2017 at 4:21 PM, Mickey Spiegel <mickeys....@gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Fri, Jan 6, 2017 at 4:11 PM, Ben Pfaff <b...@ovn.org> wrote:
>>>>
>>>>> On Fri, Jan 06, 2017 at 03:47:03PM -0800, Mickey Spiegel wrote:
>>>>> > On Fri, Jan 6, 2017 at 3:20 PM, Ben Pfaff <b...@ovn.org> wrote:
>>>>> >
>>>>> > > On Fri, Jan 06, 2017 at 12:00:30PM -0800, Mickey Spiegel wrote:
>>>>> > > > Currently OVN handles all logical router ports in a distributed manner,
>>>>> > > > creating instances on each chassis. The logical router ingress and
>>>>> > > > egress pipelines are traversed locally on the source chassis.
>>>>> > > >
>>>>> > > > In order to support advanced features such as one-to-many NAT (aka IP
>>>>> > > > masquerading), where multiple private IP addresses spread across
>>>>> > > > multiple chassis are mapped to one public IP address, it will be
>>>>> > > > necessary to handle some of the logical router processing on a specific
>>>>> > > > chassis in a centralized manner.
>>>>> > > >
>>>>> > > > The goal of this patch is to develop abstractions that allow for a
>>>>> > > > subset of router gateway traffic to be handled in a centralized manner
>>>>> > > > (e.g. one-to-many NAT traffic), while allowing for other subsets of
>>>>> > > > router gateway traffic to be handled in a distributed manner (e.g.
>>>>> > > > floating IP traffic).
>>>>> > > >
>>>>> > > > This patch introduces a new type of SB port_binding called
>>>>> > > > "chassisredirect". A "chassisredirect" port represents a particular
>>>>> > > > instance, bound to a specific chassis, of an otherwise distributed
>>>>> > > > port.
>>>>> > > > The ovn-controller on that chassis populates the "chassis"
>>>>> > > > column for this record as an indication for other ovn-controllers of
>>>>> > > > its physical location. Other ovn-controllers do not treat this port
>>>>> > > > as a local port.
>>>>> > > >
>>>>> > > > A "chassisredirect" port should never be used as an "inport". When an
>>>>> > > > ingress pipeline sets the "outport", it may set the value to a logical
>>>>> > > > port of type "chassisredirect". This will cause the packet to be
>>>>> > > > directed to a specific chassis to carry out the egress logical router
>>>>> > > > pipeline, in the same way that a logical switch forwards egress traffic
>>>>> > > > to a VIF port residing on a specific chassis. At the beginning of the
>>>>> > > > egress pipeline, the "outport" will be reset to the value of the
>>>>> > > > distributed port.
>>>>> > > >
>>>>> > > > For outbound traffic to be handled in a centralized manner, the
>>>>> > > > "outport" should be set to the "chassisredirect" port representing
>>>>> > > > centralized gateway functionality in the otherwise distributed router.
>>>>> > > > For outbound traffic to be handled in a distributed manner, locally on
>>>>> > > > the source chassis, the "outport" should be set to the existing "patch"
>>>>> > > > port representing distributed gateway functionality.
>>>>> > > >
>>>>> > > > Inbound traffic will be directed to the appropriate chassis by
>>>>> > > > restricting source MAC address usage and ARP responses to that chassis,
>>>>> > > > or by running dynamic routing protocols.
>>>>> > > >
>>>>> > > > Note that "chassisredirect" ports have no associated IP or MAC addresses.
>>>>> > > > Any pipeline stages that depend on port specific IP or MAC addresses
>>>>> > > > should be carried out in the context of the distributed port.
>>>>> > > >
>>>>> > > > Although the abstraction represented by the "chassisredirect" port
>>>>> > > > binding is generalized, in this patch the "chassisredirect" port binding
>>>>> > > > is only created for NB logical router ports that specify the new
>>>>> > > > "redirect-chassis" option. There is no explicit notion of a
>>>>> > > > "chassisredirect" port in the NB database. The expectation is when
>>>>> > > > capabilities are implemented that take advantage of "chassisredirect"
>>>>> > > > ports (e.g. NAT), the addition of flows specifying a "chassisredirect"
>>>>> > > > port as the outport will also be triggered by the presence of the
>>>>> > > > "redirect-chassis" option. Such flows are added for NB logical router
>>>>> > > > ports that specify the "redirect-chassis" option.
>>>>> > > >
>>>>> > > > Signed-off-by: Mickey Spiegel <mickeys....@gmail.com>
>>>>> > >
>>>>> > > chassisredirect ports seem incredibly similar to vif ports. Is the only
>>>>> > > difference that the output port is changed at the beginning of the
>>>>> > > egress pipeline? That's something that could be implemented in the
>>>>> > > logical egress pipeline with 'outport = "...";'. We do say that the
>>>>> > > outport isn't supposed to be modified in an egress pipeline, but nothing
>>>>> > > enforces that and if it's actually useful then we could just change the
>>>>> > > documentation.
>>>>> > >
>>>>> >
>>>>> > I don't get the similarity to vif ports.
>>>>> >
>>>>> > I need to create two different ports for each logical router port
>>>>> > specifying a "redirect-chassis". One represents the centralized
>>>>> > instance, for traffic that needs to be centralized. The other
>>>>> > represents the distributed instance, i.e. just take the local patch
>>>>> > port and go to/from the local logical router instance.
>>>>> > I wanted the egress pipeline processing to be the same regardless of
>>>>> > whether the packet arrived at the egress pipeline on the port
>>>>> > representing the centralized instance, or whether the packet arrived
>>>>> > at the egress pipeline on the port representing the distributed
>>>>> > instance.
>>>>> >
>>>>> > There is no pipeline processing of the chassisredirect port,
>>>>> > except as the outport in the ingress pipeline. Everything else
>>>>> > happens in tables 32 and 33.
>>>>>
>>>>> OK, then I'm having trouble following the description. For me, here's
>>>>> the key paragraphs that led me to my conclusions:
>>>>>
>>>>>     This patch introduces a new type of SB port_binding called
>>>>>     "chassisredirect". A "chassisredirect" port represents a particular
>>>>>     instance, bound to a specific chassis, of an otherwise distributed
>>>>>     port. The ovn-controller on that chassis populates the "chassis"
>>>>>     column for this record as an indication for other ovn-controllers of
>>>>>     its physical location. Other ovn-controllers do not treat this port
>>>>>     as a local port.
>>>>>
>>>>>     A "chassisredirect" port should never be used as an "inport". When
>>>>>     an ingress pipeline sets the "outport", it may set the value to a
>>>>>     logical port of type "chassisredirect". This will cause the packet
>>>>>     to be directed to a specific chassis to carry out the egress logical
>>>>>     router pipeline, in the same way that a logical switch forwards
>>>>>     egress traffic to a VIF port residing on a specific chassis. At the
>>>>>     beginning of the egress pipeline, the "outport" will be reset to the
>>>>>     value of the distributed port.
>>>>>
>>>>> The first paragraph appears to say that a chassisredirect port is a port
>>>>> on a particular chassis and that its chassis column says what chassis
>>>>> it's on. OK, that's the same as a vif port, right?
>>>>>
>>>>
>>>> Yes, the same as vif, l2gateway, or l3gateway in the sense that this
>>>> port is bound to a chassis. No differences there.
>>>>
>>>>>
>>>>> The second paragraph appears to me to say, first, that packets would
>>>>> never originate from a chassisredirect port. OK, fine, no problem.
>>>>> Second, it directly makes an analogy to vif ports, and then says that
>>>>> the outport changes. No problem.
>>>>>
>>>>
>>>> Two main differences from vif:
>>>> 1. The outport changes. I want the ct_zone assignments in table 33
>>>>    and the loopback check in table 34 to be according to the new
>>>>    outport.
>>>>
>>>> 2. There is no pipeline processing of this port. This port has no
>>>>    addresses or other configuration. The purpose of the port is to
>>>>    tell table 32 to go to a particular chassis, and then tell table 33
>>>>    what the real outport should be.
>>>>
>>>> I got to this notion because a port is the way to tell table 32 to
>>>> go to a particular chassis. The first thought was two regular patch
>>>> ports, but the idea of two patch ports with the same addresses
>>>> is confusing and dangerous. By changing back to the real patch
>>>> port right away in the egress pipeline, it avoids those problems.
>>>>
>>>> Mickey
>>>>
>>>
>>> Let me go back to first principles. I need three sorts of chassis
>>> specific behaviors for distributed NAT:
>>> 1. Install some flows only on the chassis where a certain logical
>>>    port resides. That is is_chassis_resident which you already
>>>    reviewed and acked. The nat flows patch at the end of the
>>>    patch set uses this mechanism.
>>> 2. Install a different set of flows associated with the distributed
>>>    gateway port only on the redirect-chassis. There are several
>>>    such flows in this patch.
>>> 3. Direct some traffic with outport being the distributed gateway
>>>    port to the instance of the distributed gateway port on the
>>>    redirect-chassis.
>>>    When this traffic hits table 32, it gets
>>>    sent through the normal tunnel to the redirect-chassis.
>>>
>>> I needed some handle that triggers 3. I decided to make that
>>> handle be a port, which I called a "chassisredirect" port. That
>>> also allows me to use is_chassis_resident(chassisredirect_port)
>>> to solve 2.
>>>
>>> It is possible to make that handle be something other than a
>>> port, as long as table 32 is modified to act on that. In that case,
>>> I will need another match "condition" (as I called it) based on
>>> that handle, similar to is_chassis_resident but based on
>>> whatever handle we decide on instead of port.
>>>
>>
>> I realized earlier tonight that there is a straightforward
>> alternative, though it does have one potentially confusing
>> aspect.
>>
>> For some reason, I had been assuming that a port_binding is
>> either exclusive to a chassis (in the previous implementation
>> with OVS patch ports, it had an ofport), or the port_binding
>> exists everywhere and does not have a chassis association
>> (is_remote in the previous implementation with OVS patch
>> ports).
>>
>> If this is relaxed and we allow logical patch ports to be
>> associated with a chassis, then all I need is a new
>> MLF_FORCE_CHASSIS_REDIRECT flag rather than
>> a second port_binding with a new "chassisredirect" type.
>>
>> The potentially confusing aspect is that even though the
>> mechanism for associating a logical patch port with a
>> chassis is identical to that for other port_binding types such
>> as "l3gateway", the association of a chassis with a logical
>> patch port has a different meaning than the association of a
>> chassis with a VIF, a type "l3gateway" port_binding, or a
>> type "l2gateway" port_binding. For the latter, the association
>> is exclusive, i.e. the port only exists on that chassis.
>> For logical patch ports, whether there is an association with a
>> chassis or not, the logical patch port exists everywhere
>> (subject to the constraints of conditional monitoring).
>>
>> The chassis association would only be used for a new
>> table 32 flow similar to other flows sending packets to
>> remote hypervisors for other port_binding types, but with
>> a different match condition:
>>     match_set_metadata(&match, htonll(dp_key));
>>     match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
>>     match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>                          1, MLF_FORCE_CHASSIS_REDIRECT);
>>
>> Depending on whether the
>> MLF_FORCE_CHASSIS_REDIRECT flag is set, the
>> packet would either be sent to the remote hypervisor,
>> or it would fall through to the table 32 priority 0 fallback
>> flow and be processed locally.
>>
>> The chassis association could also be used for
>> evaluation of is_chassis_resident("l3dgw_port") functions
>> in flow matches.
>>
>> If you agree that this approach is more promising than
>> type "chassisredirect" ports, I can code this up tomorrow.
>>
>
> I am having trouble making this approach work with the
> ARP request table. With the approach of replacing the
> logical outport, the ARP request goes to the controller
> with the new outport of type "chassisredirect". When the
> packet is reinjected, it does eventually end up at the
> redirect chassis.
>
> With the approach of using a flag, the packet is not
> hitting the table 32 entry matching the flag. I am not sure
> what happens to the packet after it goes up to the
> controller, and I am not sure how to debug it further or
> what to change to make it work.
>

I found the bug. It was affecting all packets, not just ARP, and was
a simple fix. I am still checking all scenarios, but I think I have
the approach with the flag instead of a new port type working.

I can move forward with either approach, a flag or a new port type
as originally proposed.
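
To make the flag-based alternative concrete, here is a minimal,
self-contained C sketch of the table 32 decision described above. It
only illustrates the logic and is not ovn-controller code: every
SKETCH_/sketch_ name, both structs, and the bit chosen for the flag are
hypothetical, and the assumption that the redirect flow is simply absent
on the redirect chassis itself is mine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MLF_FORCE_CHASSIS_REDIRECT; the bit
 * position is an assumption made only for this sketch. */
#define SKETCH_MLF_FORCE_CHASSIS_REDIRECT (1u << 0)

enum sketch_action {
    SKETCH_PROCESS_LOCALLY,           /* priority 0 fallback flow */
    SKETCH_TUNNEL_TO_REDIRECT_CHASSIS /* send to the remote hypervisor */
};

/* Simplified view of the per-packet logical fields matched in table 32. */
struct sketch_pkt {
    uint64_t dp_key;      /* logical datapath (metadata) */
    uint32_t outport_key; /* logical outport */
    uint32_t flags;       /* logical flags register */
};

/* Simplified view of the new table 32 flow for a logical patch port
 * that has a chassis association. */
struct sketch_redirect_flow {
    uint64_t dp_key;
    uint32_t port_key;
};

/* True when datapath, outport, and the redirect flag all match. */
static bool
sketch_redirect_flow_matches(const struct sketch_redirect_flow *flow,
                             const struct sketch_pkt *pkt)
{
    return pkt->dp_key == flow->dp_key
           && pkt->outport_key == flow->port_key
           && (pkt->flags & SKETCH_MLF_FORCE_CHASSIS_REDIRECT) != 0;
}

/* 'flow' is NULL when the redirect flow is not installed on this
 * chassis (assumed here to be the case on the redirect chassis). */
static enum sketch_action
sketch_table32(const struct sketch_redirect_flow *flow,
               const struct sketch_pkt *pkt)
{
    if (flow != NULL && sketch_redirect_flow_matches(flow, pkt)) {
        return SKETCH_TUNNEL_TO_REDIRECT_CHASSIS;
    }
    return SKETCH_PROCESS_LOCALLY;
}
```

With a flow for datapath 7 and port 3, a packet to that outport with the
flag set is tunneled, while the same packet with the flag clear falls
through and is processed locally.
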

Mickey

>
> Mickey
>
>
>> Mickey
>>
>>
>>
>>> Mickey
>>>
>>>
>>>>
>>>>> I guess that I must be missing important points, but that's why I
>>>>> interpreted the text as I did. Can you help me figure out why I'm not
>>>>> following?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ben.
>>>>>
>>>>
>>>>
>>>
>>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev