On Mon, Dec 09, 2024 at 02:20:03PM +0100, Dumitru Ceara wrote:
> On 12/9/24 2:14 PM, [email protected] wrote:
> > On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote:
> >> On Fri, Dec 06, 2024 at 10:10:21AM +0100,
> >> [email protected] wrote:
> >>> On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote:
> >>>> On Thu, Dec 05, 2024 at 02:08:37PM +0100,
> >>>> [email protected] wrote:
> >>>>> Hi Felix and Dumitru,
> >>>>> Thanks for posting this standalone change Felix. I'm going to
> >>>>> take
> >>>>> it
> >>>>> for a spin and try to integrate it to Frode's original "route
> >>>>> leaking"
> >>>>> RFC.
> >>>>> I have one question/suggestion for the schema that I'll leave
> >>>>> below.
> >>>>>
> >>>>> On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev
> >>>>> wrote:
> >>>>>> The Route table will be used in the future to coordinate
> >>>>>> routing
> >>>>>> information between northd and ovn-controller.
> >>>>>> Northd will insert routes that should be advertised to the
> >>>>>> outside
> >>>>>> fabric.
> >>>>>> Ovn-controller will insert routes that have been learned from
> >>>>>> the
> >>>>>> outside fabric.
> >>>>>>
> >>>>>> Signed-off-by: Felix Huettner <[email protected]>
> >>>>>> ---
> >>>>>> v2->v3:
> >>>>>>   * refType is now strong
> >>>>>>   * fixed typos
> >>>>>>         
> >>>>>>
> >>>>>>  ovn-sb.ovsschema | 29 ++++++++++++++++--
> >>>>>>  ovn-sb.xml       | 80
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>  2 files changed, 107 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
> >>>>>> index 73abf2c8d..88763f429 100644
> >>>>>> --- a/ovn-sb.ovsschema
> >>>>>> +++ b/ovn-sb.ovsschema
> >>>>>> @@ -1,7 +1,7 @@
> >>>>>>  {
> >>>>>>      "name": "OVN_Southbound",
> >>>>>> -    "version": "20.37.0",
> >>>>>> -    "cksum": "1950136776 31493",
> >>>>>> +    "version": "20.38.0",
> >>>>>> +    "cksum": "1392903395 32893",
> >>>>>>      "tables": {
> >>>>>>          "SB_Global": {
> >>>>>>              "columns": {
> >>>>>> @@ -617,6 +617,31 @@
> >>>>>>                      "type": {"key": "string", "value":
> >>>>>> "string",
> >>>>>>                               "min": 0, "max":
> >>>>>> "unlimited"}}},
> >>>>>>              "indexes": [["chassis"]],
> >>>>>> +            "isRoot": true},
> >>>>>> +        "Route": {
> >>>>>> +            "columns": {
> >>>>>> +                "datapath":
> >>>>>> +                    {"type": {"key": {"type": "uuid",
> >>>>>> +                                      "refTable":
> >>>>>> "Datapath_Binding"}}},
> >>>>>> +                "logical_port": {"type": {"key": {"type":
> >>>>>> "uuid",
> >>>>>> +                                                 
> >>>>>> "refTable":
> >>>>>> "Port_Binding",
> >>>>>> +                                                  "refType":
> >>>>>> "strong"}}},
> >>>>>> +                "ip_prefix": {"type": "string"},
> >>>>>> +                "nexthop": {"type": "string"},
> >>>>>> +                "tracked_port": {"type": {"key": {"type":
> >>>>>> "uuid",
> >>>>>> +                                                 
> >>>>>> "refTable":
> >>>>>> "Port_Binding",
> >>>>>> +                                                  "refType":
> >>>>>> "strong"},
> >>>>>> +                                          "min": 0,
> >>>>>> +                                          "max": 1}},
> >>>>>> +                "type": {"type": {"key": {"type": "string",
> >>>>>> +                                          "enum": ["set",
> >>>>>> ["advertise",
> >>>>>> +                                                          
> >>>>>> "receive"]]},
> >>>>>> +                                    "min": 1, "max": 1}},
> >>>>>
> >>>>> I wonder if this field is necessary. We could use presence or
> >>>>> absence
> >>>>> of "nexthop" value to determine whether the route is being
> >>>>> learned
> >>>>> or
> >>>>> advertised. I'm not sure if ovsdb is doing something clever
> >>>>> with
> >>>>> enums
> >>>>> or stores them as raw values in each row, but removing this
> >>>>> could
> >>>>> save
> >>>>> us some space. What do you think? 
> >>>>
> >>>> Hi Martin,
> >>>>
> >>>> i think generally we could use the information in the "nexthop"
> >>>> column
> >>>> for this. However i would find this rather surprising if i am not
> >>>> deeply
> >>>> in the implementation. For debugging at least it is not obvious.
> >>>>
> >>>> I am not sure how much space saving we would get out of this, but
> >>>> i
> >>>> don't think that it is worth the cost of less debuggability.
> >>>> Especially
> >>>> since each learned/announced route will cause additionally once
> >>>> or
> >>>> multiple logical flows to be created which are significantly
> >>>> larger
> >>>> than
> >>>> this type field.
> >>>>
> >>>> But i am open to other opinions as well.
> >>>>
> >>>> Thanks a lot
> >>>> Felix
> >>>
> >>> Hi Felix,
> >>> I'm not hell-bent on this change and I agree that it would make the
> >>> code less explicit. It just occurred to me because I recently saw
> >>> similar pattern applied in NAT table [0] where NAT rules is
> >>> considered
> >>> "distributed" if the record contains both logical_port and
> >>> external_nat.
> >>
> >> Hi Martin,
> >>
> >> i think a big difference here is that we need to have both ovn-northd
> >> and ovn-controller agree to what each of these things mean.
> >> In the above example it looked like something only northd needs to
> >> know
> >> internally to derive the correct flows.
> >>
> >>>
> >>> I don't have enough experience with ovsdb in large deployments, it
> >>> was
> >>> just a gut feel that potentially saving ~8bytes for a table that
> >>> could
> >>> have potentially 100s or 1000s of records could be impactful.
> >>> There's also the fact that without the server-side validation (i.e.
> >>> if
> >>> tpye is "advertised" then nexthop must not be empty), it's up to
> >>> the
> >>> clients to enforce that invalid records are not created.
> >>
> >> I just spend some time measureing it by adding a few thousand Routes
> >> and
> >> checking the RSS difference.
> >> With the current schema (actually with Dumitru's requested
> >> adjustments)
> >> we are at ~3230 Byte per route. If we remove the "type" we are at
> >> ~2514
> >> Byte. So ~716 Byte difference.
> >>
> >> If we calculate with 10.000 Routes then we end up at roughly 7MB of
> >> difference.
> >> I would think this is worth the clarity.
> >>
> >> Additionally it allows us to later actually use nexthop also for
> >> advertised routes. Allthough honestly i don't know a usecase for
> >> this.
> >>
> >> Whats your opinion on that one?
> >>
> >> Thanks a lot
> >> Felix
> > 
> > Hi Felix,
> > thanks for running numbers on this. The difference indeed seems
> > negligible even on large deployments. You are right that the trade-off
> > for removing the "type" column is probably not worth it.
> > 
> > I've put this version (v3) to work on Friday by integrating it to
> > fnordahl's original route leaking RFC and one more though occurred to
> > me.
> > 
> > Would it be worth to split the `Route` table to `Advertised_Route` and
> > `Learned_Route`? I think this could be beneficial because:
> >   * Both "types" of routes have different "owners". While "Learned
> > routes" are primarily managed "from the bottom", by the controller,
> > "Advertised routes" are managed "from the top", by northd. Having two
> > different components being in charge of keeping the table up to date
> > seems to complicate things.
> >   * "Advertised routes" could do with just following attributes:
> >     * IP prefix
> >     * logical_port (strong ref to LRP that's supposed to advertise this
> > prefix)
> >   * With two separate tables, development of learning/advertising would
> > be more independent as changing requirements in one feature wouldn't
> > affect development of the other.
> > 
> > With the simplified structure of the `Advertised_Route` table that I
> > mentioned above, northd would be the only component that adds data to
> > this table and records would be either explicitly removed by northd
> > (when NAT or LB gets removed) or implicitly when the LRP is removed and
> > `logical_port` reference goes away.
> > 
> > What do you think?
> > 
> 
> Good point, Martin.  Now that you mention it it kind of makes sense that
> we don't mix Advertised with Learnt.  It reduces a bit complexity and
> chance of getting I-P wrong in ovn-controller / ovn-northd too.
> 
> +1 from me for separate tables.

Hi Martin, Hi Dumitru,

thanks for the comment.
This should also make the logic in northd and ovn-controller more
simple.
I will try to incorporate this and the comments on v2 into my current
code to see if i find any kind of issue with that and would afterwards
post the next version.

Thanks a lot
Felix

> 
> Regards,
> Dumitru
> 
> > Thanks,
> > Martin.
> > 
> >>
> >>>
> >>> The issue of transparency could be solved with documentation and
> >>> abstraction in parsing method (like in the case of the NAT).
> >>>
> >>> Anyway, as I said, this is not a blocker from my side, just a
> >>> suggestion for consideration.
> >>>
> >>> Martin
> >>>
> >>> [0]
> >>> https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530
> >>>>
> >>>>>
> >>>>> Martin.
> >>>>>
> >>>>>> +                "external_ids": {
> >>>>>> +                    "type": {"key": "string", "value":
> >>>>>> "string",
> >>>>>> +                             "min": 0, "max":
> >>>>>> "unlimited"}}},
> >>>>>> +            "indexes": [["datapath", "logical_port",
> >>>>>> "ip_prefix",
> >>>>>> "nexthop",
> >>>>>> +                         "type"]],
> >>>>>>              "isRoot": true}
> >>>>>>      }
> >>>>>>  }
> >>>>>> diff --git a/ovn-sb.xml b/ovn-sb.xml
> >>>>>> index ea4adc1c3..e435cf243 100644
> >>>>>> --- a/ovn-sb.xml
> >>>>>> +++ b/ovn-sb.xml
> >>>>>> @@ -5217,4 +5217,84 @@ tcp.flags = RST;
> >>>>>>        The set of variable values for a given chassis.
> >>>>>>      </column>
> >>>>>>    </table>
> >>>>>> +
> >>>>>> +  <table name="Route">
> >>>>>> +    <p>
> >>>>>> +      Each record represents a route that is exported from
> >>>>>> ovn
> >>>>>> or
> >>>>>> imported to
> >>>>>> +      ovn using some dynamic routing logic outside of ovn.
> >>>>>> +      It is populated on the one hand by <code>ovn-
> >>>>>> northd</code>
> >>>>>> based on the
> >>>>>> +      addresses, routes and NAT Entries of a
> >>>>>> +      <code>OVN_Northbound.Logical_Router_Port</code>.
> >>>>>> +      On the other hand <code>ovn-controller</code>
> >>>>>> populates it
> >>>>>> with routes
> >>>>>> +      learned from outside of OVN.
> >>>>>> +    </p>
> >>>>>> +
> >>>>>> +    <column name="datapath">
> >>>>>> +      The datapath belonging to the
> >>>>>> +      <code>OVN_Northbound.Logical_Router</code> that this
> >>>>>> route
> >>>>>> is
> >>>>>> valid
> >>>>>> +      for.
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="logical_port">
> >>>>>> +      <p>
> >>>>>> +        If the type is <code>advertise</code> then this is
> >>>>>> the
> >>>>>> logical_port
> >>>>>> +        the router will send packets out.
> >>>>>> +      </p>
> >>>>>> +
> >>>>>> +      <p>
> >>>>>> +        If the type is <code>receive</code> then this is the
> >>>>>> logical_port
> >>>>>> +        the route was learned on.
> >>>>>> +      </p>
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="ip_prefix">
> >>>>>> +      <p>
> >>>>>> +        IP prefix of this route (e.g. 192.168.100.0/24).
> >>>>>> +      </p>
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="nexthop">
> >>>>>> +      <p>
> >>>>>> +        If the type is <code>advertise</code> then this is
> >>>>>> empty.
> >>>>>> +      </p>
> >>>>>> +
> >>>>>> +      <p>
> >>>>>> +        If the type is <code>receive</code> then this is the
> >>>>>> nexthop
> >>>>>> ip we
> >>>>>> +        from the outside.
> >>>>>> +      </p>
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="tracked_port">
> >>>>>> +      <p>
> >>>>>> +        Only relevant for type <code>advertise</code>.
> >>>>>> +
> >>>>>> +        In combination with a host <code>ip_prefix</code>
> >>>>>> this
> >>>>>> tracks the port
> >>>>>> +        OVN will forward the packets for this destination
> >>>>>> to.
> >>>>>> +
> >>>>>> +        An announcing chassis can use this information to
> >>>>>> check
> >>>>>> if
> >>>>>> this
> >>>>>> +        destination is local and adjust the route priorities
> >>>>>> based
> >>>>>> on that.
> >>>>>> +      </p>
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="type">
> >>>>>> +      <p>
> >>>>>> +        If the route is to be exported from OVN to the
> >>>>>> outside
> >>>>>> network or if
> >>>>>> +        it is imported from the outside network.
> >>>>>> +      </p>
> >>>>>> +      <ul>
> >>>>>> +        <li>
> >>>>>> +          <code>advertise</code>: This route should be
> >>>>>> advertised to
> >>>>>> the
> >>>>>> +          outside network.
> >>>>>> +       </li>
> >>>>>> +        <li>
> >>>>>> +          <code>receive</code>: This route has been learned
> >>>>>> from
> >>>>>> the
> >>>>>> outside
> >>>>>> +          network.
> >>>>>> +        </li>
> >>>>>> +      </ul>
> >>>>>> +    </column>
> >>>>>> +
> >>>>>> +    <column name="external_ids">
> >>>>>> +      See <em>External IDs</em> at the beginning of this
> >>>>>> document.
> >>>>>> +    </column>
> >>>>>> +  </table>
> >>>>>>  </database>
> >>>>>
> >>>>> _______________________________________________
> >>>>> dev mailing list
> >>>>> [email protected]
> >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>>
> >>> _______________________________________________
> >>> dev mailing list
> >>> [email protected]
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > 
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to