On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote:
> On Fri, Dec 06, 2024 at 10:10:21AM +0100,
> [email protected] wrote:
> > On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote:
> > > On Thu, Dec 05, 2024 at 02:08:37PM +0100,
> > > [email protected] wrote:
> > > > Hi Felix and Dumitru,
> > > > Thanks for posting this standalone change Felix. I'm going to
> > > > take
> > > > it
> > > > for a spin and try to integrate it to Frode's original "route
> > > > leaking"
> > > > RFC.
> > > > I have one question/suggestion for the schema that I'll leave
> > > > below.
> > > >
> > > > On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev
> > > > wrote:
> > > > > The Route table will be used in the future to coordinate
> > > > > routing
> > > > > information between northd and ovn-controller.
> > > > > Northd will insert routes that should be advertised to the
> > > > > outside
> > > > > fabric.
> > > > > Ovn-controller will insert routes that have been learned from
> > > > > the
> > > > > outside fabric.
> > > > >
> > > > > Signed-off-by: Felix Huettner <[email protected]>
> > > > > ---
> > > > > v2->v3:
> > > > > * refType is now strong
> > > > > * fixed typos
> > > > >
> > > > >
> > > > > ovn-sb.ovsschema | 29 ++++++++++++++++--
> > > > > ovn-sb.xml | 80
> > > > > ++++++++++++++++++++++++++++++++++++++++++++++++
> > > > > 2 files changed, 107 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
> > > > > index 73abf2c8d..88763f429 100644
> > > > > --- a/ovn-sb.ovsschema
> > > > > +++ b/ovn-sb.ovsschema
> > > > > @@ -1,7 +1,7 @@
> > > > > {
> > > > > "name": "OVN_Southbound",
> > > > > - "version": "20.37.0",
> > > > > - "cksum": "1950136776 31493",
> > > > > + "version": "20.38.0",
> > > > > + "cksum": "1392903395 32893",
> > > > > "tables": {
> > > > > "SB_Global": {
> > > > > "columns": {
> > > > > @@ -617,6 +617,31 @@
> > > > > "type": {"key": "string", "value":
> > > > > "string",
> > > > > "min": 0, "max":
> > > > > "unlimited"}}},
> > > > > "indexes": [["chassis"]],
> > > > > + "isRoot": true},
> > > > > + "Route": {
> > > > > + "columns": {
> > > > > + "datapath":
> > > > > + {"type": {"key": {"type": "uuid",
> > > > > + "refTable":
> > > > > "Datapath_Binding"}}},
> > > > > + "logical_port": {"type": {"key": {"type":
> > > > > "uuid",
> > > > > +
> > > > > "refTable":
> > > > > "Port_Binding",
> > > > > + "refType":
> > > > > "strong"}}},
> > > > > + "ip_prefix": {"type": "string"},
> > > > > + "nexthop": {"type": "string"},
> > > > > + "tracked_port": {"type": {"key": {"type":
> > > > > "uuid",
> > > > > +
> > > > > "refTable":
> > > > > "Port_Binding",
> > > > > + "refType":
> > > > > "strong"},
> > > > > + "min": 0,
> > > > > + "max": 1}},
> > > > > + "type": {"type": {"key": {"type": "string",
> > > > > + "enum": ["set",
> > > > > ["advertise",
> > > > > +
> > > > > "receive"]]},
> > > > > + "min": 1, "max": 1}},
> > > >
> > > > I wonder if this field is necessary. We could use presence or
> > > > absence
> > > > of "nexthop" value to determine whether the route is being
> > > > learned
> > > > or
> > > > advertised. I'm not sure if ovsdb is doing something clever
> > > > with
> > > > enums
> > > > or stores them as raw values in each row, but removing this
> > > > could
> > > > save
> > > > us some space. What do you think?
> > >
> > > Hi Martin,
> > >
> > > i think generally we could use the information in the "nexthop"
> > > column
> > > for this. However i would find this rather surprising if i am not
> > > deeply
> > > in the implementation. For debugging at least it is not obvious.
> > >
> > > I am not sure how much space saving we would get out of this, but
> > > i
> > > don't think that it is worth the cost of less debuggability.
> > > Especially
> > > since each learned/announced route will cause additionally once
> > > or
> > > multiple logical flows to be created which are significantly
> > > larger
> > > than
> > > this type field.
> > >
> > > But i am open to other opinions as well.
> > >
> > > Thanks a lot
> > > Felix
> >
> > Hi Felix,
> > I'm not hell-bent on this change and I agree that it would make the
> > code less explicit. It just occurred to me because I recently saw
> > similar pattern applied in NAT table [0] where NAT rules is
> > considered
> > "distributed" if the record contains both logical_port and
> > external_nat.
>
> Hi Martin,
>
> i think a big difference here is that we need to have both ovn-northd
> and ovn-controller agree to what each of these things mean.
> In the above example it looked like something only northd needs to
> know
> internally to derive the correct flows.
>
> >
> > I don't have enough experience with ovsdb in large deployments, it
> > was
> > just a gut feel that potentially saving ~8bytes for a table that
> > could
> > have potentially 100s or 1000s of records could be impactful.
> > There's also the fact that without the server-side validation (i.e.
> > if
> > tpye is "advertised" then nexthop must not be empty), it's up to
> > the
> > clients to enforce that invalid records are not created.
>
> I just spend some time measureing it by adding a few thousand Routes
> and
> checking the RSS difference.
> With the current schema (actually with Dumitru's requested
> adjustments)
> we are at ~3230 Byte per route. If we remove the "type" we are at
> ~2514
> Byte. So ~716 Byte difference.
>
> If we calculate with 10.000 Routes then we end up at roughly 7MB of
> difference.
> I would think this is worth the clarity.
>
> Additionally it allows us to later actually use nexthop also for
> advertised routes. Allthough honestly i don't know a usecase for
> this.
>
> Whats your opinion on that one?
>
> Thanks a lot
> Felix
Hi Felix,
thanks for running numbers on this. The difference indeed seems
negligible even on large deployments. You are right that the trade-off
for removing the "type" column is probably not worth it.
I've put this version (v3) to work on Friday by integrating it to
fnordahl's original route leaking RFC and one more though occurred to
me.
Would it be worth to split the `Route` table to `Advertised_Route` and
`Learned_Route`? I think this could be beneficial because:
* Both "types" of routes have different "owners". While "Learned
routes" are primarily managed "from the bottom", by the controller,
"Advertised routes" are managed "from the top", by northd. Having two
different components being in charge of keeping the table up to date
seems to complicate things.
* "Advertised routes" could do with just following attributes:
* IP prefix
* logical_port (strong ref to LRP that's supposed to advertise this
prefix)
* With two separate tables, development of learning/advertising would
be more independent as changing requirements in one feature wouldn't
affect development of the other.
With the simplified structure of the `Advertised_Route` table that I
mentioned above, northd would be the only component that adds data to
this table and records would be either explicitly removed by northd
(when NAT or LB gets removed) or implicitly when the LRP is removed and
`logical_port` reference goes away.
What do you think?
Thanks,
Martin.
>
> >
> > The issue of transparency could be solved with documentation and
> > abstraction in parsing method (like in the case of the NAT).
> >
> > Anyway, as I said, this is not a blocker from my side, just a
> > suggestion for consideration.
> >
> > Martin
> >
> > [0]
> > https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530
> > >
> > > >
> > > > Martin.
> > > >
> > > > > + "external_ids": {
> > > > > + "type": {"key": "string", "value":
> > > > > "string",
> > > > > + "min": 0, "max":
> > > > > "unlimited"}}},
> > > > > + "indexes": [["datapath", "logical_port",
> > > > > "ip_prefix",
> > > > > "nexthop",
> > > > > + "type"]],
> > > > > "isRoot": true}
> > > > > }
> > > > > }
> > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > > > > index ea4adc1c3..e435cf243 100644
> > > > > --- a/ovn-sb.xml
> > > > > +++ b/ovn-sb.xml
> > > > > @@ -5217,4 +5217,84 @@ tcp.flags = RST;
> > > > > The set of variable values for a given chassis.
> > > > > </column>
> > > > > </table>
> > > > > +
> > > > > + <table name="Route">
> > > > > + <p>
> > > > > + Each record represents a route that is exported from
> > > > > ovn
> > > > > or
> > > > > imported to
> > > > > + ovn using some dynamic routing logic outside of ovn.
> > > > > + It is populated on the one hand by <code>ovn-
> > > > > northd</code>
> > > > > based on the
> > > > > + addresses, routes and NAT Entries of a
> > > > > + <code>OVN_Northbound.Logical_Router_Port</code>.
> > > > > + On the other hand <code>ovn-controller</code>
> > > > > populates it
> > > > > with routes
> > > > > + learned from outside of OVN.
> > > > > + </p>
> > > > > +
> > > > > + <column name="datapath">
> > > > > + The datapath belonging to the
> > > > > + <code>OVN_Northbound.Logical_Router</code> that this
> > > > > route
> > > > > is
> > > > > valid
> > > > > + for.
> > > > > + </column>
> > > > > +
> > > > > + <column name="logical_port">
> > > > > + <p>
> > > > > + If the type is <code>advertise</code> then this is
> > > > > the
> > > > > logical_port
> > > > > + the router will send packets out.
> > > > > + </p>
> > > > > +
> > > > > + <p>
> > > > > + If the type is <code>receive</code> then this is the
> > > > > logical_port
> > > > > + the route was learned on.
> > > > > + </p>
> > > > > + </column>
> > > > > +
> > > > > + <column name="ip_prefix">
> > > > > + <p>
> > > > > + IP prefix of this route (e.g. 192.168.100.0/24).
> > > > > + </p>
> > > > > + </column>
> > > > > +
> > > > > + <column name="nexthop">
> > > > > + <p>
> > > > > + If the type is <code>advertise</code> then this is
> > > > > empty.
> > > > > + </p>
> > > > > +
> > > > > + <p>
> > > > > + If the type is <code>receive</code> then this is the
> > > > > nexthop
> > > > > ip we
> > > > > + from the outside.
> > > > > + </p>
> > > > > + </column>
> > > > > +
> > > > > + <column name="tracked_port">
> > > > > + <p>
> > > > > + Only relevant for type <code>advertise</code>.
> > > > > +
> > > > > + In combination with a host <code>ip_prefix</code>
> > > > > this
> > > > > tracks the port
> > > > > + OVN will forward the packets for this destination
> > > > > to.
> > > > > +
> > > > > + An announcing chassis can use this information to
> > > > > check
> > > > > if
> > > > > this
> > > > > + destination is local and adjust the route priorities
> > > > > based
> > > > > on that.
> > > > > + </p>
> > > > > + </column>
> > > > > +
> > > > > + <column name="type">
> > > > > + <p>
> > > > > + If the route is to be exported from OVN to the
> > > > > outside
> > > > > network or if
> > > > > + it is imported from the outside network.
> > > > > + </p>
> > > > > + <ul>
> > > > > + <li>
> > > > > + <code>advertise</code>: This route should be
> > > > > advertised to
> > > > > the
> > > > > + outside network.
> > > > > + </li>
> > > > > + <li>
> > > > > + <code>receive</code>: This route has been learned
> > > > > from
> > > > > the
> > > > > outside
> > > > > + network.
> > > > > + </li>
> > > > > + </ul>
> > > > > + </column>
> > > > > +
> > > > > + <column name="external_ids">
> > > > > + See <em>External IDs</em> at the beginning of this
> > > > > document.
> > > > > + </column>
> > > > > + </table>
> > > > > </database>
> > > >
> > > > _______________________________________________
> > > > dev mailing list
> > > > [email protected]
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev