Hi all,
I just caught up on this discussion and wanted to complicate things
further by suggesting another idea. I think the Red Hat folks have heard
this already, but I'm not sure whether it has been brought up on this list.
Aside from this issue, there is also this high-priority issue from Red
Hat OpenStack: https://bugzilla.redhat.com/show_bug.cgi?id=2123176 .
IMO, this all converges on the idea of introducing a third database to
OVN. We can refer to this as the "Status" DB.
The Status DB would be a place for state information generated by
OVN/OVS to be stored. Some ideas for existing things that could go in
the Status DB would be:
* Logical port up/down state.
* Logical switch port dynamic addresses (maybe, this is more complicated)
* BFD status
* Logical port installation status and installation timestamp.
In addition to these existing items, the Status DB would be a place for
new items that do not exist yet, such as:
* Load balancer health check status
* Logical port packet/byte counts
* Gateway port bound chassis
Implementing the Status DB would cement the following relationship
between the DBs:
NB DB: CMS writes, OVN reads
SB DB: OVN-internal
Status DB: CMS reads, OVN writes
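To make the split above concrete, here is a minimal sketch that only encodes the three writer/reader pairs as listed. The "Status" database does not exist yet, and the `DbRole` class and `databases_touched_by` helper are hypothetical names used purely for illustration; nothing here talks to a real OVSDB server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbRole:
    writer: str  # party that produces the data
    reader: str  # party that consumes the data


# Roles as proposed in this mail. "Status" is the proposed third DB.
ROLES = {
    "NB": DbRole(writer="CMS", reader="OVN"),
    "SB": DbRole(writer="OVN", reader="OVN"),      # OVN-internal
    "Status": DbRole(writer="OVN", reader="CMS"),
}


def databases_touched_by(party: str) -> list:
    """Return the names of the DBs a party writes to or reads from."""
    return sorted(
        name
        for name, role in ROLES.items()
        if party in (role.writer, role.reader)
    )


if __name__ == "__main__":
    print(databases_touched_by("CMS"))
    print(databases_touched_by("OVN"))
```

Under this model the CMS never needs an SB connection, which is the point of the proposal.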
It may be tempting to get this patch merged as-is, with the intention of
migrating this to the new DB once it gets implemented. I don't think
this is a good idea. Between this issue and the one I linked, I think
the implementation of a Status DB is a good idea, and one that should be
implemented very soon.
Since this particular problem is already worked around by OpenStack, I
think it makes more sense to implement this feature in a way that will
be easier to maintain long-term than to get it in quickly. If we merge
this as-is, then we are on the hook for supporting this status in the NB
DB for quite a long time since we would need to take time to deprecate
it properly. If we instead treat this as the impetus to write the Status
DB, then I think this lightweight use-case would give us a good starting
point towards adding the other items we're interested in.
What do you think?
On 4/13/23 09:32, Lucas Martins wrote:
Hi Han, Dumitru and Luis,
Thanks for the discussion and ideas so far. My reply is inline:
On Thu, Apr 13, 2023 at 10:45 AM Luis Tomas Bolivar <[email protected]> wrote:
On Thu, Apr 13, 2023 at 9:33 AM Dumitru Ceara <[email protected]> wrote:
On 4/12/23 23:07, Han Zhou wrote:
On Wed, Apr 12, 2023 at 8:01 AM <[email protected]> wrote:
From: Lucas Alvares Gomes <[email protected]>
In order for the CMS to know which Chassis a distributed gateway port
is bound to, this patch updates the ovn-northd daemon to populate the
Logical_Router_Port table with that information.
To avoid changing the database schema, ovn-northd is setting a new key
called "hosting-chassis" in the options column from the LRP table. This
key value points to the name of the Chassis that is currently hosting
the distributed port.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2107515
Signed-off-by: Lucas Alvares Gomes <[email protected]>
Hi, Lucas, Han,
Thanks Lucas for the patch. However, in my opinion the chassis binding
information belongs to SB and should stay there. Otherwise we would have to
make it consistent for LSPs and update the chassis information for them, too,
which I think is not good in terms of clarity and extra control plane load.
We'd better keep the separation between NB and SB clear and avoid propagating
data back and forth between them.
I partially agree with this but it also feels wrong that the CMS
accesses the SB directly. In an ideal world (and I know that's not the
case today for neutron or ovn-k8s) the CMS should not care about what's
in the SB; that is internal OVN data.
Just to add some extra input here. As Dumitru mentioned, it is not just a
scaling issue: accessing the SB has its own problems, as things can change
there at any time (it has already happened), breaking the CMS logic that
reacts to those changes. If we don't have the information in the NB, we need
two connections: one to the NB (to be as safe as possible from SB changes)
and one to the SB to get the chassis information.
Right. So the idea is to have the CMS connect only to the
Northbound database instead of maintaining connections to both
databases (helping scalability). I don't know what the consensus is
but, if we agree that the Southbound database is used to store
internal OVN data, I think it would be in everyone's favour if the CMS
only used the Northbound database because, as Luis pointed out, apart
from the scalability issues, the data structure in the Southbound
database can change over time without any backwards compatibility, and
that breaks us (it has already happened).
Also, note that there is already chassis information on the
logical_switch_ports in the NB DB, so adding that for the cr-lrps should not
be that different. Adding the active chassis to the HA_Chassis_group also
sounds good.
So I believe this is the option "requested-chassis" that Neutron sets
on the LSP. The difference is that this option is set by the CMS, while
the new option "hosting-chassis" from my patch is set by northd
instead. But there are still similarities, because it's also the CMS
that sets the ha_chassis_group (or gateway_chassis) for a port to make
it HA. The proposed "hosting-chassis" option is just a way for northd
to give the CMS feedback about which chassis from the group the
port ended up bound to.
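As an illustration of how a CMS might consume this feedback, here is a minimal sketch. It assumes the Logical_Router_Port "options" column is available as a plain string-to-string mapping and that the key is named "hosting-chassis" as proposed in the patch; the `hosting_chassis` helper and the sample rows are hypothetical, and no OVSDB client library is involved.

```python
from typing import Mapping, Optional


def hosting_chassis(lrp_options: Mapping[str, str]) -> Optional[str]:
    """Return the chassis name northd reported for this LRP, if any.

    A missing or empty "hosting-chassis" key is treated as "not bound".
    """
    value = lrp_options.get("hosting-chassis", "")
    return value or None


if __name__ == "__main__":
    # Hypothetical option maps, standing in for rows read from the NB DB.
    bound_lrp = {"hosting-chassis": "chassis-2"}
    unbound_lrp = {}

    print(hosting_chassis(bound_lrp))
    print(hosting_chassis(unbound_lrp))
```

The CMS only reads this key; unlike "requested-chassis", it would never write it.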
I suggest a different approach if we want to go ahead and propagate such
information to the NB: can't we store the "active chassis" information
per Gateway_chassis/HA_Chassis_group instead? That's
O(number-of-chassis) records that we need to update on chassis failover.
We might even skip this for Gateway_chassis as I understand that this
is the "old" way of configuring things (*).
That makes sense to me as well. So in the HA_Chassis_Group we would
have a column with the name of the currently active chassis? That would
be good, because we can't really rely on the "priority" order: if
there is a failover to another chassis, the CMS is blind to it.
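A small sketch of why the priority order alone is not enough. It assumes, as a simplification, that the active chassis is the highest-priority member of the group that is still alive; the function names, chassis names and priorities are made up for illustration.

```python
from typing import Dict, Optional, Set


def guess_from_priority(priorities: Dict[str, int]) -> str:
    """What a CMS can infer from the configured priorities alone."""
    return max(priorities, key=priorities.get)


def actual_active(priorities: Dict[str, int], alive: Set[str]) -> Optional[str]:
    """Highest-priority chassis among those that are still alive."""
    candidates = {c: p for c, p in priorities.items() if c in alive}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    # Hypothetical HA chassis group: chassis name -> priority.
    group = {"gw-1": 30, "gw-2": 20, "gw-3": 10}

    # gw-1 has failed, so the port moved to the next chassis in line.
    alive = {"gw-2", "gw-3"}

    print(guess_from_priority(group))   # CMS guess from NB config only
    print(actual_active(group, alive))  # where the port really is
```

After the failover the two answers differ, and only an explicit "active chassis" value written by OVN lets the CMS see the second one.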
(*) Should we deprecate Gateway_chassis?
I think Neutron still uses it but, with my core OVN hat on, I think it
is already time. Right now in the Northbound database we have
HA_Chassis_Group and Gateway_Chassis doing the same thing. I believe
that in the Southbound everything becomes an HA_Chassis_Group. So it's
fair to get rid of the Gateway_Chassis way already.
For the problem mentioned in the bugzilla, it seems to me already a scale
challenge that something other than ovn-controller is connecting to OVN SB
from every node (if I understand correctly). Moving all these connections
from SB to NB may just make it much worse, because NB DB is usually more
heavily/frequently updated by the CMS. (For small scale, this may not
matter, even if the agent connects to both NB and SB.)
An alternative to address the scale issue without changing OVN could be
to use a dedicated SB relay to which all external (non-OVN) agents that
need access to SB information can connect. Would that help?
The problem with it is that, more often than not, we actually need to
connect to both databases (as stated above), and there's no backward
compatibility regarding the data structure in the Southbound database
because it is supposed to be internal OVN data. That's why having the
CMS connect only to the Northbound is a plus.
Cheers,
Lucas
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
--
LUIS TOMÁS BOLÍVAR
Principal Software Engineer
Red Hat
Madrid, Spain
[email protected]