On Tue, Nov 26, 2019 at 9:54 PM Numan Siddique <[email protected]> wrote:
>
>
>
> On Wed, Nov 27, 2019, 7:25 AM Han Zhou <[email protected]> wrote:
>>
>> On Sun, Nov 17, 2019 at 10:55 PM Numan Siddique <[email protected]> wrote:
>> >
>> > On Mon, Nov 18, 2019 at 8:22 AM Han Zhou <[email protected]> wrote:
>> > >
>> > > On Sat, Nov 16, 2019 at 4:03 AM Numan Siddique <[email protected]>
wrote:
>> > > >
>> > > >
>> > > >
>> > > > On Fri, Nov 15, 2019 at 1:18 PM Han Zhou <[email protected]> wrote:
>> > > >>
>> > > >>
>> > > >>
>> > > >> On Thu, Nov 14, 2019 at 10:25 PM Numan Siddique <[email protected]>
>> wrote:
>> > > >> >
>> > > >> > On Thu, Nov 14, 2019 at 11:44 PM Han Zhou <[email protected]> wrote:
>> > > >> > >
>> > > >> > > On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique <
[email protected]>
>> > > wrote:
>> > > >> > > >
>> > > >> > > > On Thu, Oct 31, 2019 at 2:43 AM Han Zhou <[email protected]>
wrote:
>> > > >> > > > >
>> > > >> > > > > The series supports interconnecting multiple OVN
deployments
>> > > (e.g.
>> > > >> > >  located at
>> > > >> > > > > multiple data centers) through logical routers connected
with
>> > > tansit
>> > > >> > > logical
>> > > >> > > > > switches with overlay tunnels, managed through OVN control
>> > > plane.  See
>> > > >> > > the
>> > > >> > > > > ovn-architecture.rst document updates for more details,
and
>> find
>> > > the
>> > > >> > > > > instructions in
>> Documentation/tutorials/ovn-interconnection.rst.
>> > > >> > > > >
>> > > >> > > > > Han Zhou (13):
>> > > >> > > > >   ovn-architecture: Add documentation for OVN
interconnection
>> > > feature.
>> > > >> > > > >   ovn-inb: Interconnection northbound DB schema and CLI.
>> > > >> > > > >   ovn-isb: Interconnection southbound DB schema and CLI.
>> > > >> > > > >   ovn-ic: Interconnection controller with AZ
registeration.
>> > > >> > > > >   ovn-northd.c: Refactor allocate_tnlid.
>> > > >> > > > >   ovn-ic: Transit switch controller.
>> > > >> > > > >   ovn-sb: Add columns is_interconn and is_remote to
Chassis.
>> > > >> > > > >   ovn-ic: Interconnection gateway controller.
>> > > >> > > > >   ovn-ic: Interconnection port controller.
>> > > >> > > > >   ovn.at: e2e test for OVN interconnection.
>> > > >> > > > >   ovn-ctl: Refactor to reduce redundant code.
>> > > >> > > > >   ovn-ctl: Support commands for interconnection.
>> > > >> > > > >   tutorial: Add tutorial for OVN Interconnection.
>> > > >> > > > >
>> > > >> > > >
>> > > >> > > > Hi Han,
>> > > >> > > >
>> > > >> > > > Thanks for working on this feature. It's an interesting use
>> case.
>> > > >> > > >
>> > > >> > > > I had a quick look at all the patches.
>> > > >> > > >
>> > > >> > >
>> > > >> > > Numan, thanks a lot for the thorough review! Please see my
>> response
>> > > inlined.
>> > > >> > >
>> > > >> > > > I have few comments
>> > > >> > > >
>> > > >> > > > 1. I would suggest to rename the DBs as  ovn-ic-nb.ovsschema
>> (and
>> > > the
>> > > >> > > > same for ovn-isb).
>> > > >> > > >     The DB name is - OVN_IC_Northbound. So it would make
sense
>> to
>> > > have
>> > > >> > > > - ovn-ic-nb.ovsschema
>> > > >> > > >      I would also suggest to rename the utilities to
>> ovn-ic-nbctl
>> > > and
>> > > >> > > > ovn-ic-sbctl.
>> > > >> > > >     With  ovn-inbctl/ovn-isbctl, it is really confusing.
>> > > >> > > >
>> > > >> > > Sure, I felt not quite convenient with two dashes in a command
>> name.
>> > > I
>> > > >> > > agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can
>> > > change it.
>> > > >> > >
>> > > >> > > > 2. ovn-ic service writes to interconnect south db, ovn north
>> db and
>> > > >> > > > ovn south db. Writing to ic south db and
>> > > >> > > >     ovn north db is fine. But it seems a little odd for
ovn-ic
>> to
>> > > >> > > > write to south db. From what I understand it writes
>> > > >> > > >     to south db for 3 purposes
>> > > >> > > >       a. Updating the tunnel_key column of datapath_binding
>> > > >> > > > representing the transit switch
>> > > >> > > >       b. Updating the key column of port_binding
representing
>> the
>> > > >> > > > logical router port connecting to the transit switch.
>> > > >> > > >       c. Creating chassis rows for remote gateway chassis.
>> > > >> > > >
>> > > >> > > >    I think it's better if ovn-ic can delegate all these to
>> > > ovn-northd.
>> > > >> > > > For (a) and (b), ovn-ic can set the generated tunnel key
>> > > >> > > >    in the other_config/options column of Logical
switch/Logical
>> > > switch
>> > > >> > > > port. ovn-northd can internally set this value to
>> > > >> > > >    the south db.
>> > > >> > > >
>> > > >> > > >    For (c), I think its better we add another table -
>> > > Remote_Chassis
>> > > >> > > > (or some other appropriate name) . ovn-ic will create rows
>> > > >> > > >    in this table for each remote chassis and ovn-northd will
>> create
>> > > >> > > > chassis row in south db.
>> > > >> > > >    We already have Gateway_Chassis table in North db. So I
>> think
>> > > it's
>> > > >> > > > fine if we add Remote_Chassis table. The only odd thing
>> > > >> > > >    would be is to store the encap details in North db.
>> > > >> > > >
>> > > >> > > >    With this, ovn-ic, doesn't need to worry about syncing
>> between
>> > > >> > > > interconnect south db and ovn south db. In my opinion ovn-ic
>> > > >> > > >    should act more like CMS to each availability zone and
hence
>> > > should
>> > > >> > > > not do any write transactions to the south db.
>> > > >> > > >
>> > > >> > > >     Any concerns with this proposed approach ?
>> > > >> > > >
>> > > >> > > We could avoid ovn-ic writing directly to SB with some extra
>> logic in
>> > > >> > > northd, but I don't see any problem for ovn-ic to update SB
>> > > directly. First
>> > > >> > > of all, we have hypervisors creating and updating SB directly
for
>> > > Chassis
>> > > >> > > and Encap records. The design here is that ovn-ic updates
Chassis
>> > > and Encap
>> > > >> > > on behalf of remote gateway nodes, which I think is
>> straightforward
>> > > and
>> > > >> > > reasonable. Similarly, port-binding's chassis column is
updated
>> the
>> > > same
>> > > >> > > way as how hypervisors are updating it. Secondly, for tunnel
keys
>> > > updating,
>> > > >> > > it may seem graceful to update from northd, since northd is
the
>> only
>> > > client
>> > > >> > > that updates tunnel keys today, but since ovn-ic is
responsible
>> for
>> > > >> > > calculating these keys, and it already has connection to SB, I
>> think
>> > > it is
>> > > >> > > just more natural and efficient to update it directly, to
avoid
>> the
>> > > extra
>> > > >> > > logic and unnecessary latency from northd with another round
of
>> > > >> > > computation. The scope of the ovn-ic is only for the
>> interconnection
>> > > >> > > objects, so I don't see any conflict or ownership ambiguity
here.
>> > > What do
>> > > >> > > you think?
>> > > >> >
>> > > >> > Regarding the ovn-ic writing to SB DB, I would like to get
other's
>> > > >> > opinion as well.
>> > > >> > I agree to your points, but at the same time concerned with few
>> things
>> > > like
>> > > >> >    - What about RBAC for ovn-ic ?
>> > > >> >    - Right now we have RBAC for ovn-controller in writing to the
>> SB DB.
>> > > >> >    - Do we want something similar for ovn-ic if ovn-ic writes to
>> SB DB.
>> > > >> >    - Does CMS need to do something similar for ovn-ic, like how
it
>> is
>> > > documented
>> > > >> >      here -
>> > >
https://mail.openvswitch.org/pipermail/ovs-dev/2019-November/364481.html
>> > > >> >      (related discussion here -
>> > > >> >
>> > >
>>
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-November/049501.html
>> > > )
>> > > >> >      if no RBAC for ovn-ic ?
>> > > >> >
>> > > >> For RBAC, I think ovn-ic and northd are at the same position,
because
>> > > they both are AZ level daemons, just focusing on different tasks. In
>> theory
>> > > they can be put in same process, but I separated them for clarity. So
>> from
>> > > RBAC perspective they should be just the same.
>> > > >> Today there is no role for northd, which I think is a flaw, as
>> discussed
>> > > in the thread you pointed. It is not a big problem though, because a
>> > > workaround is simply creating a separate connection and use iptable
>> rules
>> > > to restrict access from central nodes only. Same practice should be
used
>> > > for ovn-ic.
>> > > >>
>> > > >> > @Mark Michelson  @Dumitru Ceara  You have any thoughts/concerns
on
>> > > >> > ovn-ic writing to SB DB ?
>> > > >> >
>> > > >> > Regarding the tunnel key there are 2 things here
>> > > >> >   (1) tunnel key for transit datapath
>> > > >> >   (2) tunnel key for logical port connected to the transit
switch
>> > > >> >
>> > > >> > For (1), you can completely avoid "Sync ISB TS tunnel key to AZ
SB
>> > > >> > datapath" by storing the tunnel id
>> > > >> > in logical switch in NB DB like -
>> > > >> > nbrec_logical_switch_update_other_config_setkey(ls,
>> > > >> > "interconn-ts-tunnelkey, tunnelkey)
>> > > >> >
>> > > >> > ovn-northd when creating the datapath_binding row can set this
>> value
>> > > >> > directly. With this approach I think the function
>> > > >> > ts_run() can be simplified a bit as it doesn't need to call -
>> > > >> > SBREC_DATAPATH_BINDING_FOR_EACH()
>> > > >> >
>> > > >>
>> > > >> I agree this approach should work. The code might be a little
simpler
>> > > but the latency will be added because of the extra computer iteration
>> from
>> > > northd to SB DB, although it might not be a big concern. So I don't
have
>> > > strong preference for either approach.
>> > > >>
>> > > >> > Right now CMS is expected to create lsp-lrp ports on the transit
>> > > >> > switch - logical router on each AZ.
>> > > >> > Instead of this, can't CMS/user create these links in IC-NB
table ?
>> > > >> > ''
>> > > >> > something like
>> > > >> >  ovn-ic-nbctl ts-add ts0
>> > > >> >  ovn-ic-nbctl tsp-add ts0 az0 az0-lr0 MAC1 IP1
>> > > >> >  ovn-ic-nbctl tsp-add ts0 az1 az1-lr0 MAC2 IP2
>> > > >> >
>> > > >> > (where az0-lr0 is  logical router present in az0's NB DB and
>> az1-lr0
>> > > >> > is logical router
>> > > >> >  present in az1's NB DB).
>> > > >> >
>> > > >> > ovn-ic will internally create lsp-lrp ports in its AZ's NB DB.
It
>> can
>> > > >> > also create 'Port_Binding'
>> > > >> > row in IC-SB DB and doesn't need to worry about syncing between
>> the SB
>> > > >> > DB and IC-SB DB.
>> > > >> >
>> > > >> > This would probably solve  (2) and ovn-ic can set the tunnel key
>> when
>> > > >> > creating the lsp-lrp and ovn-northd
>> > > >> > will make use of this tunnel key when creating the port_binding
>> rows
>> > > in SB DB.
>> > > >> >
>> > > >> > With this user/CMS doesn't have to worry about creating the
>> lsp-lrps
>> > > >> > in each AZ and I think
>> > > >> > this seems more logical to me.
>> > > >> >
>> > > >> > Running "ovn-ic-nbctl show" will give a clear output to the
user.
>> > > >> >
>> > > >>
>> > > >> This is a different operation model than what I had in mind. My
>> initial
>> > > idea was that all configuration should be done at each AZ itself, and
>> the
>> > > ovn-ic from each AZ will sync between each other automatically, so
>> there is
>> > > no need for configuring the IC DBs by CMS/User. The reason why I end
up
>> > > with a IC-NB DB is the use case of multi-tenancy, when an operator
>> needs to
>> > > isolate different tenants on the backbone. They can create different
TS
>> for
>> > > this purpose. Each TS can represent a tenant or segmentation-zone.
This
>> > > information should be specified by user and it is global, so it is
>> stored
>> > > in IC-NB DB. Other than this, there is no need for global level
>> > > configurations. All the information that can be populated by ovn-ic
is
>> > > stored in IC-SB.
>> > > >>
>> > > >> With your proposal, LRP's from each AZ is directly configured by
>> user in
>> > > global IC-NB. It is a valid approach, too.
>> > > >>
>> > > >> However, I would prefer configuring them at AZ level for below
>> > > considerations:
>> > > >> - Try to avoid introducing new configurations. Other than the TS
>> switch,
>> > > there is nothing new. All the operations of LRP creation stays the
same
>> way
>> > > as it is today, which I believe simplifies operation.
>> > > >> - The ownership is more clear. AZ information of each LRP is
>> populated
>> > > automatically by ovn-ic from each AZ, and each AZ will only update
its
>> own
>> > > LRPs to global DB. Each AZ has full control of its own intention of
>> joining
>> > > the global interconnection, deciding which router ports and gateways
>> should
>> > > get involved.
>> > > >> - It might seem attractive to have a global view by configuring
all
>> > > these LRPs at global level, but in fact we will still need to
configure
>> > > static routes for each other at each AZ level. So the overall
>> configuration
>> > > will be half-global and half-local, which might in fact seem less
>> natural.
>> > > >
>> > > >
>> > > > I thought about the static routes.  With my proposal, ovn-ic can
>> > > configure the static routes too because it knows the transit switch
>> > > configuration would have all the details.
>> > > >
>> > > I wanted to make the configuration at IC-NB as simple as possible.
For
>> the
>> > > static routes, currently they can be configured at each AZ, but I am
>> also
>> > > planning a further improvement to avoid the configuration at all, by
>> > > learning the routes of edge routers from each other automatically
>> through
>> > > IC-SB, but I will add it after the basic feature is consolidated.
>> > >
>> > > >
>> > > >> - The command ovn-ic-nbctl show may seem not showing enough
>> information,
>> > > but in fact ovn-ic-sbctl show tells all details with the hierarchy of
>> AZ ->
>> > > GW -> Ports (for each port it also shows its TS and subnet/IP). I
think
>> it
>> > > provides a global view of both logical and physical.
>> > > >>
>> > > >> If the change is for avoid updating tunnel keys to SB directly, we
>> can
>> > > do similar approach as you suggested for TS.
>> > > >>
>> > > >> > If we take this approach, ovn-ic doesn't need to write to the
SB DB
>> > > >> > except for creating 'Chassis' table in
>> > > >> > SB DB. We could avoid this also if we add Remote_Chassis in NB
DB.
>> But
>> > > >> > we can discuss further on this.
>> > > >> >
>> > > >> > Let me know what you think.
>> > > >> >
>> > > >> For writing to SB, I am ok to avoid writing tunnel keys (for TS
and
>> for
>> > > port-binding) directly to SB but through northd, although this would
>> > > introduce some latency which I think is not a big issue.
>> > > >
>> > > >
>> > > > Ok.
>> > > >>
>> > > >> But for chassis/encap/port-binding's chassis, I would still
prefer to
>> > > update to SB directly, consistent with what we are doing from
>> > > ovn-controller. The only difference from ovn-controller is that RBAC
is
>> > > necessary from ovn-controller but not from ovn-ic, since ovn-ic is AZ
>> level
>> > > component that only runs on central nodes.
>> > > >> If in the future we want to avoid SB update from ovn-controller,
too,
>> > > then we can make the same change in a consistent way then for both
>> > > ovn-controller and ovn-ic.
>> > > >> >
>> > > >> > >
>> > > >> > > > 3. In patch 7,its better to rename the ovs configuration
>> option -
>> > > >> > > > "external_ids:is-interconn"  to
>> "external_ids:ovn-is-interconn".
>> > > >> > > >    You also need to document it in ovn-controller-8.xml.
>> > > >> > > >
>> > > >> > > Agree!
>> > > >> > >
>> > > >> > > >    Or maybe we can remove this option -
>> external_ids:is-interconn.
>> > > We
>> > > >> > > > probably don't need this if we do like below
>> > > >> > > >
>> > > >> > > >    2 (c) can also be done this way
>> > > >> > > >          - User creates transit switch.
>> > > >> > > >          - ovn-ic creates transit switch in north db.
>> > > >> > > >          - then the user adds the logical router port -
logical
>> > > switch
>> > > >> > > > port to the transit switch in availability zone - az1 (for
>> example)
>> > > >> > > >          - then the user creates gw chassis - for example
>> > > >> > > >               ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1
>> > > <gateway
>> > > >> > > > name> [priority]
>> > > >> > > >          - ovn-ic on az1 ,learns this and creates a
>> Gateway_Chassis
>> > > >> > > > row in the interconnect south db.
>> > > >> > > >          - ovn-ic on other az's then create Remote_Chassis
>> rows in
>> > > the
>> > > >> > > North db.
>> > > >> > > >         - ovn-northd on other az's then create chassis row
in
>> > > south db
>> > > >> > > > with "is_remote" set to true.
>> > > >> > > >
>> > > >> > > >     I am not sure if this complicates further and hence its
>> better
>> > > >> > > > that ovn-ic writes to the south dbs. But we can discuss
>> further on
>> > > >> > > > this.
>> > > >> > > >
>> > > >> > >
>> > > >> > > I think this is a good idea and I am incline to it, because it
>> > > avoids one
>> > > >> > > configuration on the gateway node, which is good.
>> > > >> > > The only concern is that it makes the remote gateway sync more
>> > > dynamic - it
>> > > >> > > changes when LRP's gateway-chassis/ha-chassis settings change,
>> which
>> > > may be
>> > > >> > > less efficient. The code of ovn-ic will be a little more
>> complex. I
>> > > think
>> > > >> > > it should be ok, but I'd like to hear from you, too.
>> > > >> >
>> > > >> > I would prefer to go with this approach. Another advantage is
that,
>> > > >> > ovn-controller don't
>> > > >> > have to establish tunnels with the remote chassis until it
really
>> has
>> > > >> > to. I think with the present
>> > > >> > approach it establishes tunnels even if a transit switch doesn't
>> have
>> > > >> > any router ports ?
>> > > >> > Correct me if I am wrong here ?
>> > > >> >
>> > > >>
>> > > >> Yes you are right. I think it is mainly about static v.s. dynamic.
>> The
>> > > benefit of static is that it is more predicable and efficient
(maybe),
>> > > while the dynamic approach avoids the extra configuration of
>> > > "is-interconnection". I don't worry about the tunnel setup yet, since
>> it is
>> > > merely a piece of data in the ovsdb table on the chassis, and there
>> won't
>> > > be too many interconnection gateways. But I think I am inclined to
>> change
>> > > as you suggested, to avoid the extra configuration.
>> > > >
>> > > >
>> > > > Ok. So to summarize from our discussion, v3 of this series would
>> > > >   -  avoid tunnel key updates to the SB DB in ovn-ic
>> > > >    - but it will create remote chassis in SB DB right ?
>> > > >
>> > >
>> > > Yes. In addition, I will also remove the "is-interconn" configuration
>> for
>> > > Chassis as you suggested, and determine the Chassis is for
>> interconnection
>> > > if there is a LRP@TS assigned to it, by either gateway_chassis or by
>> > > ha_chassis.
>> > > Does this all sound good for v3?
>> >
>> > Yes. Its sounds good to me.
>> >
>> > Numan
>> >
>> Hi Numan,
>>
>> I thought more about the "is-interconn" configuration for chassis. I want
>> to keep it for now, for the below reasons:
>> 1. It will be more complex to rely on checking all the LRPs on TS
switches
>> and then looking for related settings in gateway-chassis and ha-chassis.
>> 2. LRP change in one AZ can cause all ovn-controllers in all the other
AZs
>> to recompute, when a gateway chassis's last port is removed or the first
>> port is added, because this triggers the chassis add/delete in remote
AZs,
>> and chassis change is not handled incrementally (yet). If there is only
LRP
>> change itself doesn't cause recompute in other AZs. The recompute can
also
>> cause latency for LRP changes to take effect.
>>
>> It seems natural to me to configure it since usually admin would assign
>> roles to nodes beforehand and it should be clear that which nodes are
going
>> to be used as gateway nodes. We can still try to implement the dynamic
>> discovery logic as an enhancement, if needed. If we implement it, I think
>> there is no compatibility problem to worry about in this case because the
>> config would just become useless and harmless. So, do you think it is ok
to
>> keep this configuration at least for the first version?
>
>
>
> Thanks for the detailed explanation. Sounds reasonable to me.
>
> Thanks
>
Hi Numan,

I addressed the comments and sent out v3:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=155577

PTAL.

Thanks,
Han
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to