On Wed, Aug 5, 2020 at 3:05 PM Han Zhou <[email protected]> wrote:

>
>
> On Wed, Aug 5, 2020 at 12:51 PM Winson Wang <[email protected]>
> wrote:
>
>> Hello OVN Experts:
>>
>> In a large-scale ovn-k8s cluster, several conditions can move the
>> ovn-controller clients' connections to the SB central nodes from a
>> balanced state to an unbalanced state.
>> Is there an ongoing project to address this problem?
>> If not, I have a proposal, though I am not sure whether it is doable.
>> Please share your thoughts.
>>
>> The issue:
>>
>> With a 3-node OVN SB RAFT cluster, the ovn-controller clients are at
>> first connected to all 3 nodes in a balanced state.
>>
>> The following conditions will make the connections become unbalanced.
>>
>>    - When one RAFT node restarts, all of its ovn-controller clients
>>      reconnect to the two remaining cluster nodes.
>>
>>    - In ovn-k8s, after a rolling upgrade of the SB RAFT pods, the last
>>      RAFT pod has no client connections.
>>
>>
>> RAFT clients in an unbalanced state put more stress on the RAFT cluster,
>> which makes it less stable under load than in a balanced state.
>>
>> The proposed solution:
>>
>> Add a new ovn-controller unix command, “reconnect”, that takes the IP of
>> the preferred SB node as its argument.
>>
>> When the connections become unbalanced, this command can make
>> ovn-controller reconnect to a different SB RAFT node using fast re-sync,
>> which does not trigger a download of the whole DB.
>>
>>
> Thanks Winson. The proposal sounds good to me. Will you implement it?
>

Han/Winson,

The fast re-sync applies when ovsdb-server restarts; it does not apply when
ovn-controller restarts, right?

If the ovsdb client (ovn-controller) restarts, it loses all its state, and
when it starts again it still needs to download logical_flows,
port_bindings, and the other tables it cares about. So fast re-sync may not
apply to this case.
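
To make the distinction concrete, here is a minimal sketch of the request a
client would send when it (re)connects. As far as I understand, fast re-sync
relies on the OVSDB `monitor_cond_since` method, where the client passes the
last transaction ID it has seen, and that ID lives only in the client's
memory. The function name, the monitor identifier, the table subset, and the
sample transaction ID below are illustrative assumptions, not ovn-controller
code.

```python
import json
from typing import Optional

# All-zero UUID, used in this sketch to mean "no previous transaction known".
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def build_monitor_request(last_txn_id: Optional[str], request_id: int = 1) -> dict:
    """Build a monitor_cond_since JSON-RPC request (hypothetical helper)."""
    # Illustrative subset of SB tables; a real client monitors more tables
    # and uses narrower conditions than "match everything".
    monitor_requests = {
        "Logical_Flow": [{"where": [True]}],
        "Port_Binding": [{"where": [True]}],
    }
    return {
        "method": "monitor_cond_since",
        "params": [
            "OVN_Southbound",              # database name
            "sketch-monitor",              # arbitrary monitor identifier
            monitor_requests,
            last_txn_id or ZERO_UUID,      # last transaction the client saw
        ],
        "id": request_id,
    }


# Reconnect within a running process: the last transaction ID is still in
# memory, so the server may be able to send only the changes since then.
reconnect = build_monitor_request("5a3c6e0e-1f2b-4c3d-9e8f-0123456789ab")

# Fresh start after a client restart: no ID is known, so the client has to
# ask for (and download) the full contents it monitors.
fresh_start = build_monitor_request(None)

print(json.dumps(fresh_start, indent=2))
```

If this reading is right, a "reconnect" command handled inside the running
ovn-controller could still benefit from fast re-sync, while a process restart
could not unless the last transaction ID were persisted somewhere.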

Also, ovn-controller should stash the IP address of the SB server it is
connected to in the external_ids column of the Open_vSwitch table. It would
update this field whenever it reconnects to a different SB server (because
that ovsdb-server instance failed or restarted). When ovn-controller itself
restarts, it could check the value in this field and try to connect to that
server first, falling back to the default connection approach on failure.
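
The "try the stashed server first, then fall back" idea can be sketched as a
simple reordering of the configured remotes. This is only an illustration:
it assumes the remotes come as a comma-separated string in the style of
external_ids:ovn-remote, and that the last-connected server was saved under
some external_ids key (no such key exists today; the `last_connected`
argument stands in for it). `order_remotes` is a hypothetical helper name.

```python
from typing import List, Optional


def order_remotes(ovn_remote: str, last_connected: Optional[str]) -> List[str]:
    """Return the configured remotes with the stashed one moved to the front.

    If nothing was stashed, or the stashed remote is no longer configured,
    the original order is kept, which is the default connection approach.
    """
    remotes = [r.strip() for r in ovn_remote.split(",") if r.strip()]
    if last_connected in remotes:
        remotes.remove(last_connected)
        remotes.insert(0, last_connected)
    return remotes


configured = "tcp:10.0.0.1:6642,tcp:10.0.0.2:6642,tcp:10.0.0.3:6642"

# Stashed value present: it is tried first, the rest remain as fallbacks.
preferred_first = order_remotes(configured, "tcp:10.0.0.3:6642")

# Nothing stashed (first boot) or a stale value: order is unchanged.
default_order = order_remotes(configured, None)
stale_value = order_remotes(configured, "tcp:10.0.0.9:6642")

print(preferred_first)
```

Keeping the other remotes in the list after the preferred one is what gives
the fallback behaviour: if the stashed server is down, the client simply
moves on to the next entry.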

Regards,
~Girish




>
> Han
>
>
>
>>
>> --
>> Winson
>>
> _______________________________________________
> discuss mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>