Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-10-18 Thread Vladislav Odintsov via discuss


> On 18 Oct 2023, at 18:59, Ilya Maximets  wrote:
> 
> On 10/18/23 17:14, Vladislav Odintsov wrote:
>> Hi Ilya, Terry,
>> 
>>> On 7 Mar 2023, at 14:03, Ilya Maximets  wrote:
>>> 
>>> On 3/7/23 00:15, Vladislav Odintsov wrote:
 Hi Ilya,
 
 I’m wondering whether there are any configuration parameters for the 
 ovsdb relay -> main ovsdb server inactivity probe timer.
 My cluster is experiencing issues where the relay disconnects from the main 
 cluster due to the 5 sec. inactivity probe timeout.
 The main cluster has quite a big database and a bunch of daemons that connect 
 to it, which makes it difficult to maintain connections in time.
 
 For the ovsdb relay's remote I use in-db configuration (to provide the 
 inactivity probe and rbac configuration for ovn-controllers).
 For the ovsdb-server which serves the SB, I just set --remote=pssl:.
 
 I’d like to configure the remote for the ovsdb cluster via the DB to set the 
 inactivity probe, but I’m not sure about the correct way to do that.
 
 For now I see only two options:
 1. Setup custom database scheme with connection table, serve it in same SB 
 cluster and specify this connection when start ovsdb sb server.
>>> 
>>> There is an ovsdb/local-config.ovsschema shipped with OVS that can be
>>> used for that purpose.  But you'll need to craft transactions for it
>>> manually with ovsdb-client.
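For illustration, such a remote could be created manually with a transaction
along these lines.  This is only a sketch: it assumes the Connection table in
local-config.ovsschema has "target" and "inactivity_probe" columns, that the
root Config table references it via a "connections" column, that a Config row
already exists, and that the server listens on the usual OVN SB socket; check
the shipped schema before relying on it:

  ovsdb-client transact unix:/var/run/ovn/ovnsb_db.sock '["Local_Config",
    {"op": "insert", "table": "Connection", "uuid-name": "conn",
     "row": {"target": "pssl:6642", "inactivity_probe": 60000}},
    {"op": "mutate", "table": "Config", "where": [],
     "mutations": [["connections", "insert", ["set", [["named-uuid", "conn"]]]]]}]'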
>>> 
>>> There is a control tool prepared by Terry:
>>>  
>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>> 
>>> But it's not in the repo yet (I need to get back to reviews on that
>>> topic at some point).  The tool itself should be fine, but maybe name
>>> will change.
>> 
>> I want to step back to this thread.
>> The mentioned patch is archived in the "Changes Requested" state, but there 
>> are no review comments on this patch.
>> If there is no ongoing work on it, I can take it over to finalise.
>> For now it needs a small rebase, so I can do that and resend, but first I 
>> want to hear your thoughts on this.
>> 
>> Internally we have been using this patch to work with the Local_Config DB 
>> for almost 6 months and it works fine.
>> On each OVS update we have to re-apply it and sometimes resolve conflicts, 
>> so it would be nice to have this patch upstream.
> 
> Hi, I'm currently in the middle of re-working the ovsdb-server configuration
> for a different approach that will replace command-line and appctl configs
> with a config file (cmdline and appctls will be preserved for backward
> compatibility, but there will be a new way of setting things up).  This should
> be much more flexible and user-friendly than working with a local-config
> database.  That should also address most of the concerns raised by Terry
> regarding usability of local-config (having way too many ways of configuring
> the same thing mainly, and requirement to use special tools to modify the
> configuration).  I'm planning to post the first version of the change
> relatively soon.  I can Cc you on the patches.

Okay, got it.
It would be nice if you could Cc me so that I don't miss the patches, thanks!

> 
> Best regards, Ilya Maximets.
> 
>> 
>>> 
 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
 and deploy cluster separately from ovsdb relay, because they both start 
 same connections and conflict on ports. (I don’t use docker here, so I 
 need a separate server for that).
>>> 
>>> That's an easy option available right now, true.  If they are deployed
>>> on different nodes, you may even use the same connection record.
>>> 
 
 Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
 inactivity probe (say, to 60k), I guess it’s still not enough to have 
 ovsdb pings every 60 seconds. Inactivity probe must be the same from both 
 ends - right? From the ovsdb relay process.
>>> 
>>> Inactivity probes don't need to be the same.  They are separate for each
>>> side of a connection and so configured separately.
>>> 
>>> You can set up inactivity probe for the server side of the connection via
>>> database.  So, server will probe the relay every 60 seconds, but today
>>> it's not possible to set inactivity probe for the relay-to-server direction.
>>> So, relay will probe the server every 5 seconds.
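As a concrete illustration of the server-side knob: with an in-database remote,
the probe the server sends toward the relay is just a column on the connection
record.  A minimal sketch, assuming the SB server was started with
--remote=db:OVN_Southbound,SB_Global,connections and has a single connection
row:

  ovn-sbctl set-connection pssl:6642
  ovn-sbctl set connection . inactivity_probe=60000

The relay-to-server direction, as noted above, has no such knob today.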
>>> 
>>> The way out from this situation is to allow configuration of relays via
>>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>>> require addition of a new table to the Local_Config database and allowing
>>> relay config to be parsed from the database in the code.  That wasn't
>>> implemented yet.
>>> 
 I saw your talk on last ovscon about this topic, and the solution was in 
 progress there. But maybe there were some changes from that time? I’m 
 ready to test it if any. Or, maybe there’s any workaround?
>>> 
>>> Sorry, we didn't move forward much on that topic since the presentation.
>>> There are few unanswered questions 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-10-18 Thread Ilya Maximets via discuss
On 10/18/23 17:14, Vladislav Odintsov wrote:
> Hi Ilya, Terry,
> 
>> On 7 Mar 2023, at 14:03, Ilya Maximets  wrote:
>>
>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>> Hi Ilya,
>>>
>>> I’m wondering whether there are possible configuration parameters for ovsdb 
>>> relay -> main ovsdb server inactivity probe timer.
>>> My cluster experiencing issues where relay disconnects from main cluster 
>>> due to 5 sec. inactivity probe timeout.
>>> Main cluster has quite big database and a bunch of daemons, which connects 
>>> to it and it makes difficult to maintain connections in time.
>>>
>>> For ovsdb relay as a remote I use in-db configuration (to provide 
>>> inactivity probe and rbac configuration for ovn-controllers).
>>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>>>
>>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>>> probe setting, but I’m not sure about the correct way for that.
>>>
>>> For now I see only two options:
>>> 1. Setup custom database scheme with connection table, serve it in same SB 
>>> cluster and specify this connection when start ovsdb sb server.
>>
>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>> used for that purpose.  But you'll need to craft transactions for it
>> manually with ovsdb-client.
>>
>> There is a control tool prepared by Terry:
>>  
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>
>> But it's not in the repo yet (I need to get back to reviews on that
>> topic at some point).  The tool itself should be fine, but maybe name
>> will change.
> 
> I want to step back to this thread.
> The mentioned patch is archived with "Changes Requested" state, but there is 
> no review comments in this patch.
> If there is no ongoing work with it, I can take it over to finalise.
> For now it needs a small rebase, so I can do it and resend, but before want 
> to hear your thoughts on this.
> 
> Internally we use this patch to work with Local_Config DB for almost 6 months 
> and it works fine.
> On each OVS update we have to re-apply it and sometimes solve conflicts, so 
> would be nice to have this patch in upstream.

Hi, I'm currently in the middle of re-working the ovsdb-server configuration
for a different approach that will replace command-line and appctl configs
with a config file (cmdline and appctls will be preserved for backward
compatibility, but there will be a new way of setting things up).  This should
be much more flexible and user-friendly than working with a local-config
database.  That should also address most of the concerns raised by Terry
regarding usability of local-config (having way too many ways of configuring
the same thing mainly, and requirement to use special tools to modify the
configuration).  I'm planning to post the first version of the change
relatively soon.  I can Cc you on the patches.

Best regards, Ilya Maximets.

> 
>>
>>> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
>>> and deploy cluster separately from ovsdb relay, because they both start 
>>> same connections and conflict on ports. (I don’t use docker here, so I need 
>>> a separate server for that).
>>
>> That's an easy option available right now, true.  If they are deployed
>> on different nodes, you may even use the same connection record.
>>
>>>
>>> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>>> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
>>> pings every 60 seconds. Inactivity probe must be the same from both ends - 
>>> right? From the ovsdb relay process.
>>
>> Inactivity probes don't need to be the same.  They are separate for each
>> side of a connection and so configured separately.
>>
>> You can set up inactivity probe for the server side of the connection via
>> database.  So, server will probe the relay every 60 seconds, but today
>> it's not possible to set inactivity probe for the relay-to-server direction.
>> So, relay will probe the server every 5 seconds.
>>
>> The way out from this situation is to allow configuration of relays via
>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>> require addition of a new table to the Local_Config database and allowing
>> relay config to be parsed from the database in the code.  That wasn't
>> implemented yet.
>>
>>> I saw your talk on last ovscon about this topic, and the solution was in 
>>> progress there. But maybe there were some changes from that time? I’m ready 
>>> to test it if any. Or, maybe there’s any workaround?
>>
>> Sorry, we didn't move forward much on that topic since the presentation.
>> There are a few unanswered questions around the local config database.  Mainly
>> regarding upgrades from a cmdline/main-db-based configuration to a local
>> config-based one.  But I hope we can figure that out in the current release
>> time frame, i.e. before the 3.2 release.
>>
>> There is also this workaround:
>> 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-10-18 Thread Vladislav Odintsov via discuss
Hi Ilya, Terry,

> On 7 Mar 2023, at 14:03, Ilya Maximets  wrote:
> 
> On 3/7/23 00:15, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> I’m wondering whether there are possible configuration parameters for ovsdb 
>> relay -> main ovsdb server inactivity probe timer.
>> My cluster experiencing issues where relay disconnects from main cluster due 
>> to 5 sec. inactivity probe timeout.
>> Main cluster has quite big database and a bunch of daemons, which connects 
>> to it and it makes difficult to maintain connections in time.
>> 
>> For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
>> probe and rbac configuration for ovn-controllers).
>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>> 
>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>> probe setting, but I’m not sure about the correct way for that.
>> 
>> For now I see only two options:
>> 1. Setup custom database scheme with connection table, serve it in same SB 
>> cluster and specify this connection when start ovsdb sb server.
> 
> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> used for that purpose.  But you'll need to craft transactions for it
> manually with ovsdb-client.
> 
> There is a control tool prepared by Terry:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> 
> But it's not in the repo yet (I need to get back to reviews on that
> topic at some point).  The tool itself should be fine, but maybe name
> will change.

I want to step back to this thread.
The mentioned patch is archived in the "Changes Requested" state, but there are 
no review comments on this patch.
If there is no ongoing work on it, I can take it over to finalise.
For now it needs a small rebase, so I can do that and resend, but first I want 
to hear your thoughts on this.

Internally we have been using this patch to work with the Local_Config DB for 
almost 6 months and it works fine.
On each OVS update we have to re-apply it and sometimes resolve conflicts, so it 
would be nice to have this patch upstream.

> 
>> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
>> and deploy cluster separately from ovsdb relay, because they both start same 
>> connections and conflict on ports. (I don’t use docker here, so I need a 
>> separate server for that).
> 
> That's an easy option available right now, true.  If they are deployed
> on different nodes, you may even use the same connection record.
> 
>> 
>> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
>> pings every 60 seconds. Inactivity probe must be the same from both ends - 
>> right? From the ovsdb relay process.
> 
> Inactivity probes don't need to be the same.  They are separate for each
> side of a connection and so configured separately.
> 
> You can set up inactivity probe for the server side of the connection via
> database.  So, server will probe the relay every 60 seconds, but today
> it's not possible to set inactivity probe for the relay-to-server direction.
> So, relay will probe the server every 5 seconds.
> 
> The way out from this situation is to allow configuration of relays via
> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> require addition of a new table to the Local_Config database and allowing
> relay config to be parsed from the database in the code.  That wasn't
> implemented yet.
> 
>> I saw your talk on last ovscon about this topic, and the solution was in 
>> progress there. But maybe there were some changes from that time? I’m ready 
>> to test it if any. Or, maybe there’s any workaround?
> 
> Sorry, we didn't move forward much on that topic since the presentation.
> There are few unanswered questions around local config database.  Mainly
> regarding upgrades from cmdline/main db -based configuration to a local
> config -based.  But I hope we can figure that out in the current release
> time frame, i.e. before 3.2 release.
> 
> There is also this workaround:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
> It simply takes the server->relay inactivity probe value and applies it
> to the relay->server connection.  But it's not a correct solution, because
> it relies on certain database names.
> 
> Out of curiosity, what kind of poll intervals you see on your main server
> setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
> solve some of these issues?  3.1 should be noticeably faster than 2.17,
> and also parallel compaction introduced in 3.0 removes one of the big
> reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
> should also help with database sizes.
> 
> Best regards, Ilya Maximets.


Regards,
Vladislav Odintsov

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-31 Thread Ilya Maximets via discuss
On 3/31/23 01:14, Vladislav Odintsov wrote:
> Thanks Ilya for such a detailed description about inactivity probes and 
> keepalives.
> 
> regards,
> Vladislav Odintsov
> 
> 
> 
> regards,
> Vladislav Odintsov
>> On 31 Mar 2023, at 00:37, Ilya Maximets via discuss 
>>  wrote:
>>
>> On 3/30/23 22:51, Vladislav Odintsov via discuss wrote:
>>> Hi Ilya,
>>> following your recomendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
>>> It’s a bit outdated, but with some changes related to "last_command" logic 
>>> in ovs*ctl it successfully built.
>>> Also, I grabbed your idea gust to hardcode inactivity interval for ovsdb 
>>> relay because it just solves my issue.
>>> So, after testing it seems to work fine. I’ve managed to run ovn-sb-db 
>>> cluster with custom connections from local db and ovsdb relay with 
>>> connections from sb db.
>>
>> Good to know.
>>
>>> I’ve got a question here:
>>> Do we actually need probing from relay to sb cluster if we have configured 
>>> probing from the other side in other direction (db cluster to relay)? Maybe 
>>> we even can just set to 0 inactivity probes in ovsdb/relay.c?
>>
>> If connection between relay and the main cluster is lost, relay may
>> not notice this and just think that there are no new updates.  All the
>> clients connected to that relay will have stale data as a result.
>> Inactivity probe interval is essentially a value for how long you think
>> you can afford that condition to last.
> 
> Do I understand you correctly that by “connection is lost” you mean an 
> accidental termination of tcp session? Like iptables drop or cluster member 
> got killed by sigkill?

Right.  Here also sudden power loss, someone tripping over a cable,
machine force-reset and other causes like this.

> In my understanding if cluster member will just be gracefully stopped, it’ll 
> gracefully shutdown the connection and relay will reconnect to another 
> cluster member?

That's correct.  Graceful shutdown will trigger a correct termination
of TCP session, so the other end will know and re-connect.

> 
> Just out of curiosity, consider the case of accidental termination, where some 
> “outdated” ovn-controller is connected to a relay which in turn thinks it is 
> connected to the cluster, but it is not. If in such a condition the 
> ovn-controller tries to claim a vif, will the relay detect the connection 
> failure and reconnect to another “upstream”?

It depends, but it may not detect an issue.  The relay basically
forwards a transaction.  So, it will send the transaction
received from ovn-controller to the socket "connected" to the
main cluster.  And it will wait for the reply.  And the reply will
never arrive.  If the transaction doesn't have a timeout specified
(and ovn-controller transactions do not), both the controller
and the relay may wait for the transaction reply indefinitely.
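As an illustration of what "forwarding" means on the wire (payload shortened),
the relay passes along an ordinary JSON-RPC transact call such as

  --> {"method": "transact", "id": 42, "params": ["OVN_Southbound", {"op": "update", ...}]}

and then waits for the response carrying the same "id".  If the upstream socket
is dead without a clean TCP reset, that response simply never arrives, so both
ends keep waiting.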

> 
>>
>>> Also, ovsdb relay has active bidirectional probing to ovn-controllers.
>>> If tcp session got dropped, ovsdb relay wont notice this without probing?
>>
>> TCP timeouts can be very high or may not exist at all, if the network
>> connectivity suddenly disappears (a firewall in between or one of
>> the nodes crashed), both the client and the server may not notice
>> that for a very long time.  I've seen in practice OVN clusters where
>> nodes suddenly disappeared (crashed) and other nodes didn't notice
>> that for many hours (caused by non-working inactivity probes).
>>
>> Another interesting side effect to consider is if controller disappears
>> and the relay will keep sending updates to it, that may cause significant
>> memory usage increase on the relay, because it will keep the backlog of
>> data that underlying socket didn't accept.  May end up being killed by
>> OOM killer, if that continues long enough.
> 
> By disappearing you mean death of ovn-controller without proper connection 
> termination?

Yes.

> So if I understand correctly, relay-to-controllers probing is a “must have”.
> That’s interesting, thanks!

More or less, yes.  As I said in some other email, it's less important
than the opposite direction, so the actual probe interval can likely be set
higher, but we should have something to close dead connections eventually.

> 
>>
>> If you don't want to deal with inactivity probes, you may partially
>> replace them with TCP keepalive.  Disable probes and start daemons with
>> keepalive library preloaded, i.e. LD_PRELOAD=libkeepalive.so with the
>> configuration you think is suitable (default keepalive time is 2 hours
>> on many systems, so defaults are likely not a good choice).  You will
>> lose the ability to detect infinite loops or deadlocks and stuff like that,
>> but at least, you'll be protected from pure network failures.
>> See some examples at the end of this page:
>> https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
>>
>> Running a cluster without any bidirectional probes is not advisable.
>>
>> Best regards, Ilya Maximets.
>>
>>> Thank you for your help and Terry for his patch!
>>> 1: 
>>> 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-30 Thread Vladislav Odintsov via discuss
Thanks Ilya for such a detailed description about inactivity probes and 
keepalives.

regards,
Vladislav Odintsov



regards,
Vladislav Odintsov
> On 31 Mar 2023, at 00:37, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/30/23 22:51, Vladislav Odintsov via discuss wrote:
>> Hi Ilya,
>> following your recomendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
>> It’s a bit outdated, but with some changes related to "last_command" logic 
>> in ovs*ctl it successfully built.
>> Also, I grabbed your idea gust to hardcode inactivity interval for ovsdb 
>> relay because it just solves my issue.
>> So, after testing it seems to work fine. I’ve managed to run ovn-sb-db 
>> cluster with custom connections from local db and ovsdb relay with 
>> connections from sb db.
> 
> Good to know.
> 
>> I’ve got a question here:
>> Do we actually need probing from relay to sb cluster if we have configured 
>> probing from the other side in other direction (db cluster to relay)? Maybe 
>> we even can just set to 0 inactivity probes in ovsdb/relay.c?
> 
> If connection between relay and the main cluster is lost, relay may
> not notice this and just think that there are no new updates.  All the
> clients connected to that relay will have stale data as a result.
> Inactivity probe interval is essentially a value for how long you think
> you can afford that condition to last.

Do I understand you correctly that by “connection is lost” you mean an 
accidental termination of the tcp session? Like an iptables drop, or a cluster 
member getting killed by sigkill?
In my understanding, if a cluster member is just gracefully stopped, it’ll 
gracefully shut down the connection and the relay will reconnect to another 
cluster member?

Just out of curiosity, consider the case of accidental termination, where some 
“outdated” ovn-controller is connected to a relay which in turn thinks it is 
connected to the cluster, but it is not. If in such a condition the 
ovn-controller tries to claim a vif, will the relay detect the connection 
failure and reconnect to another “upstream”?

> 
>> Also, ovsdb relay has active bidirectional probing to ovn-controllers.
>> If tcp session got dropped, ovsdb relay wont notice this without probing?
> 
> TCP timeouts can be very high or may not exist at all, if the network
> connectivity suddenly disappears (a firewall in between or one of
> the nodes crashed), both the client and the server may not notice
> that for a very long time.  I've seen in practice OVN clusters where
> nodes suddenly disappeared (crashed) and other nodes didn't notice
> that for many hours (caused by non-working inactivity probes).
> 
> Another interesting side effect to consider is if controller disappears
> and the relay will keep sending updates to it, that may cause significant
> memory usage increase on the relay, because it will keep the backlog of
> data that underlying socket didn't accept.  May end up being killed by
> OOM killer, if that continues long enough.

By disappearing you mean death of ovn-controller without proper connection 
termination?
So if I understand correctly, relay-to-controllers probing is a “must have”.
That’s interesting, thanks!

> 
> If you don't want to deal with inactivity probes, you may partially
> replace them with TCP keepalive.  Disable probes and start daemons with
> keepalive library preloaded, i.e. LD_PRELOAD=libkeepalive.so with the
> configuration you think is suitable (default keepalive time is 2 hours
> on many systems, so defaults are likely not a good choice).  You will
> lose the ability to detect infinite loops or deadlocks and stuff like that,
> but at least, you'll be protected from pure network failures.
> See some examples at the end of this page:
> https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
> 
> Running a cluster without any bidirectional probes is not advisable.
> 
> Best regards, Ilya Maximets.
> 
>> Thank you for your help and Terry for his patch!
>> 1: 
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
 On 7 Mar 2023, at 19:43, Ilya Maximets via discuss 
  wrote:
>>> On 3/7/23 16:58, Vladislav Odintsov wrote:
 I’ve sent last mail from wrong account and indentation was lost.
 Resending...
> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>  wrote:
> Thanks Ilya for the quick and detailed response!
>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>  wrote:
>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>> Hi Ilya,
>>> I’m wondering whether there are possible configuration parameters for 
>>> ovsdb relay -> main ovsdb server inactivity probe timer.
>>> My cluster experiencing issues where relay disconnects from main 
>>> cluster due to 5 sec. inactivity probe timeout.
>>> Main cluster has quite big database and a bunch of daemons, which 
>>> connects to it and it makes difficult to maintain connections in time.
>>> For ovsdb relay as a remote I use in-db configuration (to 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-30 Thread Ilya Maximets via discuss
On 3/30/23 22:51, Vladislav Odintsov via discuss wrote:
> Hi Ilya,
> 
> following your recomendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
> It’s a bit outdated, but with some changes related to "last_command" logic in 
> ovs*ctl it successfully built.
> 
> Also, I grabbed your idea gust to hardcode inactivity interval for ovsdb 
> relay because it just solves my issue.
> 
> So, after testing it seems to work fine. I’ve managed to run ovn-sb-db 
> cluster with custom connections from local db and ovsdb relay with 
> connections from sb db.

Good to know.

> 
> I’ve got a question here:
> Do we actually need probing from relay to sb cluster if we have configured 
> probing from the other side in other direction (db cluster to relay)? Maybe 
> we even can just set to 0 inactivity probes in ovsdb/relay.c?

If connection between relay and the main cluster is lost, relay may
not notice this and just think that there are no new updates.  All the
clients connected to that relay will have stale data as a result.
Inactivity probe interval is essentially a value for how long you think
you can afford that condition to last.

> Also, ovsdb relay has active bidirectional probing to ovn-controllers.
> If tcp session got dropped, ovsdb relay wont notice this without probing?

TCP timeouts can be very high or may not exist at all, if the network
connectivity suddenly disappears (a firewall in between or one of
the nodes crashed), both the client and the server may not notice
that for a very long time.  I've seen in practice OVN clusters where
nodes suddenly disappeared (crashed) and other nodes didn't notice
that for many hours (caused by non-working inactivity probes).

Another interesting side effect to consider is if controller disappears
and the relay will keep sending updates to it, that may cause significant
memory usage increase on the relay, because it will keep the backlog of
data that underlying socket didn't accept.  May end up being killed by
OOM killer, if that continues long enough.

If you don't want to deal with inactivity probes, you may partially
replace them with TCP keepalive.  Disable probes and start daemons with
keepalive library preloaded, i.e. LD_PRELOAD=libkeepalive.so with the
configuration you think is suitable (default keepalive time is 2 hours
on many systems, so defaults are likely not a good choice).  You will
lose the ability to detect infinite loops or deadlocks and stuff like that,
but at least, you'll be protected from pure network failures.
See some examples at the end of this page:
  https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
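As a rough sketch of that workaround (the variable names below are the ones
used by the libkeepalive library described on that page; the library path and
values are illustrative and need adjusting for your system):

  # Probe idle connections after 60s, every 10s, drop after 6 missed probes.
  # Assumes inactivity probes were disabled in the ovsdb-server/relay config.
  LD_PRELOAD=/usr/lib/libkeepalive.so \
  KEEPIDLE=60 KEEPINTVL=10 KEEPCNT=6 \
      ovsdb-server --remote=pssl:6642 /etc/ovn/ovnsb_db.db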

Running a cluster without any bidirectional probes is not advisable.

Best regards, Ilya Maximets.

> 
> 
> Thank you for your help and Terry for his patch!
> 
> 1: 
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> 
>> On 7 Mar 2023, at 19:43, Ilya Maximets via discuss 
>>  wrote:
>>
>> On 3/7/23 16:58, Vladislav Odintsov wrote:
>>> I’ve sent last mail from wrong account and indentation was lost.
>>> Resending...
>>>
 On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
  wrote:

 Thanks Ilya for the quick and detailed response!

> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>  wrote:
>
> On 3/7/23 00:15, Vladislav Odintsov wrote:
>> Hi Ilya,
>>
>> I’m wondering whether there are possible configuration parameters for 
>> ovsdb relay -> main ovsdb server inactivity probe timer.
>> My cluster experiencing issues where relay disconnects from main cluster 
>> due to 5 sec. inactivity probe timeout.
>> Main cluster has quite big database and a bunch of daemons, which 
>> connects to it and it makes difficult to maintain connections in time.
>>
>> For ovsdb relay as a remote I use in-db configuration (to provide 
>> inactivity probe and rbac configuration for ovn-controllers).
>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>>
>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>> probe setting, but I’m not sure about the correct way for that.
>>
>> For now I see only two options:
>> 1. Setup custom database scheme with connection table, serve it in same 
>> SB cluster and specify this connection when start ovsdb sb server.
>
> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> used for that purpose.  But you'll need to craft transactions for it
> manually with ovsdb-client.
>
> There is a control tool prepared by Terry:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

 Thanks for pointing on a patch, I guess, I’ll test it out.

>
> But it's not in the repo yet (I need to get back to reviews on that
> topic at some point).  The tool itself should be fine, but maybe name
> will change.

 Am I 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-30 Thread Vladislav Odintsov via discuss
Hi Ilya,

following your recommendation I’ve built OVS 3.1.0 plus Terry’s patch [1].
It’s a bit outdated, but with some changes related to the "last_command" logic in 
ovs*ctl it built successfully.

Also, I grabbed your idea just to hardcode the inactivity probe interval for the 
ovsdb relay, because it solves my issue.

So, after testing it seems to work fine. I’ve managed to run an ovn-sb-db cluster 
with custom connections from a local db and an ovsdb relay with connections from 
the sb db.

I’ve got a question here:
Do we actually need probing from the relay to the sb cluster if we have configured 
probing from the other side, in the other direction (db cluster to relay)? Maybe 
we can even just set inactivity probes to 0 in ovsdb/relay.c?
Also, the ovsdb relay has active bidirectional probing to ovn-controllers.
If a tcp session got dropped, the ovsdb relay won't notice this without probing?


Thank you for your help and Terry for his patch!

1: 
https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

> On 7 Mar 2023, at 19:43, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/7/23 16:58, Vladislav Odintsov wrote:
>> I’ve sent last mail from wrong account and indentation was lost.
>> Resending...
>> 
>>> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>>>  wrote:
>>> 
>>> Thanks Ilya for the quick and detailed response!
>>> 
 On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
  wrote:
 
 On 3/7/23 00:15, Vladislav Odintsov wrote:
> Hi Ilya,
> 
> I’m wondering whether there are possible configuration parameters for 
> ovsdb relay -> main ovsdb server inactivity probe timer.
> My cluster experiencing issues where relay disconnects from main cluster 
> due to 5 sec. inactivity probe timeout.
> Main cluster has quite big database and a bunch of daemons, which 
> connects to it and it makes difficult to maintain connections in time.
> 
> For ovsdb relay as a remote I use in-db configuration (to provide 
> inactivity probe and rbac configuration for ovn-controllers).
> For ovsdb-server, which serves SB, I just set --remote=pssl:.
> 
> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
> probe setting, but I’m not sure about the correct way for that.
> 
> For now I see only two options:
> 1. Setup custom database scheme with connection table, serve it in same 
> SB cluster and specify this connection when start ovsdb sb server.
 
 There is a ovsdb/local-config.ovsschema shipped with OVS that can be
 used for that purpose.  But you'll need to craft transactions for it
 manually with ovsdb-client.
 
 There is a control tool prepared by Terry:
  
 https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>> 
>>> Thanks for pointing on a patch, I guess, I’ll test it out.
>>> 
 
 But it's not in the repo yet (I need to get back to reviews on that
 topic at some point).  The tool itself should be fine, but maybe name
 will change.
>>> 
>>> Am I right that the in-DB remote configuration must be hosted by a database 
>>> served by this ovsdb-server?
> 
> Yes.
> 
>>> What is the best way to configure an additional DB on ovsdb-server so that 
>>> this configuration is permanent?
> 
> You may specify multiple database files on the command-line for ovsdb-server
> process.  It will open and serve each of them.  They all can be in different
> modes, e.g. you have multiple clustered, standalone and relay databases in
> the same ovsdb-server process.
> 
> There is also ovsdb-server/add-db appctl to add a new database to a running
> process, but it will not survive the restart.
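For example, a sketch of serving the SB database together with a standalone
Local_Config database from one ovsdb-server process (file paths are
illustrative, and the db:Local_Config,Config,connections remote spec assumes
the root Config table has a "connections" column):

  ovsdb-server /etc/ovn/ovnsb_db.db /etc/ovn/local_config.db \
      --remote=db:OVN_Southbound,SB_Global,connections \
      --remote=db:Local_Config,Config,connections

  # Add a database to a running process (does not survive a restart):
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl ovsdb-server/add-db /etc/ovn/local_config.db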
> 
>>> Also, do I understand correctly that there is no necessity for this DB to 
>>> be clustered?
> 
> It's kind of a point of the Local_Config database to not be clustered.
> The original use case was to allow each cluster member to listen on a
> different IP. i.e. if you don't want to listen on 0.0.0.0 and your
> cluster members are on different nodes, so have different listening IPs.
> 
>>> 
 
> 2. Setup second connection in ovn sb database to be used for ovsdb 
> cluster and deploy cluster separately from ovsdb relay, because they both 
> start same connections and conflict on ports. (I don’t use docker here, 
> so I need a separate server for that).
 
 That's an easy option available right now, true.  If they are deployed
 on different nodes, you may even use the same connection record.
 
> 
> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
> inactivity probe (say, to 60k), I guess it’s still not enough to have 
> ovsdb pings every 60 seconds. Inactivity probe must be the same from both 
> ends - right? From the ovsdb relay process.
 
 Inactivity probes don't need to be the same.  They are separate for each
 side of a connection and so configured separately.
 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-21 Thread Ilya Maximets via discuss
On 3/21/23 07:18, Jake Yip wrote:
> 
> 
> On 20/3/2023 10:51 pm, Ilya Maximets wrote:
>> On 3/16/23 23:06, Jake Yip wrote:
>>> Hi all,
>>>
>>> Apologies for jumping into this thread. We are seeing the same and it's 
>>> nice to find someone with similar issues :)
>>>
>>> On 8/3/2023 3:43 am, Ilya Maximets via discuss wrote:
>>
>> We see failures on the OVSDB Relay side:
>>
>> 2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response 
>> to inactivity probe after 5 seconds, disconnecting
>> 2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection 
>> dropped
>> 2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
>> 2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response 
>> to inactivity probe after 5 seconds, disconnecting
>> 2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection 
>> dropped
>> 2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
>> 2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response 
>> to inactivity probe after 5 seconds, disconnecting
>> 2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection 
>> dropped
>> 2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
>> 2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response 
>> to inactivity probe after 5 seconds, disconnecting
>> 2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection 
>> dropped
>> 2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected
>>
>> On the DB cluster this looks like:
>>
>> 2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL 
>> connection close
>> 2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection 
>> dropped (Protocol error)

 OK.  These are symptoms.  The cause must be something like
 'Unreasonably long MANY ms poll interval' on the DB cluster side.
 i.e. the reason why the main DB cluster didn't reply to the
 probes sent from the relay.  Because as soon as server receives
 the probe, it replies right back.  If it didn't reply, it was
 doing something else for an extended period of time.  "MANY" is
 more than 5 seconds.

>>>
>>> We are seeing the same issue here after moving to OVN relay.
>>>
>>> - On the relay "no response to inactivity probe after 5 seconds"
>>> - On the OVSDB cluster
>>>    - "Unreasonably long 1726ms poll interval"
>>>    - "connection dropped (Input/output error)"
>>>    - "SSL_write: system error (Broken pipe)"
>>>    - 100% CPU on northd process
>>>
>>> Is there anything we could look for on the OVSDB side to narrow down what 
>>> may be causing the load on the cluster side?
>>>
>>> A brief history - We are migrating an OpenStack cloud from MidoNet to OVN. 
>>> This cloud has roughly
>>>
>>> - 400 neutron networks / ovn logical switches
>>> - 300 neutron routers
>>> - 14000 neutron ports / ovn logical switchports
>>> - 28000 neutron security groups / ovn port group
>>> - 8 neutron secgroup rules / acl
>>>
>>> We populated the OVN DB by using OpenStack/Neutron ovn sync script.
>>>
>>> We have attempted the migration twice previously (2021, 2022) but failed 
>>> due to load issues. We've reported issues and have seen lots of performance 
>>> improvements over the last two years. Here is a BIG thank you to the dev 
>>> teams!
>>>
>>> We are now on the following versions
>>>
>>> - OVS 2.17
>>> - OVN 22.03
>>>
>>> We are exploring upgrade as an option, but I am concerned if there's 
>>> something fundamentally wrong with the data / config we have that is 
>>> causing the high load, and would like to rule that out first. Please let me 
>>> know if you need more information, will be happy to start a new thread too.
>>
>> Hi, Jake.  Your scale numbers are fairly high, i.e. this number of
>> objects in the setup may indeed create a noticeable load.
>>
>> The fact that relay is disconnecting with only 1726ms poll intervals
>> on the main cluster side is a bit strange.  Not sure why this happened.
>> Normally it should be 5+ seconds.
> 
> There are multiple errors. I just grabbed the first I found; indeed there are 
> poll intervals >5secs like
> 
> ovs|05000|timeval|WARN|Unreasonably long 13209ms poll interval (12942ms user, 
> 264ms system)

Yeah, this one is pretty high.  Is it, by any chance, database compaction
related?  i.e. are there database-compaction-related logs in close
proximity to this one?

In case all the huge poll intervals are compaction-related, upgrade to
OVS 3.0+ may completely solve the issue, since most of the compaction
work is moved into a separate thread there.
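A quick way to check is to look for compaction messages near the long poll
intervals in the SB server log; the exact log wording and file path vary
between versions and packaging, so something along these lines:

  grep -nE 'compact|Unreasonably long' /var/log/ovn/ovsdb-server-sb.log | less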

> 
>>
>> The versions you're using have an upgrade path with potentially
>> significant performance improvements, e.g OVS 3.1 + OVN 23.03.
>> Both ovsdb-server and core OVN components became much faster in
>> the 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-21 Thread Jake Yip via discuss



On 20/3/2023 10:51 pm, Ilya Maximets wrote:

On 3/16/23 23:06, Jake Yip wrote:

Hi all,

Apologies for jumping into this thread. We are seeing the same and it's nice to 
find someone with similar issues :)

On 8/3/2023 3:43 am, Ilya Maximets via discuss wrote:


We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected

On the DB cluster this looks like:

2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL 
connection close
2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection dropped 
(Protocol error)


OK.  These are symptoms.  The cause must be something like
'Unreasonably long MANY ms poll interval' on the DB cluster side.
i.e. the reason why the main DB cluster didn't reply to the
probes sent from the relay.  Because as soon as server receives
the probe, it replies right back.  If it didn't reply, it was
doing something else for an extended period of time.  "MANY" is
more than 5 seconds.
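For context, the inactivity probe is just a JSON-RPC "echo" request/reply pair
on the already-established connection, roughly like this (illustrative payload):

  --> {"method": "echo", "id": "echo", "params": []}
  <-- {"result": [], "error": null, "id": "echo"}

so a missing reply means the peer's main loop was busy (or the connection is
gone), not that the probe itself is expensive.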



We are seeing the same issue here after moving to OVN relay.

- On the relay "no response to inactivity probe after 5 seconds"
- On the OVSDB cluster
   - "Unreasonably long 1726ms poll interval"
   - "connection dropped (Input/output error)"
   - "SSL_write: system error (Broken pipe)"
   - 100% CPU on northd process

Is there anything we could look for on the OVSDB side to narrow down what may 
be causing the load on the cluster side?

A brief history - We are migrating an OpenStack cloud from MidoNet to OVN. This 
cloud has roughly

- 400 neutron networks / ovn logical switches
- 300 neutron routers
- 14000 neutron ports / ovn logical switchports
- 28000 neutron security groups / ovn port group
- 8 neutron secgroup rules / acl

We populated the OVN DB by using OpenStack/Neutron ovn sync script.

We have attempted the migration twice previously (2021, 2022) but failed due to 
load issues. We've reported issues and have seen lots of performance 
improvements over the last two years. Here is a BIG thank you to the dev teams!

We are now on the following versions

- OVS 2.17
- OVN 22.03

We are exploring upgrade as an option, but I am concerned if there's something 
fundamentally wrong with the data / config we have that is causing the high 
load, and would like to rule that out first. Please let me know if you need 
more information, will be happy to start a new thread too.


Hi, Jake.  Your scale numbers are fairly high, i.e. this number of
objects in the setup may indeed create a noticeable load.

The fact that relay is disconnecting with only 1726ms poll intervals
on the main cluster side is a bit strange.  Not sure why this happened.
Normally it should be 5+ seconds.


There are multiple errors. I just grabbed the first I found; indeed 
there are poll intervals >5secs like


ovs|05000|timeval|WARN|Unreasonably long 13209ms poll interval (12942ms 
user, 264ms system)




The versions you're using have an upgrade path with potentially
significant performance improvements, e.g OVS 3.1 + OVN 23.03.
Both ovsdb-server and core OVN components became much faster in
the previous year.



Thanks for the work you've put into OVN. I've seen your conference 
presentations and believe that is a valid way forward. One thing keeping 
us back is that there are no Ubuntu packages for us. So we may need to 
build them.


We may also be exploring containers, but we are still not sure how 
containerised openvswitch works.


Another issue is whether the integration will work - we are using Neutron Yoga. 
I believe OVN being able to be upgraded from one LTS to another means 
Neutron Yoga should work with OVS 3.1 + OVN 23.03?



I'm not sure if there is anything fundamentally wrong with your setup,
other than the total amount of resources.

If you have a freedom of building your own packages and the relay
disconnection is the main problem in your setup, you may try something
like this:

diff --git a/ovsdb/relay.c b/ovsdb/relay.c

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-20 Thread Ilya Maximets via discuss
On 3/16/23 23:06, Jake Yip wrote:
> Hi all,
> 
> Apologies for jumping into this thread. We are seeing the same and it's nice 
> to find someone with similar issues :)
> 
> On 8/3/2023 3:43 am, Ilya Maximets via discuss wrote:

 We see failures on the OVSDB Relay side:

 2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
 inactivity probe after 5 seconds, disconnecting
 2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection 
 dropped
 2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
 2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to 
 inactivity probe after 5 seconds, disconnecting
 2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection 
 dropped
 2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
 2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to 
 inactivity probe after 5 seconds, disconnecting
 2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection 
 dropped
 2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
 2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to 
 inactivity probe after 5 seconds, disconnecting
 2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection 
 dropped
 2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected

 On the DB cluster this looks like:

 2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL 
 connection close
 2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection 
 dropped (Protocol error)
>>
>> OK.  These are symptoms.  The cause must be something like
>> 'Unreasonably long MANY ms poll interval' on the DB cluster side.
>> i.e. the reason why the main DB cluster didn't reply to the
>> probes sent from the relay.  Because as soon as server receives
>> the probe, it replies right back.  If it didn't reply, it was
>> doing something else for an extended period of time.  "MANY" is
>> more than 5 seconds.
>>
> 
> We are seeing the same issue here after moving to OVN relay.
> 
> - On the relay "no response to inactivity probe after 5 seconds"
> - On the OVSDB cluster
>   - "Unreasonably long 1726ms poll interval"
>   - "connection dropped (Input/output error)"
>   - "SSL_write: system error (Broken pipe)"
>   - 100% CPU on northd process
> 
> Is there anything we could look for on the OVSDB side to narrow down what may 
> be causing the load on the cluster side?
> 
> A brief history - We are migrating an OpenStack cloud from MidoNet to OVN. 
> This cloud has roughly
> 
> - 400 neutron networks / ovn logical switches
> - 300 neutron routers
> - 14000 neutron ports / ovn logical switchports
> - 28000 neutron security groups / ovn port group
> - 8 neutron secgroup rules / acl
> 
> We populated the OVN DB by using OpenStack/Neutron ovn sync script.
> 
> We have attempted the migration twice previously (2021, 2022) but failed due 
> to load issues. We've reported issues and have seen lots of performance 
> improvements over the last two years. Here is a BIG thank you to the dev 
> teams!
> 
> We are now on the following versions
> 
> - OVS 2.17
> - OVN 22.03
> 
> We are exploring upgrade as an option, but I am concerned if there's 
> something fundamentally wrong with the data / config we have that is causing 
> the high load, and would like to rule that out first. Please let me know if 
> you need more information, will be happy to start a new thread too.

Hi, Jake.  Your scale numbers are fairly high, i.e. this number of
objects in the setup may indeed create a noticeable load.

The fact that relay is disconnecting with only 1726ms poll intervals
on the main cluster side is a bit strange.  Not sure why this happened.
Normally it should be 5+ seconds.

The versions you're using have an upgrade path with potentially
significant performance improvements, e.g OVS 3.1 + OVN 23.03.
Both ovsdb-server and core OVN components became much faster in
the previous year.

I'm not sure if there is anything fundamentally wrong with your setup,
other than the total amount of resources.

If you have the freedom to build your own packages and the relay
disconnection is the main problem in your setup, you may try something
like this:

diff --git a/ovsdb/relay.c b/ovsdb/relay.c
index 9ff6ed8f3..5c5937c27 100644
--- a/ovsdb/relay.c
+++ b/ovsdb/relay.c
@@ -152,6 +152,7 @@ ovsdb_relay_add_db(struct ovsdb *db, const char *remote,
     shash_add(&relay_dbs, db->name, ctx);
     ovsdb_cs_set_leader_only(ctx->cs, false);
     ovsdb_cs_set_remote(ctx->cs, remote, true);
+    ovsdb_cs_set_probe_interval(ctx->cs, 16000);
 
     VLOG_DBG("added database: %s, %s", db->name, remote);
 }
---

This change will set 16 seconds as inactivity probe interval for
relay-to-server connection by default.

Best regards, Ilya 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-16 Thread Jake Yip via discuss

Hi all,

Apologies for jumping into this thread. We are seeing the same and it's 
nice to find someone with similar issues :)


On 8/3/2023 3:43 am, Ilya Maximets via discuss wrote:


We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:40.989Z|00101|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:19:50.997Z|00102|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:50.997Z|00103|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:19:59.022Z|00104|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:09.026Z|00105|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:09.026Z|00106|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:17.052Z|00107|reconnect|INFO|ssl:xxx:16642: connected
2023-03-06T22:20:27.056Z|00108|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:20:27.056Z|00109|reconnect|INFO|ssl:xxx:16642: connection dropped
2023-03-06T22:20:35.111Z|00110|reconnect|INFO|ssl:xxx:16642: connected

On the DB cluster this looks like:

2023-03-06T22:19:04.208Z|00451|stream_ssl|WARN|SSL_read: unexpected SSL 
connection close
2023-03-06T22:19:04.211Z|00452|reconnect|WARN|ssl:xxx:52590: connection dropped 
(Protocol error)


OK.  These are symptoms.  The cause must be something like
'Unreasonably long MANY ms poll interval' on the DB cluster side.
i.e. the reason why the main DB cluster didn't reply to the
probes sent from the relay.  Because as soon as server receives
the probe, it replies right back.  If it didn't reply, it was
doing something else for an extended period of time.  "MANY" is
more than 5 seconds.



We are seeing the same issue here after moving to OVN relay.

- On the relay "no response to inactivity probe after 5 seconds"
- On the OVSDB cluster
  - "Unreasonably long 1726ms poll interval"
  - "connection dropped (Input/output error)"
  - "SSL_write: system error (Broken pipe)"
  - 100% CPU on northd process

Is there anything we could look for on the OVSDB side to narrow down 
what may be causing the load on the cluster side?


A brief history - We are migrating an OpenStack cloud from MidoNet to 
OVN. This cloud has roughly


- 400 neutron networks / ovn logical switches
- 300 neutron routers
- 14000 neutron ports / ovn logical switchports
- 28000 neutron security groups / ovn port group
- 8 neutron secgroup rules / acl

We populated the OVN DB by using OpenStack/Neutron ovn sync script.

We have attempted the migration twice previously (2021, 2022) but failed 
due to load issues. We've reported issues and have seen lots of 
performance improvements over the last two years. Here is a BIG thank 
you to the dev teams!


We are now on the following versions

- OVS 2.17
- OVN 22.03

We are exploring upgrade as an option, but I am concerned if there's 
something fundamentally wrong with the data / config we have that is 
causing the high load, and would like to rule that out first. Please let 
me know if you need more information, will be happy to start a new 
thread too.


Regards,
Jake


Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-08 Thread Ilya Maximets via discuss
On 3/8/23 08:57, Frode Nordahl wrote:
 Does it state that configuring inactivity probe on the DB cluster side 
 will not help and configuration on the relay side must be done?
>>
>> Yes.  You likely need a configuration on the relay side.
> 
> Sorry for butting into an ongoing discussion, but this part resonated
> with one of my past ventures. While investigating a different problem
> we kind of hit a similar problem [0]. Aligning client, relay and
> backend server configuration has potential to become complicated.
> Would an alternative be for the real server and relay server to
> exchange this information in-line as part of their communication, for
> example exposing it in the special _Server built-in database [1]?
> 
> 0: 
> https://bugs.launchpad.net/ubuntu/lunar/+source/openvswitch/+bug/1998781/comments/3
> 1: https://github.com/openvswitch/ovs/blob/master/ovsdb/_server.ovsschema
> 

Hi, Frode.

Do you mean synchronizing probes for passive connections, i.e.
having ptcp/pssl remotes with the same inactivity probes on
the main DB and relay?  Or making the relay --> server connection
have the same probe interval as the passive server --> relay
connection?

The main problem with the former I see is that it is currently
configurable for each side individually.  And it would be
confusing if ovsdb-server will override the user-specified value.
Hence, the config knob, i.e. configuration by the user, will
be needed anyway.

The latter is basically some form of what Wentao Jia proposed [2].
We could do something like that, since there is no way to
configure the probe interval in the relay --> server direction
at the moment.  But that will be different from any other
connection we have in the OVS world, so it may complicate the
understanding of the matter even more.

I hope that we can forget about client --> server inactivity
probes at some point and just use the default.  We do control the
server and we can make it faster / operate better.  In fact,
we do not see any large poll intervals in large-scale ovn-heater
tests on either Sb or Nb OVSDB servers, not even 1 second long,
with recent OVS versions.  The remaining cases I'm aware of are
associated with database conversion and potential mass
re-connections, which are both solvable and being worked on.

The opposite server --> client direction is a bit more
problematic, because we do not control the client application, so
we don't know how long it may not reply.  E.g. a full recompute on
ovn-controller may still take a lot of time.  But perhaps we could
just bump the default probe interval for this direction to something
like 60 seconds or even more.  Checking if the client is alive isn't
really that important for a server; we just need to disconnect
dead clients eventually.

[2] 
https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/

Best regards, Ilya Maximets.


Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Frode Nordahl via discuss
On Tue, Mar 7, 2023 at 5:43 PM Ilya Maximets via discuss
 wrote:
>
> On 3/7/23 16:58, Vladislav Odintsov wrote:
> > I’ve sent last mail from wrong account and indentation was lost.
> > Resending...
> >
> >> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
> >>  wrote:
> >>
> >> Thanks Ilya for the quick and detailed response!
> >>
> >>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
> >>>  wrote:
> >>>
> >>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>  Hi Ilya,
> 
>  I’m wondering whether there are possible configuration parameters for 
>  ovsdb relay -> main ovsdb server inactivity probe timer.
>  My cluster experiencing issues where relay disconnects from main cluster 
>  due to 5 sec. inactivity probe timeout.
>  Main cluster has quite big database and a bunch of daemons, which 
>  connects to it and it makes difficult to maintain connections in time.
> 
>  For ovsdb relay as a remote I use in-db configuration (to provide 
>  inactivity probe and rbac configuration for ovn-controllers).
>  For ovsdb-server, which serves SB, I just set --remote=pssl:.
> 
>  I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>  probe setting, but I’m not sure about the correct way for that.
> 
>  For now I see only two options:
>  1. Setup custom database scheme with connection table, serve it in same 
>  SB cluster and specify this connection when start ovsdb sb server.
> >>>
> >>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> >>> used for that purpose.  But you'll need to craft transactions for it
> >>> manually with ovsdb-client.
> >>>
> >>> There is a control tool prepared by Terry:
> >>>  
> >>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> >>
> >> Thanks for pointing on a patch, I guess, I’ll test it out.
> >>
> >>>
> >>> But it's not in the repo yet (I need to get back to reviews on that
> >>> topic at some point).  The tool itself should be fine, but maybe name
> >>> will change.
> >>
> >> Am I right that in-DB remote configuration must be a hosted by this 
> >> ovsdb-server database?
>
> Yes.
>
> >> What is the best way to configure additional DB on ovsdb-server so that 
> >> this configuration to be permanent?
>
> You may specify multiple database files on the command-line for ovsdb-server
> process.  It will open and serve each of them.  They all can be in different
> modes, e.g. you have multiple clustered, standalone and relay databases in
> the same ovsdb-server process.
>
> There is also ovsdb-server/add-db appctl to add a new database to a running
> process, but it will not survive the restart.
>
> >> Also, am I understand correctly that there is no necessity for this DB to 
> >> be clustered?
>
> It's kind of a point of the Local_Config database to not be clustered.
> The original use case was to allow each cluster member to listen on a
> different IP. i.e. if you don't want to listen on 0.0.0.0 and your
> cluster members are on different nodes, so have different listening IPs.
>
> >>
> >>>
>  2. Setup second connection in ovn sb database to be used for ovsdb 
>  cluster and deploy cluster separately from ovsdb relay, because they 
>  both start same connections and conflict on ports. (I don’t use docker 
>  here, so I need a separate server for that).
> >>>
> >>> That's an easy option available right now, true.  If they are deployed
> >>> on different nodes, you may even use the same connection record.
> >>>
> 
>  Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>  inactivity probe (say, to 60k), I guess it’s still not enough to have 
>  ovsdb pings every 60 seconds. Inactivity probe must be the same from 
>  both ends - right? From the ovsdb relay process.
> >>>
> >>> Inactivity probes don't need to be the same.  They are separate for each
> >>> side of a connection and so configured separately.
> >>>
> >>> You can set up inactivity probe for the server side of the connection via
> >>> database.  So, server will probe the relay every 60 seconds, but today
> >>> it's not possible to set inactivity probe for the relay-to-server 
> >>> direction.
> >>> So, relay will probe the server every 5 seconds.
> >>>
> >>> The way out from this situation is to allow configuration of relays via
> >>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> >>> require addition of a new table to the Local_Config database and allowing
> >>> relay config to be parsed from the database in the code.  That wasn't
> >>> implemented yet.
> >>>
>  I saw your talk on last ovscon about this topic, and the solution was in 
>  progress there. But maybe there were some changes from that time? I’m 
>  ready to test it if any. Or, maybe there’s any workaround?
> >>>
> >>> Sorry, we didn't move forward much on that topic since the presentation.
> >>> There are 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Ilya Maximets via discuss
On 3/7/23 16:58, Vladislav Odintsov wrote:
> I’ve sent last mail from wrong account and indentation was lost.
> Resending...
> 
>> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>>  wrote:
>>
>> Thanks Ilya for the quick and detailed response!
>>
>>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>>  wrote:
>>>
>>> On 3/7/23 00:15, Vladislav Odintsov wrote:
 Hi Ilya,

 I’m wondering whether there are possible configuration parameters for 
 ovsdb relay -> main ovsdb server inactivity probe timer.
 My cluster experiencing issues where relay disconnects from main cluster 
 due to 5 sec. inactivity probe timeout.
 Main cluster has quite big database and a bunch of daemons, which connects 
 to it and it makes difficult to maintain connections in time.

 For ovsdb relay as a remote I use in-db configuration (to provide 
 inactivity probe and rbac configuration for ovn-controllers).
 For ovsdb-server, which serves SB, I just set --remote=pssl:.

 I’d like to configure remote for ovsdb cluster via DB to set inactivity 
 probe setting, but I’m not sure about the correct way for that.

 For now I see only two options:
 1. Setup custom database scheme with connection table, serve it in same SB 
 cluster and specify this connection when start ovsdb sb server.
>>>
>>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>>> used for that purpose.  But you'll need to craft transactions for it
>>> manually with ovsdb-client.
>>>
>>> There is a control tool prepared by Terry:
>>>  
>>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
>>
>> Thanks for pointing on a patch, I guess, I’ll test it out.
>>
>>>
>>> But it's not in the repo yet (I need to get back to reviews on that
>>> topic at some point).  The tool itself should be fine, but maybe name
>>> will change.
>>
>> Am I right that in-DB remote configuration must be a hosted by this 
>> ovsdb-server database?

Yes.

>> What is the best way to configure additional DB on ovsdb-server so that this 
>> configuration to be permanent?

You may specify multiple database files on the command line for the ovsdb-server
process.  It will open and serve each of them.  They can all be in different
modes, e.g. you can have multiple clustered, standalone and relay databases in
the same ovsdb-server process.

There is also ovsdb-server/add-db appctl to add a new database to a running
process, but it will not survive the restart.
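A minimal sketch of both approaches (the file paths, port and control-socket path are assumptions used only for illustration):

  # Serve two databases from one process.
  ovsdb-server --remote=pssl:6642 /var/lib/ovn/ovnsb_db.db /var/lib/ovn/extra.db

  # Add a database to an already running process (not persistent across restarts).
  ovs-appctl -t /var/run/ovn/ovnsb_db.ctl ovsdb-server/add-db /var/lib/ovn/extra.db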

>> Also, am I understand correctly that there is no necessity for this DB to be 
>> clustered?

It's kind of the point of the Local_Config database to not be clustered.
The original use case was to allow each cluster member to listen on a
different IP, i.e. if you don't want to listen on 0.0.0.0 and your
cluster members are on different nodes and so have different listening IPs.
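For example, a standalone database for that purpose can be created from the shipped schema roughly like this (the schema's install path differs between distributions, so treat it as an assumption):

  ovsdb-tool create /var/lib/ovn/local-config.db \
      /usr/share/openvswitch/local-config.ovsschema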

>>
>>>
 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
 and deploy cluster separately from ovsdb relay, because they both start 
 same connections and conflict on ports. (I don’t use docker here, so I 
 need a separate server for that).
>>>
>>> That's an easy option available right now, true.  If they are deployed
>>> on different nodes, you may even use the same connection record.
>>>

 Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
 inactivity probe (say, to 60k), I guess it’s still not enough to have 
 ovsdb pings every 60 seconds. Inactivity probe must be the same from both 
 ends - right? From the ovsdb relay process.
>>>
>>> Inactivity probes don't need to be the same.  They are separate for each
>>> side of a connection and so configured separately.
>>>
>>> You can set up inactivity probe for the server side of the connection via
>>> database.  So, server will probe the relay every 60 seconds, but today
>>> it's not possible to set inactivity probe for the relay-to-server direction.
>>> So, relay will probe the server every 5 seconds.
>>>
>>> The way out from this situation is to allow configuration of relays via
>>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>>> require addition of a new table to the Local_Config database and allowing
>>> relay config to be parsed from the database in the code.  That wasn't
>>> implemented yet.
>>>
 I saw your talk on last ovscon about this topic, and the solution was in 
 progress there. But maybe there were some changes from that time? I’m 
 ready to test it if any. Or, maybe there’s any workaround?
>>>
>>> Sorry, we didn't move forward much on that topic since the presentation.
>>> There are few unanswered questions around local config database.  Mainly
>>> regarding upgrades from cmdline/main db -based configuration to a local
>>> config -based.  But I hope we can figure that out in the current release
>>> time frame, i.e. before 3.2 release.
> 
> Regarding 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Vladislav Odintsov via discuss
I’ve sent the last mail from the wrong account and the indentation was lost.
Resending...

> On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
>  wrote:
> 
> Thanks Ilya for the quick and detailed response!
> 
>> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>>  wrote:
>> 
>> On 3/7/23 00:15, Vladislav Odintsov wrote:
>>> Hi Ilya,
>>> 
>>> I’m wondering whether there are possible configuration parameters for ovsdb 
>>> relay -> main ovsdb server inactivity probe timer.
>>> My cluster experiencing issues where relay disconnects from main cluster 
>>> due to 5 sec. inactivity probe timeout.
>>> Main cluster has quite big database and a bunch of daemons, which connects 
>>> to it and it makes difficult to maintain connections in time.
>>> 
>>> For ovsdb relay as a remote I use in-db configuration (to provide 
>>> inactivity probe and rbac configuration for ovn-controllers).
>>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>>> 
>>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>>> probe setting, but I’m not sure about the correct way for that.
>>> 
>>> For now I see only two options:
>>> 1. Setup custom database scheme with connection table, serve it in same SB 
>>> cluster and specify this connection when start ovsdb sb server.
>> 
>> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
>> used for that purpose.  But you'll need to craft transactions for it
>> manually with ovsdb-client.
>> 
>> There is a control tool prepared by Terry:
>>  
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/
> 
> Thanks for pointing on a patch, I guess, I’ll test it out.
> 
>> 
>> But it's not in the repo yet (I need to get back to reviews on that
>> topic at some point).  The tool itself should be fine, but maybe name
>> will change.
> 
> Am I right that in-DB remote configuration must be a hosted by this 
> ovsdb-server database?
> What is the best way to configure additional DB on ovsdb-server so that this 
> configuration to be permanent?
> Also, am I understand correctly that there is no necessity for this DB to be 
> clustered?
> 
>> 
>>> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
>>> and deploy cluster separately from ovsdb relay, because they both start 
>>> same connections and conflict on ports. (I don’t use docker here, so I need 
>>> a separate server for that).
>> 
>> That's an easy option available right now, true.  If they are deployed
>> on different nodes, you may even use the same connection record.
>> 
>>> 
>>> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>>> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
>>> pings every 60 seconds. Inactivity probe must be the same from both ends - 
>>> right? From the ovsdb relay process.
>> 
>> Inactivity probes don't need to be the same.  They are separate for each
>> side of a connection and so configured separately.
>> 
>> You can set up inactivity probe for the server side of the connection via
>> database.  So, server will probe the relay every 60 seconds, but today
>> it's not possible to set inactivity probe for the relay-to-server direction.
>> So, relay will probe the server every 5 seconds.
>> 
>> The way out from this situation is to allow configuration of relays via
>> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
>> require addition of a new table to the Local_Config database and allowing
>> relay config to be parsed from the database in the code.  That wasn't
>> implemented yet.
>> 
>>> I saw your talk on last ovscon about this topic, and the solution was in 
>>> progress there. But maybe there were some changes from that time? I’m ready 
>>> to test it if any. Or, maybe there’s any workaround?
>> 
>> Sorry, we didn't move forward much on that topic since the presentation.
>> There are few unanswered questions around local config database.  Mainly
>> regarding upgrades from cmdline/main db -based configuration to a local
>> config -based.  But I hope we can figure that out in the current release
>> time frame, i.e. before 3.2 release.

Regarding the configuration method… just an idea (I haven’t seen this variant 
listed among the possible options).
Remote add/remove is already possible via the ovsdb-server ctl socket. Could introducing a 
new command
"ovsdb-server/set-remote-param PARAM=VALUE" be a solution here?

>> 
>> There is also this workaround:
>>  
>> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
>> It simply takes the server->relay inactivity probe value and applies it
>> to the relay->server connection.  But it's not a correct solution, because
>> it relies on certain database names.
>> 
>> Out of curiosity, what kind of poll intervals you see on your main server
>> setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
>> solve some of these issues?  3.1 

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Odintsov Vladislav via discuss


On 7 Mar 2023, at 18:01, Vladislav Odintsov via discuss 
 wrote:

Thanks Ilya for the quick and detailed response!

On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
 wrote:

On 3/7/23 00:15, Vladislav Odintsov wrote:
Hi Ilya,

I’m wondering whether there are possible configuration parameters for ovsdb 
relay -> main ovsdb server inactivity probe timer.
My cluster experiencing issues where relay disconnects from main cluster due to 
5 sec. inactivity probe timeout.
Main cluster has quite big database and a bunch of daemons, which connects to 
it and it makes difficult to maintain connections in time.

For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
probe and rbac configuration for ovn-controllers).
For ovsdb-server, which serves SB, I just set --remote=pssl:.

I’d like to configure remote for ovsdb cluster via DB to set inactivity probe 
setting, but I’m not sure about the correct way for that.

For now I see only two options:
1. Setup custom database scheme with connection table, serve it in same SB 
cluster and specify this connection when start ovsdb sb server.

There is a ovsdb/local-config.ovsschema shipped with OVS that can be
used for that purpose.  But you'll need to craft transactions for it
manually with ovsdb-client.

There is a control tool prepared by Terry:
 
https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

Thanks for pointing on a patch, I guess, I’ll test it out.


But it's not in the repo yet (I need to get back to reviews on that
topic at some point).  The tool itself should be fine, but maybe name
will change.

Am I right that in-DB remote configuration must be a hosted by this 
ovsdb-server database?
What is the best way to configure additional DB on ovsdb-server so that this 
configuration to be permanent?
Also, am I understand correctly that there is no necessity for this DB to be 
clustered?


2. Setup second connection in ovn sb database to be used for ovsdb cluster and 
deploy cluster separately from ovsdb relay, because they both start same 
connections and conflict on ports. (I don’t use docker here, so I need a 
separate server for that).

That's an easy option available right now, true.  If they are deployed
on different nodes, you may even use the same connection record.


Anyway, if I configure ovsdb remote for ovsdb cluster with specified inactivity 
probe (say, to 60k), I guess it’s still not enough to have ovsdb pings every 60 
seconds. Inactivity probe must be the same from both ends - right? From the 
ovsdb relay process.

Inactivity probes don't need to be the same.  They are separate for each
side of a connection and so configured separately.

You can set up inactivity probe for the server side of the connection via
database.  So, server will probe the relay every 60 seconds, but today
it's not possible to set inactivity probe for the relay-to-server direction.
So, relay will probe the server every 5 seconds.

The way out from this situation is to allow configuration of relays via
database as well, e.g. relay:db:Local_Config,Config,relays.  This will
require addition of a new table to the Local_Config database and allowing
relay config to be parsed from the database in the code.  That wasn't
implemented yet.

I saw your talk on last ovscon about this topic, and the solution was in 
progress there. But maybe there were some changes from that time? I’m ready to 
test it if any. Or, maybe there’s any workaround?

Sorry, we didn't move forward much on that topic since the presentation.
There are few unanswered questions around local config database.  Mainly
regarding upgrades from cmdline/main db -based configuration to a local
config -based.  But I hope we can figure that out in the current release
time frame, i.e. before 3.2 release.

Regarding configuration method… Just like an idea (I haven’t seen this variant 
as one of possible).
Remote add/remove is possible via ovsdb-server ctl socket. Could introducing 
new command
"ovsdb-server/set-remote-param PARAM=VALUE" be a solution here?


There is also this workaround:
 
https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
It simply takes the server->relay inactivity probe value and applies it
to the relay->server connection.  But it's not a correct solution, because
it relies on certain database names.

Out of curiosity, what kind of poll intervals you see on your main server
setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
solve some of these issues?  3.1 should be noticeably faster than 2.17,
and also parallel compaction introduced in 3.0 removes one of the big
reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
should also help with database sizes.

We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Vladislav Odintsov via discuss
Thanks Ilya for the quick and detailed response!

> On 7 Mar 2023, at 14:03, Ilya Maximets via discuss 
>  wrote:
> 
> On 3/7/23 00:15, Vladislav Odintsov wrote:
>> Hi Ilya,
>> 
>> I’m wondering whether there are possible configuration parameters for ovsdb 
>> relay -> main ovsdb server inactivity probe timer.
>> My cluster experiencing issues where relay disconnects from main cluster due 
>> to 5 sec. inactivity probe timeout.
>> Main cluster has quite big database and a bunch of daemons, which connects 
>> to it and it makes difficult to maintain connections in time.
>> 
>> For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
>> probe and rbac configuration for ovn-controllers).
>> For ovsdb-server, which serves SB, I just set --remote=pssl:.
>> 
>> I’d like to configure remote for ovsdb cluster via DB to set inactivity 
>> probe setting, but I’m not sure about the correct way for that.
>> 
>> For now I see only two options:
>> 1. Setup custom database scheme with connection table, serve it in same SB 
>> cluster and specify this connection when start ovsdb sb server.
> 
> There is a ovsdb/local-config.ovsschema shipped with OVS that can be
> used for that purpose.  But you'll need to craft transactions for it
> manually with ovsdb-client.
> 
> There is a control tool prepared by Terry:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

Thanks for pointing to the patch; I guess I’ll test it out.

> 
> But it's not in the repo yet (I need to get back to reviews on that
> topic at some point).  The tool itself should be fine, but maybe name
> will change.

Am I right that the in-DB remote configuration must be hosted by a database 
served by this ovsdb-server?
What is the best way to configure an additional DB on ovsdb-server so that this 
configuration is permanent?
Also, do I understand correctly that there is no necessity for this DB to be 
clustered?

> 
>> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
>> and deploy cluster separately from ovsdb relay, because they both start same 
>> connections and conflict on ports. (I don’t use docker here, so I need a 
>> separate server for that).
> 
> That's an easy option available right now, true.  If they are deployed
> on different nodes, you may even use the same connection record.
> 
>> 
>> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
>> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
>> pings every 60 seconds. Inactivity probe must be the same from both ends - 
>> right? From the ovsdb relay process.
> 
> Inactivity probes don't need to be the same.  They are separate for each
> side of a connection and so configured separately.
> 
> You can set up inactivity probe for the server side of the connection via
> database.  So, server will probe the relay every 60 seconds, but today
> it's not possible to set inactivity probe for the relay-to-server direction.
> So, relay will probe the server every 5 seconds.
> 
> The way out from this situation is to allow configuration of relays via
> database as well, e.g. relay:db:Local_Config,Config,relays.  This will
> require addition of a new table to the Local_Config database and allowing
> relay config to be parsed from the database in the code.  That wasn't
> implemented yet.
> 
>> I saw your talk on last ovscon about this topic, and the solution was in 
>> progress there. But maybe there were some changes from that time? I’m ready 
>> to test it if any. Or, maybe there’s any workaround?
> 
> Sorry, we didn't move forward much on that topic since the presentation.
> There are few unanswered questions around local config database.  Mainly
> regarding upgrades from cmdline/main db -based configuration to a local
> config -based.  But I hope we can figure that out in the current release
> time frame, i.e. before 3.2 release.
> 
> There is also this workaround:
>  
> https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
> It simply takes the server->relay inactivity probe value and applies it
> to the relay->server connection.  But it's not a correct solution, because
> it relies on certain database names.
> 
> Out of curiosity, what kind of poll intervals you see on your main server
> setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
> solve some of these issues?  3.1 should be noticeably faster than 2.17,
> and also parallel compaction introduced in 3.0 removes one of the big
> reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
> should also help with database sizes.

We see failures on the OVSDB Relay side:

2023-03-06T22:19:32.966Z|00099|reconnect|ERR|ssl:xxx:16642: no response to 
inactivity probe after 5 seconds, disconnecting
2023-03-06T22:19:32.966Z|00100|reconnect|INFO|ssl:xxx:16642: connection dropped

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-07 Thread Ilya Maximets via discuss
On 3/7/23 00:15, Vladislav Odintsov wrote:
> Hi Ilya,
> 
> I’m wondering whether there are possible configuration parameters for ovsdb 
> relay -> main ovsdb server inactivity probe timer.
> My cluster experiencing issues where relay disconnects from main cluster due 
> to 5 sec. inactivity probe timeout.
> Main cluster has quite big database and a bunch of daemons, which connects to 
> it and it makes difficult to maintain connections in time.
> 
> For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
> probe and rbac configuration for ovn-controllers).
> For ovsdb-server, which serves SB, I just set --remote=pssl:.
> 
> I’d like to configure remote for ovsdb cluster via DB to set inactivity probe 
> setting, but I’m not sure about the correct way for that.
> 
> For now I see only two options:
> 1. Setup custom database scheme with connection table, serve it in same SB 
> cluster and specify this connection when start ovsdb sb server.

There is an ovsdb/local-config.ovsschema shipped with OVS that can be
used for that purpose.  But you'll need to craft transactions for it
manually with ovsdb-client.

There is a control tool prepared by Terry:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20220713030250.2634491-1-twil...@redhat.com/

But it's not in the repo yet (I need to get back to reviews on that
topic at some point).  The tool itself should be fine, but maybe name
will change.

> 2. Setup second connection in ovn sb database to be used for ovsdb cluster 
> and deploy cluster separately from ovsdb relay, because they both start same 
> connections and conflict on ports. (I don’t use docker here, so I need a 
> separate server for that).

That's an easy option available right now, true.  If they are deployed
on different nodes, you may even use the same connection record.

> 
> Anyway, if I configure ovsdb remote for ovsdb cluster with specified 
> inactivity probe (say, to 60k), I guess it’s still not enough to have ovsdb 
> pings every 60 seconds. Inactivity probe must be the same from both ends - 
> right? From the ovsdb relay process.

Inactivity probes don't need to be the same.  They are separate for each
side of a connection and so configured separately.

You can set up inactivity probe for the server side of the connection via
database.  So, server will probe the relay every 60 seconds, but today
it's not possible to set inactivity probe for the relay-to-server direction.
So, relay will probe the server every 5 seconds.

The way out from this situation is to allow configuration of relays via
database as well, e.g. relay:db:Local_Config,Config,relays.  This will
require addition of a new table to the Local_Config database and allowing
relay config to be parsed from the database in the code.  That wasn't
implemented yet.
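Purely as an illustration of the proposal above (the exact command form and database file name are assumptions; this was not implemented at the time of this thread), a relay configured from a database might be started roughly like:

  ovsdb-server relay:db:Local_Config,Config,relays /etc/ovn/local-config.db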

> I saw your talk on last ovscon about this topic, and the solution was in 
> progress there. But maybe there were some changes from that time? I’m ready 
> to test it if any. Or, maybe there’s any workaround?

Sorry, we didn't move forward much on that topic since the presentation.
There are a few unanswered questions around the local config database, mainly
regarding upgrades from a cmdline/main-db-based configuration to a
local-config-based one.  But I hope we can figure that out in the current release
time frame, i.e. before the 3.2 release.

There is also this workaround:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/an2a4qcpihpcfukyt1uomqre.1.1641782536691.hmail.wentao@easystack.cn/
It simply takes the server->relay inactivity probe value and applies it
to the relay->server connection.  But it's not a correct solution, because
it relies on certain database names.

Out of curiosity, what kind of poll intervals you see on your main server
setup that triggers inactivity probe failures?  Can upgrade to OVS 3.1
solve some of these issues?  3.1 should be noticeably faster than 2.17,
and also parallel compaction introduced in 3.0 removes one of the big
reasons for large poll intervals.  OVN upgrade to 22.09+ or even 23.03
should also help with database sizes.

Best regards, Ilya Maximets.


Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2023-03-06 Thread Vladislav Odintsov via discuss
Hi Ilya,

I’m wondering whether there are any configuration parameters for the ovsdb 
relay -> main ovsdb server inactivity probe timer.
My cluster is experiencing issues where the relay disconnects from the main cluster due to 
the 5 sec. inactivity probe timeout.
The main cluster has a quite big database and a bunch of daemons which connect to 
it, and that makes it difficult to keep the connections alive in time.

For ovsdb relay as a remote I use in-db configuration (to provide inactivity 
probe and rbac configuration for ovn-controllers).
For ovsdb-server, which serves SB, I just set --remote=pssl:.

I’d like to configure remote for ovsdb cluster via DB to set inactivity probe 
setting, but I’m not sure about the correct way for that.

For now I see only two options:
1. Set up a custom database schema with a connection table, serve it in the same SB 
cluster and specify this connection when starting the ovsdb SB server.
2. Set up a second connection in the OVN SB database to be used for the ovsdb cluster and 
deploy the cluster separately from the ovsdb relay, because they would both open the same 
connections and conflict on ports. (I don’t use docker here, so I need a 
separate server for that.)

Anyway, if I configure the ovsdb remote for the ovsdb cluster with a specified inactivity 
probe (say, 60k), I guess it’s still not enough to have ovsdb pings every 60 
seconds. The inactivity probe must be the same on both ends - right? I.e. also from the 
ovsdb relay process.
I saw your talk at the last ovscon about this topic, and the solution was in 
progress there. But maybe there have been some changes since then? I’m ready to 
test it, if any. Or maybe there’s a workaround?

Regards,
Vladislav Odintsov



Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2021-08-25 Thread 贾文涛

Hi Ilya Maximets, thanks for your reply!


I am running an OVN large-scale test with 1000 sandboxes (that is, 1000 
ovn-controllers), 3 clustered NB, 3 NB relays, 3 clustered SB, and 20 SB relays.
Connection flow: neutron-server <--> nb-relay <--> nb <--> northd <--> 
sb <--> sb-relay <--> ovn-controller
The default 5-second probe interval causes connection flapping during large 
transaction handling, DB log compression, ...


The ovsdb relay server has two kinds of connections: an active connection and a passive 
connection. The active connection, acting as an ovsdb client, connects to the clustered ovsdb 
server, and the passive connection listens for other clients connecting to the relay itself.
I configured these two kinds of connections in the NB:
  active connection: "tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641"
  passive connection: "ptcp:6641:0.0.0.0"
Can't the relay server share the same connection configuration with the clustered 
ovsdb-server?


Having another small database with a relay configuration does not look like a good way to me.
An example: ovn-northd has no database of its own; its probe interval is read from the NB and 
configured like this: ovn-nbctl set NB_Global . options:northd_probe_interval=6.
Can the relay server read the probe interval from the NB or SB?  If the relay server's probe 
interval cannot be read from the NB or SB, an appctl command can be considered, because it 
can reconfigure without a restart.
Best regards, Wentao Jia
The following is the configuration of my test:
Clustered ovsdb server:
ovsdb-server -vconsole:info -vsyslog:off -vfile:off \
    --log-file=/var/log/ovn/ovsdb-server-nb.log \
    --remote=punix:/var/run/ovn/ovnnb_db.sock \
    --pidfile=/var/run/ovn/ovnnb_db.pid \
    --unixctl=/var/run/ovn/ovnnb_db.ctl \
    --remote=db:OVN_Northbound,NB_Global,connections \
    --private-key=db:OVN_Northbound,SSL,private_key \
    --certificate=db:OVN_Northbound,SSL,certificate \
    --ca-cert=db:OVN_Northbound,SSL,ca_cert \
    --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols \
    --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers \
    /etc/ovn/ovnnb_db.db


ovsdb relay server:
ovsdb-server --remote=db:OVN_Northbound,NB_Global,connections \
    -vconsole:info -vsyslog:off -vfile:off \
    --log-file=/var/log/ovn/ovsdb-server-nb.log \
    relay:OVN_Northbound:tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641



Connection configuration: one active connection and one passive connection
()[root@ovn-busybox-0 /]# ovn-nbctl list connection
_uuid   : 5ddab5a4-a267-42b4-9dd4-76d55855a109
external_ids: {}
inactivity_probe: 12
is_connected: true
max_backoff : []
other_config: {}
status  : {sec_since_connect="143208", state=ACTIVE}
target  : "tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641"


_uuid   : 351b99bb-dd6a-4ba3-9c30-c0b4cff183e7
external_ids: {}
inactivity_probe: 0
is_connected: true
max_backoff : []
other_config: {}
status  : {bound_port="6641", sec_since_connect="0", 
sec_since_disconnect="0"}
target  : "ptcp:6641:0.0.0.0"






From: Ilya Maximets 
Date: 2021-08-26 02:38:58
To: ovs-discuss@openvswitch.org, "贾文涛" 
Cc: i.maxim...@ovn.org
Subject: Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work
>> hi,all
>> 
>> 
>>  the default inactivity probe interval of ovsdb relay server to nb/sb ovsdb 
>> server is 5000ms.
>>  I set an active connection as follow,set inactivity probe interval to 
>> 12ms :
>> _uuid   : 5ddab5a4-a267-42b4-9dd4-76d55855a109
>> external_ids: {}
>> inactivity_probe: 12
>> is_connected: true
>> max_backoff : []
>> other_config: {}
>> status  : {sec_since_connect="0", state=ACTIVE}
>> target  : "tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641"
>
>
>Hmm.  How exactly did you configure that?
>
>> 
>> ovn-ovsdb-nb.openstack.svc.cluster.local is a vip 
>> but the inactivity probe is still 5000> 
>> 2021-08-24T12:34:17.313Z|04924|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>>  idle 120225 ms, sending inactivity probe
>> 2021-08-24T12:36:17.759Z|05854|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>>  idle 120446 ms, sending inactivity probe
>> 2021-08-24T12:37:06.326Z|06145|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>>  idle 6853 ms, sending inactivity probe
>> 2021-08-24T12:37:11.330Z|06155|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>>  idle 5004 ms, sending inactivity probe
>
>This looks like you have 2 different connections.  One with 5000 and
>one with 12 inactivity probe interval.
>
>I suspect that relay server is started something like this:
>
>ovsdb-server ... --remo

Re: [ovs-discuss] ovsdb relay server active connection probe interval do not work

2021-08-25 Thread Ilya Maximets
> hi,all
> 
> 
>  the default inactivity probe interval of ovsdb relay server to nb/sb ovsdb 
> server is 5000ms.
>  I set an active connection as follow,set inactivity probe interval to 
> 12ms :
> _uuid   : 5ddab5a4-a267-42b4-9dd4-76d55855a109
> external_ids: {}
> inactivity_probe: 12
> is_connected: true
> max_backoff : []
> other_config: {}
> status  : {sec_since_connect="0", state=ACTIVE}
> target  : "tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641"


Hmm.  How exactly did you configure that?

> 
> ovn-ovsdb-nb.openstack.svc.cluster.local is a vip 
> but the inactivity probe is still 5000> 
> 2021-08-24T12:34:17.313Z|04924|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>  idle 120225 ms, sending inactivity probe
> 2021-08-24T12:36:17.759Z|05854|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>  idle 120446 ms, sending inactivity probe
> 2021-08-24T12:37:06.326Z|06145|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>  idle 6853 ms, sending inactivity probe
> 2021-08-24T12:37:11.330Z|06155|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
>  idle 5004 ms, sending inactivity probe

This looks like you have 2 different connections.  One with 5000 and
one with 12 inactivity probe interval.

I suspect that relay server is started something like this:

ovsdb-server ... --remote=db:OVN_Northbound,NB_Global,connections \
   relay:OVN_Northbound:tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641

And the connection shown above is configured in this 'connections' row, right?

Connections configured with '--remote' are not the same as 'relay' connections.
So, in this case ovsdb-server will create a relay with the remote specified
in the 'relay:' part and with the default inactivity probe interval.  And it will
open a connection to whatever is specified in the database row pointed to by '--remote',
with the configured values for that connection.  It will expect a client on the
other side of that connection.  So, this connection will connect the main server
with the relay, but they both will just wait for database queries from each other.

Configuring things this way you will also, probably, have a self-connection
from the main server to itself, right?
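One possible way to avoid those extra client-style connections (a sketch only; the log-file path is an assumption, and whether the active connection row is needed at all depends on the deployment) is to give the relay its listener directly and let the 'relay:' argument be the only link to the main server:

  ovsdb-server -vconsole:info -vsyslog:off -vfile:off \
      --log-file=/var/log/ovn/ovsdb-server-nb-relay.log \
      --remote=ptcp:6641 \
      relay:OVN_Northbound:tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641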


In general, currently, there is no way to configure inactivity probe interval
for "relay" --> "main server" connection, you can only configure it in the
opposite direction.
Does default inactivity interval cause problems for your setup?

I have a plan to implement that, though.  There are several options for how to do 
that:

1. Add a simple cmdline argument like '--relay-inactivity-probe=N' that will
   affect all the relay databases on this ovsdb-server process.

   Pros: Simple
   Cons: Affects all relay databases of this process, change requires restart,
 configuration applied to a single process.

2. appctl command that can be executed against relay server, e.g.
 ovs-appctl ovsdb-server/relay-set-inactivity-probe OVN_Northbound 12

   Pros: Simple
   Cons: Doesn't survive restart, configuration applied to a single process.

3. Add more configuration options to the 'relay:' syntax, e.g.:
 relay:inactivity-probe=12:OVN_Northbound:tcp:127.0.0.1:6641

   Pros: Simple
   Cons: Doesn't look like a good API.

4. Have a separate small database with a relay configuration, e.g.

 ovsdb-server ... relay:db:OVSDB_Relay,Relay,relays relay.db

   And a small tool to interact with this local database:

 ovs-relayctl add-relay OVN_Northbound \
  tcp:127.0.0.1:6641 inactivity-probe=12

   This will add a new relay configuration to the OVSDB_Relay database
   and ovsdb-server will start relaying it.

   Pros: Lots of things can be configured including inactivity probes and
 backoff.  Can be extended with relay specific configs in the future.
 Survives restart.  relay.db can be relayed from a separate 
ovsdb-server,
 if needed, so there is no need to configure each relay separately.
   Cons: A bit more complex implementation.

   Example of a complex setup would be:
# start a main database server
 a. ovsdb-server --remote=db:OVN_Northbound,NB_Global,connections ovnnb.db

# start a small database server that only holds relay.db
 b. ovsdb-server --remote=pssl:6647:server relay.db
ovs-relayctl add-relay OVN_Northbound tcp:your-server:6641 
inactivity-probe=12

# start a relay server that relays OVSDB_Relay and relays everything
# that configured in this db.  If OVSDB_Relay db has configured
# OVN_Northbound db, start accepting connections on remotes configured 
there.
 c. ovsdb-server --remote=db:OVN_Northbound,NB_Global,connections   \
  relay:db:OVSDB_Relay,Relay,relays \
  relay:OVSDB_Relay:ssl:server:6647

 Once this started, server 'c' will connect to server 'b' and get the 
OVSDB_Relay
  

[ovs-discuss] ovsdb relay server active connection probe interval do not work

2021-08-24 Thread 贾文涛
Hi all,


The default inactivity probe interval from the ovsdb relay server to the NB/SB ovsdb 
server is 5000 ms.
I set an active connection as follows, with the inactivity probe interval set to 12 ms:
_uuid   : 5ddab5a4-a267-42b4-9dd4-76d55855a109
external_ids: {}
inactivity_probe: 12
is_connected: true
max_backoff : []
other_config: {}
status  : {sec_since_connect="0", state=ACTIVE}
target  : "tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641"



ovn-ovsdb-nb.openstack.svc.cluster.local is a VIP.


But the inactivity probe is still 5000.
The following is the log of the ovsdb relay server:
2021-08-24T12:34:17.313Z|04924|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 120225 ms, sending inactivity probe
2021-08-24T12:36:17.759Z|05854|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 120446 ms, sending inactivity probe
2021-08-24T12:37:06.326Z|06145|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 6853 ms, sending inactivity probe
2021-08-24T12:37:11.330Z|06155|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:37:16.334Z|06165|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:37:21.339Z|06175|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5005 ms, sending inactivity probe
2021-08-24T12:37:33.850Z|06226|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 6681 ms, sending inactivity probe
2021-08-24T12:37:38.855Z|06236|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:37:43.859Z|06246|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:37:48.864Z|06256|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:37:53.870Z|06266|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5006 ms, sending inactivity probe
2021-08-24T12:37:58.876Z|06276|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5006 ms, sending inactivity probe
2021-08-24T12:38:08.882Z|06293|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 6299 ms, sending inactivity probe
2021-08-24T12:38:13.887Z|06303|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:38:18.890Z|06313|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 121131 ms, sending inactivity probe
2021-08-24T12:38:18.891Z|06316|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:38:23.895Z|06330|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:38:28.901Z|06340|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5005 ms, sending inactivity probe
2021-08-24T12:38:33.905Z|06350|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:38:38.909Z|06360|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:38:43.913Z|06370|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:38:48.922Z|06380|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5009 ms, sending inactivity probe
2021-08-24T12:38:53.926Z|06390|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:38:58.930Z|06400|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:39:03.934Z|06410|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5003 ms, sending inactivity probe
2021-08-24T12:39:08.938Z|06420|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:39:13.941Z|06430|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5002 ms, sending inactivity probe
2021-08-24T12:39:18.946Z|06440|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:39:23.951Z|06452|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5005 ms, sending inactivity probe
2021-08-24T12:39:28.956Z|06462|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5004 ms, sending inactivity probe
2021-08-24T12:39:33.962Z|06472|reconnect|DBG|tcp:ovn-ovsdb-nb.openstack.svc.cluster.local:6641:
 idle 5006 ms, sending inactivity probe



Best regards, Wentao Jia



