Re: [ovs-discuss] ovn-controller stranger behaviour
Great news! Thank you, Numan!

Do you think the Canonical guys can backport this commit to the Ubuntu Cloud 'Xena' repository and release a package update?

Thanks,
Roberto

On Mon, Jun 27, 2022 at 17:33, Numan Siddique wrote:

> On Mon, Jun 27, 2022 at 1:35 PM Numan Siddique wrote:
> > Actually it's already backported as far back as branch-21.12. I'll
> > backport to branch-21.09 once the CI here passes -
> > https://github.com/numansiddique/ovn/runs/7077845291?check_suite_focus=true
>
> Backport to branch-21.09 done.
>
> Thanks
> Numan
Re: [ovs-discuss] ovn-controller stranger behaviour
On Mon, Jun 27, 2022 at 1:35 PM Numan Siddique wrote:
> Actually it's already backported as far back as branch-21.12. I'll
> backport to branch-21.09 once the CI here passes -
> https://github.com/numansiddique/ovn/runs/7077845291?check_suite_focus=true

Backport to branch-21.09 done.

Thanks
Numan
Re: [ovs-discuss] ovn-controller stranger behaviour
On Mon, Jun 27, 2022 at 12:25 PM Numan Siddique wrote:
> On Mon, Jun 27, 2022 at 3:56 AM Dumitru Ceara wrote:
> > Numan, to avoid using ovn-monitor-all=false as a workaround in this case,
> > do you think we can port
> > https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0
> > to branch-21.09 and see if Xena can pick it up?
>
> Given that this commit fixes a bug, I think we can backport it to
> older branches. It looks like it's not backported to branch-22.03
> either.

Actually it's already backported as far back as branch-21.12. I'll backport to branch-21.09 once the CI here passes -
https://github.com/numansiddique/ovn/runs/7077845291?check_suite_focus=true

Thanks
Numan

> I'll see if it can be backported easily.
>
> Numan
Re: [ovs-discuss] ovn-controller stranger behaviour
On Mon, Jun 27, 2022 at 3:56 AM Dumitru Ceara wrote:
> Numan, to avoid using ovn-monitor-all=false as a workaround in this case,
> do you think we can port
> https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0
> to branch-21.09 and see if Xena can pick it up?

Given that this commit fixes a bug, I think we can backport it to older branches. It looks like it's not backported to branch-22.03 either.

I'll see if it can be backported easily.

Numan
Re: [ovs-discuss] ovn-controller stranger behaviour
On 6/24/22 21:50, Numan Siddique wrote:
> On Fri, Jun 24, 2022 at 11:53 AM ROBERTO BARTZEN ACOSTA via discuss
> <ovs-discuss@openvswitch.org> wrote:
> > Hi Dumitru,

Hi Roberto,

> > I would like to better understand the computational advantages of the
> > ovn-controller conditional replication clauses, and the risks of not
> > enabling this parameter in a large-scale solution.
>
> On large-scale deployments, our testing has shown that
> ovn-monitor-all=false puts a significant amount of CPU load on the
> Southbound ovsdb-server, as it has to conditionally send the data to
> each ovn-controller. Hence we added the option ovn-monitor-all=true.
> The drawback is that an ovn-controller with ovn-monitor-all=true will
> get all the DB updates. But this is still better than a slow
> Southbound ovsdb-server.

Another potential drawback of ovn-monitor-all=true is additional network traffic (all clients need to get all updates). But, as Numan mentioned above, in all our scale testing (both for OpenStack and OpenShift) the impact of this seems to be less significant than the impact on the Southbound when ovn-monitor-all=false.

Numan, to avoid using ovn-monitor-all=false as a workaround in this case, do you think we can port
https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0
to branch-21.09 and see if Xena can pick it up?

Thanks,
Dumitru
Re: [ovs-discuss] ovn-controller stranger behaviour
On Fri, Jun 24, 2022 at 11:53 AM ROBERTO BARTZEN ACOSTA via discuss <ovs-discuss@openvswitch.org> wrote:
> I would like to better understand the computational advantages of the
> ovn-controller conditional replication clauses, and the risks of not
> enabling this parameter in a large-scale solution.

On large-scale deployments, our testing has shown that ovn-monitor-all=false puts a significant amount of CPU load on the Southbound ovsdb-server, as it has to conditionally send the data to each ovn-controller. Hence we added the option ovn-monitor-all=true. The drawback is that an ovn-controller with ovn-monitor-all=true will get all the DB updates. But this is still better than a slow Southbound ovsdb-server.

Thanks
Numan
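For anyone who wants to switch the option discussed above on a single chassis, here is a minimal sketch. It assumes the option is read from `external_ids:ovn-monitor-all` in the local Open_vSwitch table, as described in ovn-controller(8). The `set_monitor_all` helper and the `OVS_VSCTL` override are illustrative names only, not part of OVN or OVS.

```shell
# OVS_VSCTL can be overridden (for example with "echo") for a dry run.
# Hypothetical variable, used only by this sketch.
: "${OVS_VSCTL:=ovs-vsctl}"

# set_monitor_all true|false
# Writes the ovn-monitor-all option into the external_ids column of the
# local Open_vSwitch table, where ovn-controller is assumed to read it.
set_monitor_all() {
    case "$1" in
        true|false)
            $OVS_VSCTL set open_vswitch . "external_ids:ovn-monitor-all=$1"
            ;;
        *)
            echo "usage: set_monitor_all true|false" >&2
            return 2
            ;;
    esac
}
```

Calling `set_monitor_all false` on a compute node would apply the workaround from this thread; `set_monitor_all true` would restore the setting preferred for large deployments.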
Re: [ovs-discuss] ovn-controller stranger behaviour
Hi Dumitru,

I also think this issue is related to ovn-monitor-all=true, but I'm not sure about the CPU usage consequences of disabling it.

In OVSDB, changes are tracked and applied to each client in the IDL layer. OVSDB_IDL_MONITOR is set by default, so the IDL replicates the changes in the database. Therefore, an OVSDB-IDL transaction modifies the contents of a database and the client requests information about the incremental changes.

ovn-monitor-all sets the condition clause to true in the update_sb_monitors function and enables the monitor for many kinds of database information, such as: Port_Bindings rows for local interfaces and local datapaths; the Logical_Flow, MAC_Binding, Multicast_Group, and DNS tables for local datapaths; Controller_Event rows for the local chassis; etc.

Using these condition clauses allows ovn-monitor-all to replicate only when specific conditions are met. However, the default behavior is different: when the IDL replicates a particular table in the database, it replicates every row in the table.

I would like to better understand the computational advantages of the ovn-controller conditional replication clauses, and the risks of not enabling this parameter in a large-scale solution.

Best regards,
Roberto

On Fri, Jun 24, 2022 at 12:13, Tiago Pires wrote:

> Hi Dumitru,
>
> I ran a test, and setting ovn-monitor-all to false solves this
> behaviour.
> It seems the option I have now is to use it as a workaround until I am
> able to upgrade to Yoga, which has OVN 22.03.
>
> Thank you for your help.
>
> Regards,
>
> Tiago Pires
>
> On Fri, Jun 24, 2022 at 04:27, Dumitru Ceara wrote:
>
> > On 6/23/22 22:23, Tiago Pires wrote:
> > > Hi all,
> >
> > Hi Tiago,
> >
> > > I did some troubleshooting and I always see this error (ovs-vswitchd)
> > > when a VM is created on a chassis:
> > > 2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device tap8a43df0c-fd (No such device)
> > > 2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface tap8a43df0c-fd on port 51
> > > 2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface tap3200bf1c-20 on port 52
> > > 2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430 flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)
> >
> > It doesn't look to me like there's anything to worry about from these
> > logs.
> >
> > > This commit
> > > http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/
> > > solved something similar to my issue. It seems ovs-vswitchd is
> > > missing some flows, and when I run the recompute it fixes it.
> > > So, to avoid this issue, I'm currently testing running the recompute
> > > through a libvirt hook when a VM gets "started" status.
> >
> > While this might "fix" the issue it's not really ideal. ovn-controller
> > should properly install the flows all the time. Otherwise it's a bug.
> >
> > > Regards,
> > >
> > > Tiago Pires
> > >
> > > On Wed, Jun 22, 2022 at 19:43, Tiago Pires wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm trying to understand a strange behaviour of ovn-controller.
> > > > In my setup I have OVN 21.09 / OVS 2.16 and Xena, and sometimes when
> > > > a new VM is created, this VM can reach other VMs in east-west
> > > > traffic (even on different chassis) but it can't reach an external
> > > > network (e.g. the Internet) through the gateway chassis.
> > > > I ran the following trace:
> > > > # ovs-appctl ofproto/trace br-int in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64
> > > >
> > > > And I got this output:
> > > > Final flow: recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
> > > > Megaflow: recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
> > > > Datapath actions: ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))
> > > > It seems the datapath is querying the controller, and I did not
> > > > understand the reason.
> > > >
> > > > So, I did an ovn-controller recompute (ovn-appctl -t ovn-controller
> > > > recompute) on the chassis where the VM is placed to check if it
> > > > could change the behaviour, and I could trace the packet
> > > > successfully and the VM started to communicate with the Internet
> > > > normally:
Re: [ovs-discuss] ovn-controller stranger behaviour
Hi Dumitru,

I ran a test, and setting ovn-monitor-all to false resolves this behaviour. It seems the option I have for now is to use it as a workaround until I am in a position to upgrade to Yoga, which has OVN 22.03.

Thank you for your help.

Regards,

Tiago Pires
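A minimal sketch of how the ovn-monitor-all workaround could be applied on a chassis. It assumes the option lives in the external_ids column of the local Open_vSwitch table, like the ovn-monitor-all="true" setting quoted in this thread. DRY_RUN, run and set_monitor_all are illustrative names, not part of OVN; by default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: toggle ovn-monitor-all on the local chassis.
# DRY_RUN, run and set_monitor_all are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 is "true" or "false".
set_monitor_all() {
    run ovs-vsctl set open_vswitch . "external_ids:ovn-monitor-all=$1"
}

set_monitor_all false
```

Running it with DRY_RUN=0 on the chassis would execute the ovs-vsctl command instead of printing it.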
Re: [ovs-discuss] ovn-controller stranger behaviour
Hi Tiago,

Thanks for reporting the problem. It seems you can easily reproduce the problem, right? If so, could you enable debug logging for ovn-controller before triggering the recompute, so that we can see from the logs of the ofctrl module which flows are added during the recompute?

Thanks,
Han

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
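One way the requested capture might be done, as a sketch: raise the ofctrl module to debug level in the log file, trigger the recompute command used earlier in the thread, then drop the level back to info. It assumes the standard vlog/set control command is available on the ovn-controller target; DRY_RUN, run and capture_recompute are hypothetical names, and where the log file lives depends on the packaging.

```shell
#!/bin/sh
# Sketch only: debug-log the ofctrl module around a recompute.
# DRY_RUN, run and capture_recompute are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

capture_recompute() {
    # Raise only the ofctrl module, and only for the log file.
    run ovn-appctl -t ovn-controller vlog/set ofctrl:file:dbg
    # Same recompute command as used earlier in this thread.
    run ovn-appctl -t ovn-controller recompute
    # Return to the usual level so the log does not keep growing.
    run ovn-appctl -t ovn-controller vlog/set ofctrl:file:info
}

capture_recompute
```

The flows added during the recompute should then appear in the ovn-controller log between the two vlog/set calls.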
Re: [ovs-discuss] ovn-controller stranger behaviour
On 6/23/22 22:23, Tiago Pires wrote:
> Hi all,
>

Hi Tiago,

> I did some troubleshooting and I'm seeing this error (ovs-vswitchd) always
> when a VM is created in a Chassi:
> 2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device tap8a43df0c-fd (No such device)
> 2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface tap8a43df0c-fd on port 51
> 2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface tap3200bf1c-20 on port 52
> 2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430 flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)
>

It doesn't look to me like there's anything to worry about from these logs.

> On this commit
> http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/
> it solved something similar to my issue. It seems the ovs-vswitchd is
> missing some flows and when I run the recompute it fixes it.
> So, to avoid this issue I'm testing at this moment to run the recompute
> through libvirt hook when a VM gets "started" status.
>

While this might "fix" the issue it's not really ideal. ovn-controller should properly install the flows all the time. Otherwise it's a bug.

> Regards,
>
> Tiago Pires
>
>
> On Wed, Jun 22, 2022 at 19:43, Tiago Pires wrote:
>
>> Hi all,
>>
>> I'm trying to understand a stranger's behaviour regarding to
>> ovn-controller.
>> In my setup I have OVN 21.09/ OVS 2.16 and Xena and sometimes when a new
>> VM is created, this VM can reach other VMs in east-west traffic (even in
>> differents Chassis) but it can't reach an external network (e.g. Internet)
>> through Chassi Gateway.
>> I ran the following trace:
>> # ovs-appctl ofproto/trace br-int in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64
>>
>> And I got this output:
>> Final flow:
>> recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
>> Megaflow:
>> recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
>> Datapath actions:
>> ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))
>> It seems the Datapath is querying the controller and I did not understand
>> the reason.
>>
>> So, I did an ovn-controller recompute (ovn-appctl -t ovn-controller
>> recompute) on the Chassi where the VM is placed to check if it could change
>> the behaviour and I could trace the packet with success and the VM started
>> to communicate with the Internet normally:
>> Final flow:
>> recirc_id=0x2,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
>> Megaflow:
>> recirc_id=0x2,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,tun_id=0/0xff,tun_metadata0=NP,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ecn=0,nw_ttl=64,nw_frag=no
>> Datapath actions:
>> ct(commit,zone=15,label=0/0x1,nat(src)),set(tunnel(tun_id=0x2a,dst=10.X6.X3.133,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(df|csum|key))),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:5e:00:04:00)),set(ipv4(ttl=63)),2
>> The Datapath action is using the tunnel with the Chassi Gateway.
>>
>> It happens always with new VMs but sometimes. After running the recompute
>> on the Chassi, I created additional VMs and this issue did not happen.
>>
>> In my Chassi I have enable these parameters also:
>> ovn-monitor-all="true"
>> ovn-openflow-probe-interval="0"
>> ovn-remote-probe-interval="18"
>>
>> Do you know this behaviour could be bug related?

This is most definitely a bug. Very likely it's the bug that was fixed in this commit:

https://github.com/ovn-org/ovn/commit/0a4e073f4124b58f1b21778ec2293bbc4180e3e0

The change is available in the 21.12 stable branch and later. So you need to upgrade the OVN version in your OpenStack deployment to something that includes it.

Hope this helps.

Regards,
Dumitru

>>
>> Tiago Pires
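A rough sketch for checking whether an installed OVN release is old enough that it certainly predates the 21.12 branch mentioned above. It is only a coarse filter: a version at or after 21.12 does not by itself prove that a given package contains the commit. version_ge and branch_status are hypothetical helper names, and the comparison relies on GNU sort -V being available.

```shell
#!/bin/sh
# Sketch only: coarse version comparison, assuming GNU sort with -V.
# version_ge and branch_status are hypothetical helper names.

# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report which side of the 21.12 branch a version string falls on.
branch_status() {
    if version_ge "$1" "21.12.0"; then
        echo "at-or-after-21.12"
    else
        echo "before-21.12"
    fi
}

# Assumption: the installed version would be read from the output of
# "ovn-controller --version"; a fixed value is used here instead.
branch_status "21.09.0"
```

A "before-21.12" result, as for the OVN 21.09 setup described in this thread, means the fix cannot be present unless the distribution backported it.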
Re: [ovs-discuss] ovn-controller stranger behaviour
Hi all,

I did some troubleshooting, and I always see this error (ovs-vswitchd) when a VM is created on a chassis:

2022-06-23T11:47:08.385Z|07907|bridge|WARN|could not open network device tap8a43df0c-fd (No such device)
2022-06-23T11:47:09.282Z|07908|bridge|INFO|bridge br-int: added interface tap8a43df0c-fd on port 51
2022-06-23T11:47:09.645Z|07909|bridge|INFO|bridge br-int: added interface tap3200bf1c-20 on port 52
2022-06-23T11:47:19.329Z|07911|connmgr|INFO|br-int<->unix#1468: 430 flow_mods in the 7 s starting 10 s ago (410 adds, 20 deletes)

This commit solved something similar to my issue:
http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/

It seems that ovs-vswitchd is missing some flows, and running the recompute fixes it. So, to avoid this issue, I'm currently testing running the recompute through a libvirt hook when a VM reaches the "started" status.

Regards,

Tiago Pires
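A sketch of what such a libvirt hook could look like, offered as a stopgap rather than a fix. It assumes the usual libvirt convention that the qemu hook script receives the guest name, the operation, the sub-operation and an extra argument, and it reuses the recompute command from this thread. DRY_RUN, run and on_qemu_event are hypothetical names; with the default DRY_RUN=1 the command is only printed.

```shell
#!/bin/sh
# Sketch only: a qemu hook that recomputes flows after a guest starts.
# DRY_RUN, run and on_qemu_event are hypothetical helper names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments follow the hook convention:
#   <guest_name> <operation> <sub-operation> <extra>
on_qemu_event() {
    operation="${2:-}"
    if [ "$operation" = "started" ]; then
        run ovn-appctl -t ovn-controller recompute
    fi
}

on_qemu_event "$@"
```

Every other operation (prepare, stopped, release and so on) is ignored, so the recompute runs once per guest start.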
[ovs-discuss] ovn-controller stranger behaviour
Hi all,

I'm trying to understand some strange behaviour regarding ovn-controller. In my setup I have OVN 21.09 / OVS 2.16 and Xena, and sometimes when a new VM is created, this VM can reach other VMs in east-west traffic (even on different chassis) but it can't reach an external network (e.g. the Internet) through the gateway chassis.

I ran the following trace:

# ovs-appctl ofproto/trace br-int in_port="93",icmp,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_ttl=64

And I got this output:

Final flow:
recirc_id=0xc157b1,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
Megaflow:
recirc_id=0xc157b1,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ttl=64,nw_frag=no
Datapath actions:
ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))

It seems the datapath is querying the controller, and I did not understand the reason.

So I ran an ovn-controller recompute (ovn-appctl -t ovn-controller recompute) on the chassis where the VM is placed to check whether it would change the behaviour. After that I could trace the packet successfully, and the VM started to communicate with the Internet normally:

Final flow:
recirc_id=0x2,eth,icmp,reg0=0x300,reg11=0xd,reg12=0x10,reg13=0xf,reg14=0x3,reg15=0x2,metadata=0x29,in_port=93,vlan_tci=0x,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.140,nw_dst=8.8.8.8,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
Megaflow:
recirc_id=0x2,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,icmp,tun_id=0/0xff,tun_metadata0=NP,in_port=93,dl_src=fa:16:3e:26:34:ef,dl_dst=fa:16:3e:65:68:6e,nw_src=192.168.40.128/26,nw_dst=8.0.0.0/7,nw_ecn=0,nw_ttl=64,nw_frag=no
Datapath actions:
ct(commit,zone=15,label=0/0x1,nat(src)),set(tunnel(tun_id=0x2a,dst=10.X6.X3.133,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(df|csum|key))),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:5e:00:04:00)),set(ipv4(ttl=63)),2

The datapath action is now using the tunnel to the gateway chassis.

The problem always involves new VMs, but it only happens sometimes. After running the recompute on the chassis, I created additional VMs and the issue did not happen.

On my chassis I have also enabled these parameters:
ovn-monitor-all="true"
ovn-openflow-probe-interval="0"
ovn-remote-probe-interval="18"

Do you know whether this behaviour could be bug-related?

Tiago Pires
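The difference between the two traces can be spotted mechanically from the "Datapath actions" line. The sketch below is a simple string check, not an OVS tool: classify_actions is a hypothetical helper, and the two sample strings are the datapath actions reported in this message before and after the recompute.

```shell
#!/bin/sh
# Sketch only: tell the two "Datapath actions" outcomes apart.
# classify_actions is a hypothetical helper name.
classify_actions() {
    case "$1" in
        *"controller("*)  echo "punted-to-controller" ;;
        *"set(tunnel("*)  echo "tunnelled" ;;
        *)                echo "other" ;;
    esac
}

# Datapath actions copied from the trace taken before the recompute.
before='ct(commit,zone=15,label=0/0x1,nat(src)),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),userspace(pid=3451843211,controller(reason=1,dont_send=1,continuation=0,recirc_id=12670898,rule_cookie=0x3e26215e,controller_id=0,max_len=65535))'

# Datapath actions copied from the trace taken after the recompute.
after='ct(commit,zone=15,label=0/0x1,nat(src)),set(tunnel(tun_id=0x2a,dst=10.X6.X3.133,ttl=64,tp_dst=6081,geneve({class=0x102,type=0x80,len=4,0x30002}),flags(df|csum|key))),set(eth(src=fa:16:3e:ec:7f:dd,dst=00:00:5e:00:04:00)),set(ipv4(ttl=63)),2'

classify_actions "$before"
classify_actions "$after"
```

In the failing case the packet is handed to userspace with a controller action and an all-zero destination MAC, while in the working case it is sent over the Geneve tunnel.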