Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Thu, Apr 12, 2018 at 10:19 AM, Ben Pfaffwrote: > > On Wed, Apr 11, 2018 at 11:29:28PM -0700, Han Zhou wrote: > > On Tue, Apr 10, 2018 at 6:21 PM, Han Zhou wrote: > > > > > > > > > > > > On Tue, Apr 10, 2018 at 5:04 PM, Ben Pfaff wrote: > > > > > > > > On Fri, Apr 06, 2018 at 02:40:21PM -0700, Han Zhou wrote: > > > > > On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaff wrote: > > > > > > > > > > > > Thanks for working on making OVN faster and scale better. > > > > > > > > > > > > I see what you mean about how nb_cfg can be a scale problem. > > Really, > > > > > > each hypervisor only cares about its own row, but each of them is > > being > > > > > > notified about the current value in every row. Ideally, we want > > the HVs > > > > > > to be able to say to ovsdb-server, "Give me all the data in the > > Chassis > > > > > > table, except don't bother with nb_cfg in any rows except my own > > row." > > > > > > > > > > > > Actually, there's a way for ovn-controller to do that: the OVSDB > > > > > > protocol definition of monitor_cond allows the client to specify > > > > > > multiple sets of columns to monitor in a table, and each set of > > columns > > > > > > has an independently attached condition. This can be used to > > specify > > > > > > that most of the columns are monitored unconditionally but nb_cfg is > > > > > > monitored only for a particular row. > > > > > > > > > > > > But there are still problems: > > > > > > > > > > > > 1. The implementation in ovsdb-server looks broken. I don't think > > it > > > > > >implements the spec. It certainly isn't tested--none of the > > existing > > > > > >clients ever passes more than a single set of columns. I think > > that > > > > > >it will work if all the sets of columns use the same condition, > > but > > > > > >that doesn't help with this case. It will need to be fixed. > > > > > > > > > > > > 2. The client implementation in ovsdb-idl doesn't anticipate > > difference > > > > > >conditions for different columns. It will need to be enhanced to > > > > > >support this use. > > > > > > > > > > > > In the short term this is going to be more work than just creating > > a new > > > > > > table. In the long term, though, it will allow us to be more > > flexible > > > > > > and more agile, since improvements similar to the one in this patch > > will > > > > > > require only client changes, not database schema changes. > > > > > > > > > > > > Comments? > > > > > > > > > > Thanks for the valuable information! I didn't even know that ovsdb is > > > > > supposed to support multiple set columns with different monitor > > conditions. > > > > > It surely would be the right way to go if it is supported. However, > > as you > > > > > mentioned the mechanism doesn't work yet and I am not sure how much > > work is > > > > > needed to make it work, it seems reasonable to me to "workaround" it > > at > > > > > least for short term until that is ready in the future. Then we could > > > > > simply resurrect the old column in the old table with the new monitor > > > > > conditions. What do you think? > > > > > > > > This kind of workaround would essentially persist forever. We'd never > > > > be able to get rid of it--or, more to the point, we would never be able > > > > to justify the work. We'd be stuck with something ugly. > > > > > > > > So: if you think the workaround is valuable, then one way to justify it > > > > would be to reformulate it in a way that isn't so ugly. As currently > > > > written, when a person looks at the table structure, (s)he notices a > > > > table that has a single column and is named for that single column. > > > > That sticks out, clearly, as some kind of compromise, and the > > > > documentation even calls it out as an optimization. But there might be > > > > a logical, conceptual reason why it makes sense to have a separate table > > > > for some kinds of chassis-related data. If there is, then that would > > > > suggest a name for the table, and for the class of data that should > > > > should go in there, and it would make it possible to have something that > > > > is both faster and a reasonable design. > > > > > > > > I'm not saying that there is necessarily a logical, conceptual reason > > > > that one can come up here, but if there is then it would certainly make > > > > it much easier to support the patch. > > > > > > Hi Ben, I'm not sure if I got your point completely. A logical reason for > > this separate table might be for holding chassis private data, the data > > that other chassis is not interested in. Of course the reason for holding > > such private data in a separate table is still this huge performance > > difference as it is shown in the test result. So I think the table can be > > named as Chassis_Private_Data, and I can document in ovn-sb.xml more about > > the performance considerations, and the future improvement about the > >
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Wed, Apr 11, 2018 at 11:29:28PM -0700, Han Zhou wrote: > On Tue, Apr 10, 2018 at 6:21 PM, Han Zhouwrote: > > > > > > > > On Tue, Apr 10, 2018 at 5:04 PM, Ben Pfaff wrote: > > > > > > On Fri, Apr 06, 2018 at 02:40:21PM -0700, Han Zhou wrote: > > > > On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaff wrote: > > > > > > > > > > Thanks for working on making OVN faster and scale better. > > > > > > > > > > I see what you mean about how nb_cfg can be a scale problem. > Really, > > > > > each hypervisor only cares about its own row, but each of them is > being > > > > > notified about the current value in every row. Ideally, we want > the HVs > > > > > to be able to say to ovsdb-server, "Give me all the data in the > Chassis > > > > > table, except don't bother with nb_cfg in any rows except my own > row." > > > > > > > > > > Actually, there's a way for ovn-controller to do that: the OVSDB > > > > > protocol definition of monitor_cond allows the client to specify > > > > > multiple sets of columns to monitor in a table, and each set of > columns > > > > > has an independently attached condition. This can be used to > specify > > > > > that most of the columns are monitored unconditionally but nb_cfg is > > > > > monitored only for a particular row. > > > > > > > > > > But there are still problems: > > > > > > > > > > 1. The implementation in ovsdb-server looks broken. I don't think > it > > > > >implements the spec. It certainly isn't tested--none of the > existing > > > > >clients ever passes more than a single set of columns. I think > that > > > > >it will work if all the sets of columns use the same condition, > but > > > > >that doesn't help with this case. It will need to be fixed. > > > > > > > > > > 2. The client implementation in ovsdb-idl doesn't anticipate > difference > > > > >conditions for different columns. It will need to be enhanced to > > > > >support this use. > > > > > > > > > > In the short term this is going to be more work than just creating > a new > > > > > table. In the long term, though, it will allow us to be more > flexible > > > > > and more agile, since improvements similar to the one in this patch > will > > > > > require only client changes, not database schema changes. > > > > > > > > > > Comments? > > > > > > > > Thanks for the valuable information! I didn't even know that ovsdb is > > > > supposed to support multiple set columns with different monitor > conditions. > > > > It surely would be the right way to go if it is supported. However, > as you > > > > mentioned the mechanism doesn't work yet and I am not sure how much > work is > > > > needed to make it work, it seems reasonable to me to "workaround" it > at > > > > least for short term until that is ready in the future. Then we could > > > > simply resurrect the old column in the old table with the new monitor > > > > conditions. What do you think? > > > > > > This kind of workaround would essentially persist forever. We'd never > > > be able to get rid of it--or, more to the point, we would never be able > > > to justify the work. We'd be stuck with something ugly. > > > > > > So: if you think the workaround is valuable, then one way to justify it > > > would be to reformulate it in a way that isn't so ugly. As currently > > > written, when a person looks at the table structure, (s)he notices a > > > table that has a single column and is named for that single column. > > > That sticks out, clearly, as some kind of compromise, and the > > > documentation even calls it out as an optimization. But there might be > > > a logical, conceptual reason why it makes sense to have a separate table > > > for some kinds of chassis-related data. If there is, then that would > > > suggest a name for the table, and for the class of data that should > > > should go in there, and it would make it possible to have something that > > > is both faster and a reasonable design. > > > > > > I'm not saying that there is necessarily a logical, conceptual reason > > > that one can come up here, but if there is then it would certainly make > > > it much easier to support the patch. > > > > Hi Ben, I'm not sure if I got your point completely. A logical reason for > this separate table might be for holding chassis private data, the data > that other chassis is not interested in. Of course the reason for holding > such private data in a separate table is still this huge performance > difference as it is shown in the test result. So I think the table can be > named as Chassis_Private_Data, and I can document in ovn-sb.xml more about > the performance considerations, and the future improvement about the > monitor_cond. Would this address your point or did I totally missed your > point? > > > Hi Ben, could you confirm so that I can work on v3? > > > Thanks, > > Han That sounds like a good direction. ___ dev mailing list
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Tue, Apr 10, 2018 at 6:21 PM, Han Zhouwrote: > > > > On Tue, Apr 10, 2018 at 5:04 PM, Ben Pfaff wrote: > > > > On Fri, Apr 06, 2018 at 02:40:21PM -0700, Han Zhou wrote: > > > On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaff wrote: > > > > > > > > Thanks for working on making OVN faster and scale better. > > > > > > > > I see what you mean about how nb_cfg can be a scale problem. Really, > > > > each hypervisor only cares about its own row, but each of them is being > > > > notified about the current value in every row. Ideally, we want the HVs > > > > to be able to say to ovsdb-server, "Give me all the data in the Chassis > > > > table, except don't bother with nb_cfg in any rows except my own row." > > > > > > > > Actually, there's a way for ovn-controller to do that: the OVSDB > > > > protocol definition of monitor_cond allows the client to specify > > > > multiple sets of columns to monitor in a table, and each set of columns > > > > has an independently attached condition. This can be used to specify > > > > that most of the columns are monitored unconditionally but nb_cfg is > > > > monitored only for a particular row. > > > > > > > > But there are still problems: > > > > > > > > 1. The implementation in ovsdb-server looks broken. I don't think it > > > >implements the spec. It certainly isn't tested--none of the existing > > > >clients ever passes more than a single set of columns. I think that > > > >it will work if all the sets of columns use the same condition, but > > > >that doesn't help with this case. It will need to be fixed. > > > > > > > > 2. The client implementation in ovsdb-idl doesn't anticipate difference > > > >conditions for different columns. It will need to be enhanced to > > > >support this use. > > > > > > > > In the short term this is going to be more work than just creating a new > > > > table. In the long term, though, it will allow us to be more flexible > > > > and more agile, since improvements similar to the one in this patch will > > > > require only client changes, not database schema changes. > > > > > > > > Comments? > > > > > > Thanks for the valuable information! I didn't even know that ovsdb is > > > supposed to support multiple set columns with different monitor conditions. > > > It surely would be the right way to go if it is supported. However, as you > > > mentioned the mechanism doesn't work yet and I am not sure how much work is > > > needed to make it work, it seems reasonable to me to "workaround" it at > > > least for short term until that is ready in the future. Then we could > > > simply resurrect the old column in the old table with the new monitor > > > conditions. What do you think? > > > > This kind of workaround would essentially persist forever. We'd never > > be able to get rid of it--or, more to the point, we would never be able > > to justify the work. We'd be stuck with something ugly. > > > > So: if you think the workaround is valuable, then one way to justify it > > would be to reformulate it in a way that isn't so ugly. As currently > > written, when a person looks at the table structure, (s)he notices a > > table that has a single column and is named for that single column. > > That sticks out, clearly, as some kind of compromise, and the > > documentation even calls it out as an optimization. But there might be > > a logical, conceptual reason why it makes sense to have a separate table > > for some kinds of chassis-related data. If there is, then that would > > suggest a name for the table, and for the class of data that should > > should go in there, and it would make it possible to have something that > > is both faster and a reasonable design. > > > > I'm not saying that there is necessarily a logical, conceptual reason > > that one can come up here, but if there is then it would certainly make > > it much easier to support the patch. > > Hi Ben, I'm not sure if I got your point completely. A logical reason for this separate table might be for holding chassis private data, the data that other chassis is not interested in. Of course the reason for holding such private data in a separate table is still this huge performance difference as it is shown in the test result. So I think the table can be named as Chassis_Private_Data, and I can document in ovn-sb.xml more about the performance considerations, and the future improvement about the monitor_cond. Would this address your point or did I totally missed your point? > Hi Ben, could you confirm so that I can work on v3? > Thanks, > Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Tue, Apr 10, 2018 at 5:04 PM, Ben Pfaffwrote: > > On Fri, Apr 06, 2018 at 02:40:21PM -0700, Han Zhou wrote: > > On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaff wrote: > > > > > > Thanks for working on making OVN faster and scale better. > > > > > > I see what you mean about how nb_cfg can be a scale problem. Really, > > > each hypervisor only cares about its own row, but each of them is being > > > notified about the current value in every row. Ideally, we want the HVs > > > to be able to say to ovsdb-server, "Give me all the data in the Chassis > > > table, except don't bother with nb_cfg in any rows except my own row." > > > > > > Actually, there's a way for ovn-controller to do that: the OVSDB > > > protocol definition of monitor_cond allows the client to specify > > > multiple sets of columns to monitor in a table, and each set of columns > > > has an independently attached condition. This can be used to specify > > > that most of the columns are monitored unconditionally but nb_cfg is > > > monitored only for a particular row. > > > > > > But there are still problems: > > > > > > 1. The implementation in ovsdb-server looks broken. I don't think it > > >implements the spec. It certainly isn't tested--none of the existing > > >clients ever passes more than a single set of columns. I think that > > >it will work if all the sets of columns use the same condition, but > > >that doesn't help with this case. It will need to be fixed. > > > > > > 2. The client implementation in ovsdb-idl doesn't anticipate difference > > >conditions for different columns. It will need to be enhanced to > > >support this use. > > > > > > In the short term this is going to be more work than just creating a new > > > table. In the long term, though, it will allow us to be more flexible > > > and more agile, since improvements similar to the one in this patch will > > > require only client changes, not database schema changes. > > > > > > Comments? > > > > Thanks for the valuable information! I didn't even know that ovsdb is > > supposed to support multiple set columns with different monitor conditions. > > It surely would be the right way to go if it is supported. However, as you > > mentioned the mechanism doesn't work yet and I am not sure how much work is > > needed to make it work, it seems reasonable to me to "workaround" it at > > least for short term until that is ready in the future. Then we could > > simply resurrect the old column in the old table with the new monitor > > conditions. What do you think? > > This kind of workaround would essentially persist forever. We'd never > be able to get rid of it--or, more to the point, we would never be able > to justify the work. We'd be stuck with something ugly. > > So: if you think the workaround is valuable, then one way to justify it > would be to reformulate it in a way that isn't so ugly. As currently > written, when a person looks at the table structure, (s)he notices a > table that has a single column and is named for that single column. > That sticks out, clearly, as some kind of compromise, and the > documentation even calls it out as an optimization. But there might be > a logical, conceptual reason why it makes sense to have a separate table > for some kinds of chassis-related data. If there is, then that would > suggest a name for the table, and for the class of data that should > should go in there, and it would make it possible to have something that > is both faster and a reasonable design. > > I'm not saying that there is necessarily a logical, conceptual reason > that one can come up here, but if there is then it would certainly make > it much easier to support the patch. Hi Ben, I'm not sure if I got your point completely. A logical reason for this separate table might be for holding chassis private data, the data that other chassis is not interested in. Of course the reason for holding such private data in a separate table is still this huge performance difference as it is shown in the test result. So I think the table can be named as Chassis_Private_Data, and I can document in ovn-sb.xml more about the performance considerations, and the future improvement about the monitor_cond. Would this address your point or did I totally missed your point? Thanks, Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Fri, Apr 06, 2018 at 02:40:21PM -0700, Han Zhou wrote: > On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaffwrote: > > > > Thanks for working on making OVN faster and scale better. > > > > I see what you mean about how nb_cfg can be a scale problem. Really, > > each hypervisor only cares about its own row, but each of them is being > > notified about the current value in every row. Ideally, we want the HVs > > to be able to say to ovsdb-server, "Give me all the data in the Chassis > > table, except don't bother with nb_cfg in any rows except my own row." > > > > Actually, there's a way for ovn-controller to do that: the OVSDB > > protocol definition of monitor_cond allows the client to specify > > multiple sets of columns to monitor in a table, and each set of columns > > has an independently attached condition. This can be used to specify > > that most of the columns are monitored unconditionally but nb_cfg is > > monitored only for a particular row. > > > > But there are still problems: > > > > 1. The implementation in ovsdb-server looks broken. I don't think it > >implements the spec. It certainly isn't tested--none of the existing > >clients ever passes more than a single set of columns. I think that > >it will work if all the sets of columns use the same condition, but > >that doesn't help with this case. It will need to be fixed. > > > > 2. The client implementation in ovsdb-idl doesn't anticipate difference > >conditions for different columns. It will need to be enhanced to > >support this use. > > > > In the short term this is going to be more work than just creating a new > > table. In the long term, though, it will allow us to be more flexible > > and more agile, since improvements similar to the one in this patch will > > require only client changes, not database schema changes. > > > > Comments? > > Thanks for the valuable information! I didn't even know that ovsdb is > supposed to support multiple set columns with different monitor conditions. > It surely would be the right way to go if it is supported. However, as you > mentioned the mechanism doesn't work yet and I am not sure how much work is > needed to make it work, it seems reasonable to me to "workaround" it at > least for short term until that is ready in the future. Then we could > simply resurrect the old column in the old table with the new monitor > conditions. What do you think? This kind of workaround would essentially persist forever. We'd never be able to get rid of it--or, more to the point, we would never be able to justify the work. We'd be stuck with something ugly. So: if you think the workaround is valuable, then one way to justify it would be to reformulate it in a way that isn't so ugly. As currently written, when a person looks at the table structure, (s)he notices a table that has a single column and is named for that single column. That sticks out, clearly, as some kind of compromise, and the documentation even calls it out as an optimization. But there might be a logical, conceptual reason why it makes sense to have a separate table for some kinds of chassis-related data. If there is, then that would suggest a name for the table, and for the class of data that should should go in there, and it would make it possible to have something that is both faster and a reasonable design. I'm not saying that there is necessarily a logical, conceptual reason that one can come up here, but if there is then it would certainly make it much easier to support the patch. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Fri, Apr 6, 2018 at 1:54 PM, Ben Pfaffwrote: > > Thanks for working on making OVN faster and scale better. > > I see what you mean about how nb_cfg can be a scale problem. Really, > each hypervisor only cares about its own row, but each of them is being > notified about the current value in every row. Ideally, we want the HVs > to be able to say to ovsdb-server, "Give me all the data in the Chassis > table, except don't bother with nb_cfg in any rows except my own row." > > Actually, there's a way for ovn-controller to do that: the OVSDB > protocol definition of monitor_cond allows the client to specify > multiple sets of columns to monitor in a table, and each set of columns > has an independently attached condition. This can be used to specify > that most of the columns are monitored unconditionally but nb_cfg is > monitored only for a particular row. > > But there are still problems: > > 1. The implementation in ovsdb-server looks broken. I don't think it >implements the spec. It certainly isn't tested--none of the existing >clients ever passes more than a single set of columns. I think that >it will work if all the sets of columns use the same condition, but >that doesn't help with this case. It will need to be fixed. > > 2. The client implementation in ovsdb-idl doesn't anticipate difference >conditions for different columns. It will need to be enhanced to >support this use. > > In the short term this is going to be more work than just creating a new > table. In the long term, though, it will allow us to be more flexible > and more agile, since improvements similar to the one in this patch will > require only client changes, not database schema changes. > > Comments? Thanks for the valuable information! I didn't even know that ovsdb is supposed to support multiple set columns with different monitor conditions. It surely would be the right way to go if it is supported. However, as you mentioned the mechanism doesn't work yet and I am not sure how much work is needed to make it work, it seems reasonable to me to "workaround" it at least for short term until that is ready in the future. Then we could simply resurrect the old column in the old table with the new monitor conditions. What do you think? Thanks, Han ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
Thanks for working on making OVN faster and scale better. I see what you mean about how nb_cfg can be a scale problem. Really, each hypervisor only cares about its own row, but each of them is being notified about the current value in every row. Ideally, we want the HVs to be able to say to ovsdb-server, "Give me all the data in the Chassis table, except don't bother with nb_cfg in any rows except my own row." Actually, there's a way for ovn-controller to do that: the OVSDB protocol definition of monitor_cond allows the client to specify multiple sets of columns to monitor in a table, and each set of columns has an independently attached condition. This can be used to specify that most of the columns are monitored unconditionally but nb_cfg is monitored only for a particular row. But there are still problems: 1. The implementation in ovsdb-server looks broken. I don't think it implements the spec. It certainly isn't tested--none of the existing clients ever passes more than a single set of columns. I think that it will work if all the sets of columns use the same condition, but that doesn't help with this case. It will need to be fixed. 2. The client implementation in ovsdb-idl doesn't anticipate difference conditions for different columns. It will need to be enhanced to support this use. In the short term this is going to be more work than just creating a new table. In the long term, though, it will allow us to be more flexible and more agile, since improvements similar to the one in this patch will require only client changes, not database schema changes. Comments? ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
Acked-By: Venkata AnilI have reviewed and tested the patch. It looks good to me. Thanks Anil On Mon, Mar 26, 2018 at 11:09 AM, Anil Venkata wrote: > > On Sat, Mar 24, 2018 at 7:06 AM, Han Zhou wrote: > >> Anil, thanks for the review! >> >> On Thu, Mar 22, 2018 at 11:12 PM, Anil Venkata >> wrote: >> > >> > Is it possible to add a test for this patch i.e testing if hypervisors >> correctly updating >> > Chassis_NB_Cfg's nb_cfg when a new value is assigned to NB_Global's >> nb_cfg? >> > Please see other comments inline >> > >> This mechanism has been used in some test cases (grep "wait=hv"), and >> this change doesn't change the expected behavior, so I didn't feel >> necessary to add new test cases. The impact of this change is basically the >> performance in a large scale environment. I tested it in ovn-scale-test, >> but it seems not feasible to add test cases here. >> >> > I have also manually tested the patch(i.e HV not receiving "update2" call > when other HVs update Chassis_NB_Cfg's nb_cfg). Thanks > > >> > Thanks >> > Anil >> > >> > On Thu, Mar 22, 2018 at 4:57 AM, Han Zhou wrote: >> >> >> >> nb_cfg as a mechanism to "ping" OVN control plane is very useful >> >> in many ways. However, the current implementation will trigger >> >> update notifications flooding in the whole control plane. Each >> >> HV updates to SB the nb_cfg number and all these updates are >> >> notified to all the other HVs, which is O(n^2). Although updates >> >> are batched in fewers notifications than n^2, it still generates >> >> significant load on SB DB and also on ovn-controllers. >> >> >> >> To solve this problem and make the mechanism more useful in large >> >> scale producation deployment, this patch separates the per HV >> >> nb_cfg data in SB from the Chassis table to a separate table, so >> >> that each HV can conditionally monitor and get updates only for >> >> its own record. >> >> >> >> Test result shows great improvement: >> >> In a test environment with 1K sandbox HVs, and 10K ports created >> >> on 40 lswitches and 8 lrouters, do the sync test when the system >> >> is idle, with command: >> >> >> >> time ovn-nbctl --wait=hv sync >> >> >> >> Original result: >> >> real4m52.926s >> >> user0m0.328s >> >> sys 0m0.004s >> >> >> >> With this patch: >> >> real0m11.405s >> >> user0m0.316s >> >> sys 0m0.016s >> >> >> >> Signed-off-by: Han Zhou >> >> --- >> >> >> >> Notes: >> >> v1 -> v2: >> >> - keep the old column in chassis table to avoid breaking upgrading >> >> - avoid the attempt of adding redundant entry during startup >> >> >> >> ovn/controller/chassis.c| 35 ++ >> - >> >> ovn/controller/chassis.h| 9 ++--- >> >> ovn/controller/ovn-controller.c | 21 - >> >> ovn/northd/ovn-northd.c | 21 ++--- >> >> ovn/ovn-sb.ovsschema| 8 +++- >> >> ovn/ovn-sb.xml | 26 ++ >> >> 6 files changed, 103 insertions(+), 17 deletions(-) >> >> >> >> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c >> >> index 6b5286a..6092dcc 100644 >> >> --- a/ovn/controller/chassis.c >> >> +++ b/ovn/controller/chassis.c >> >> @@ -72,12 +72,28 @@ get_cms_options(const struct smap *ext_ids) >> >> return smap_get_def(ext_ids, "ovn-cms-options", ""); >> >> } >> >> >> >> +static const struct sbrec_chassis_nb_cfg * >> >> +find_chassis_nb_cfg(struct ovsdb_idl *ovnsb_idl, const char >> *chassis_id) >> >> +{ >> >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec; >> >> + >> >> +SBREC_CHASSIS_NB_CFG_FOR_EACH (chassis_nb_cfg_rec, ovnsb_idl) { >> >> +if (!strcmp(chassis_nb_cfg_rec->chassis_name, chassis_id)) { >> >> +break; >> >> +} >> >> +} >> >> + >> >> +return chassis_nb_cfg_rec; >> >> +} >> >> + >> >> /* Returns this chassis's Chassis record, if it is available and is >> currently >> >> * amenable to a transaction. */ >> >> const struct sbrec_chassis * >> >> chassis_run(struct controller_ctx *ctx, const char *chassis_id, >> >> -const struct ovsrec_bridge *br_int) >> >> +const struct ovsrec_bridge *br_int, >> >> +const struct sbrec_chassis_nb_cfg **nb_cfg) >> >> { >> >> +*nb_cfg = NULL; >> >> if (!ctx->ovnsb_idl_txn) { >> >> return NULL; >> >> } >> >> @@ -135,8 +151,17 @@ chassis_run(struct controller_ctx *ctx, const >> char *chassis_id, >> >> ds_chomp(_types, ','); >> >> const char *iface_types_str = ds_cstr(_types); >> >> >> >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec >> >> += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_id); >> >> +if (!chassis_nb_cfg_rec) { >> >> +chassis_nb_cfg_rec = sbrec_chassis_nb_cfg_insert(ct >> x->ovnsb_idl_txn); >> >> +
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
On Sat, Mar 24, 2018 at 7:06 AM, Han Zhouwrote: > Anil, thanks for the review! > > On Thu, Mar 22, 2018 at 11:12 PM, Anil Venkata > wrote: > > > > Is it possible to add a test for this patch i.e testing if hypervisors > correctly updating > > Chassis_NB_Cfg's nb_cfg when a new value is assigned to NB_Global's > nb_cfg? > > Please see other comments inline > > > This mechanism has been used in some test cases (grep "wait=hv"), and this > change doesn't change the expected behavior, so I didn't feel necessary to > add new test cases. The impact of this change is basically the performance > in a large scale environment. I tested it in ovn-scale-test, but it seems > not feasible to add test cases here. > > I have also manually tested the patch(i.e HV not receiving "update2" call when other HVs update Chassis_NB_Cfg's nb_cfg). Thanks > > Thanks > > Anil > > > > On Thu, Mar 22, 2018 at 4:57 AM, Han Zhou wrote: > >> > >> nb_cfg as a mechanism to "ping" OVN control plane is very useful > >> in many ways. However, the current implementation will trigger > >> update notifications flooding in the whole control plane. Each > >> HV updates to SB the nb_cfg number and all these updates are > >> notified to all the other HVs, which is O(n^2). Although updates > >> are batched in fewers notifications than n^2, it still generates > >> significant load on SB DB and also on ovn-controllers. > >> > >> To solve this problem and make the mechanism more useful in large > >> scale producation deployment, this patch separates the per HV > >> nb_cfg data in SB from the Chassis table to a separate table, so > >> that each HV can conditionally monitor and get updates only for > >> its own record. > >> > >> Test result shows great improvement: > >> In a test environment with 1K sandbox HVs, and 10K ports created > >> on 40 lswitches and 8 lrouters, do the sync test when the system > >> is idle, with command: > >> > >> time ovn-nbctl --wait=hv sync > >> > >> Original result: > >> real4m52.926s > >> user0m0.328s > >> sys 0m0.004s > >> > >> With this patch: > >> real0m11.405s > >> user0m0.316s > >> sys 0m0.016s > >> > >> Signed-off-by: Han Zhou > >> --- > >> > >> Notes: > >> v1 -> v2: > >> - keep the old column in chassis table to avoid breaking upgrading > >> - avoid the attempt of adding redundant entry during startup > >> > >> ovn/controller/chassis.c| 35 ++ > - > >> ovn/controller/chassis.h| 9 ++--- > >> ovn/controller/ovn-controller.c | 21 - > >> ovn/northd/ovn-northd.c | 21 ++--- > >> ovn/ovn-sb.ovsschema| 8 +++- > >> ovn/ovn-sb.xml | 26 ++ > >> 6 files changed, 103 insertions(+), 17 deletions(-) > >> > >> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c > >> index 6b5286a..6092dcc 100644 > >> --- a/ovn/controller/chassis.c > >> +++ b/ovn/controller/chassis.c > >> @@ -72,12 +72,28 @@ get_cms_options(const struct smap *ext_ids) > >> return smap_get_def(ext_ids, "ovn-cms-options", ""); > >> } > >> > >> +static const struct sbrec_chassis_nb_cfg * > >> +find_chassis_nb_cfg(struct ovsdb_idl *ovnsb_idl, const char > *chassis_id) > >> +{ > >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec; > >> + > >> +SBREC_CHASSIS_NB_CFG_FOR_EACH (chassis_nb_cfg_rec, ovnsb_idl) { > >> +if (!strcmp(chassis_nb_cfg_rec->chassis_name, chassis_id)) { > >> +break; > >> +} > >> +} > >> + > >> +return chassis_nb_cfg_rec; > >> +} > >> + > >> /* Returns this chassis's Chassis record, if it is available and is > currently > >> * amenable to a transaction. */ > >> const struct sbrec_chassis * > >> chassis_run(struct controller_ctx *ctx, const char *chassis_id, > >> -const struct ovsrec_bridge *br_int) > >> +const struct ovsrec_bridge *br_int, > >> +const struct sbrec_chassis_nb_cfg **nb_cfg) > >> { > >> +*nb_cfg = NULL; > >> if (!ctx->ovnsb_idl_txn) { > >> return NULL; > >> } > >> @@ -135,8 +151,17 @@ chassis_run(struct controller_ctx *ctx, const char > *chassis_id, > >> ds_chomp(_types, ','); > >> const char *iface_types_str = ds_cstr(_types); > >> > >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec > >> += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_id); > >> +if (!chassis_nb_cfg_rec) { > >> +chassis_nb_cfg_rec = sbrec_chassis_nb_cfg_insert( > ctx->ovnsb_idl_txn); > >> +sbrec_chassis_nb_cfg_set_chassis_name(chassis_nb_cfg_rec, > chassis_id); > >> +} > >> +*nb_cfg = chassis_nb_cfg_rec; > >> + > > > > > > Can you please create this record after creation of chassis record i.e > > move "sbrec_chassis_nb_cfg_insert" after > > https://github.com/openvswitch/ovs/blob/master/ >
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
Anil, thanks for the review! On Thu, Mar 22, 2018 at 11:12 PM, Anil Venkatawrote: > > Is it possible to add a test for this patch i.e testing if hypervisors correctly updating > Chassis_NB_Cfg's nb_cfg when a new value is assigned to NB_Global's nb_cfg? > Please see other comments inline > This mechanism has been used in some test cases (grep "wait=hv"), and this change doesn't change the expected behavior, so I didn't feel necessary to add new test cases. The impact of this change is basically the performance in a large scale environment. I tested it in ovn-scale-test, but it seems not feasible to add test cases here. > Thanks > Anil > > On Thu, Mar 22, 2018 at 4:57 AM, Han Zhou wrote: >> >> nb_cfg as a mechanism to "ping" OVN control plane is very useful >> in many ways. However, the current implementation will trigger >> update notifications flooding in the whole control plane. Each >> HV updates to SB the nb_cfg number and all these updates are >> notified to all the other HVs, which is O(n^2). Although updates >> are batched in fewers notifications than n^2, it still generates >> significant load on SB DB and also on ovn-controllers. >> >> To solve this problem and make the mechanism more useful in large >> scale producation deployment, this patch separates the per HV >> nb_cfg data in SB from the Chassis table to a separate table, so >> that each HV can conditionally monitor and get updates only for >> its own record. >> >> Test result shows great improvement: >> In a test environment with 1K sandbox HVs, and 10K ports created >> on 40 lswitches and 8 lrouters, do the sync test when the system >> is idle, with command: >> >> time ovn-nbctl --wait=hv sync >> >> Original result: >> real4m52.926s >> user0m0.328s >> sys 0m0.004s >> >> With this patch: >> real0m11.405s >> user0m0.316s >> sys 0m0.016s >> >> Signed-off-by: Han Zhou >> --- >> >> Notes: >> v1 -> v2: >> - keep the old column in chassis table to avoid breaking upgrading >> - avoid the attempt of adding redundant entry during startup >> >> ovn/controller/chassis.c| 35 ++- >> ovn/controller/chassis.h| 9 ++--- >> ovn/controller/ovn-controller.c | 21 - >> ovn/northd/ovn-northd.c | 21 ++--- >> ovn/ovn-sb.ovsschema| 8 +++- >> ovn/ovn-sb.xml | 26 ++ >> 6 files changed, 103 insertions(+), 17 deletions(-) >> >> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c >> index 6b5286a..6092dcc 100644 >> --- a/ovn/controller/chassis.c >> +++ b/ovn/controller/chassis.c >> @@ -72,12 +72,28 @@ get_cms_options(const struct smap *ext_ids) >> return smap_get_def(ext_ids, "ovn-cms-options", ""); >> } >> >> +static const struct sbrec_chassis_nb_cfg * >> +find_chassis_nb_cfg(struct ovsdb_idl *ovnsb_idl, const char *chassis_id) >> +{ >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec; >> + >> +SBREC_CHASSIS_NB_CFG_FOR_EACH (chassis_nb_cfg_rec, ovnsb_idl) { >> +if (!strcmp(chassis_nb_cfg_rec->chassis_name, chassis_id)) { >> +break; >> +} >> +} >> + >> +return chassis_nb_cfg_rec; >> +} >> + >> /* Returns this chassis's Chassis record, if it is available and is currently >> * amenable to a transaction. */ >> const struct sbrec_chassis * >> chassis_run(struct controller_ctx *ctx, const char *chassis_id, >> -const struct ovsrec_bridge *br_int) >> +const struct ovsrec_bridge *br_int, >> +const struct sbrec_chassis_nb_cfg **nb_cfg) >> { >> +*nb_cfg = NULL; >> if (!ctx->ovnsb_idl_txn) { >> return NULL; >> } >> @@ -135,8 +151,17 @@ chassis_run(struct controller_ctx *ctx, const char *chassis_id, >> ds_chomp(_types, ','); >> const char *iface_types_str = ds_cstr(_types); >> >> +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec >> += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_id); >> +if (!chassis_nb_cfg_rec) { >> +chassis_nb_cfg_rec = sbrec_chassis_nb_cfg_insert(ctx->ovnsb_idl_txn); >> +sbrec_chassis_nb_cfg_set_chassis_name(chassis_nb_cfg_rec, chassis_id); >> +} >> +*nb_cfg = chassis_nb_cfg_rec; >> + > > > Can you please create this record after creation of chassis record i.e > move "sbrec_chassis_nb_cfg_insert" after > https://github.com/openvswitch/ovs/blob/master/ovn/controller/chassis.c#L223 > The purpose is to make sure it is created, so the order seems doesn't matter. In fact they will be added in the same ovsdb transaction as a result. >> >> const struct sbrec_chassis *chassis_rec >> = get_chassis(ctx->ovnsb_idl, chassis_id); >> + >> const char *encap_csum = smap_get_def(>external_ids, >>"ovn-encap-csum", "true"); >> if (chassis_rec) { >> @@ -256,6 +281,14
Re: [ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
Is it possible to add a test for this patch i.e testing if hypervisors correctly updating Chassis_NB_Cfg's nb_cfg when a new value is assigned to NB_Global's nb_cfg? Please see other comments inline Thanks Anil On Thu, Mar 22, 2018 at 4:57 AM, Han Zhouwrote: > nb_cfg as a mechanism to "ping" OVN control plane is very useful > in many ways. However, the current implementation will trigger > update notifications flooding in the whole control plane. Each > HV updates to SB the nb_cfg number and all these updates are > notified to all the other HVs, which is O(n^2). Although updates > are batched in fewers notifications than n^2, it still generates > significant load on SB DB and also on ovn-controllers. > > To solve this problem and make the mechanism more useful in large > scale producation deployment, this patch separates the per HV > nb_cfg data in SB from the Chassis table to a separate table, so > that each HV can conditionally monitor and get updates only for > its own record. > > Test result shows great improvement: > In a test environment with 1K sandbox HVs, and 10K ports created > on 40 lswitches and 8 lrouters, do the sync test when the system > is idle, with command: > > time ovn-nbctl --wait=hv sync > > Original result: > real4m52.926s > user0m0.328s > sys 0m0.004s > > With this patch: > real0m11.405s > user0m0.316s > sys 0m0.016s > > Signed-off-by: Han Zhou > --- > > Notes: > v1 -> v2: > - keep the old column in chassis table to avoid breaking upgrading > - avoid the attempt of adding redundant entry during startup > > ovn/controller/chassis.c| 35 ++- > ovn/controller/chassis.h| 9 ++--- > ovn/controller/ovn-controller.c | 21 - > ovn/northd/ovn-northd.c | 21 ++--- > ovn/ovn-sb.ovsschema| 8 +++- > ovn/ovn-sb.xml | 26 ++ > 6 files changed, 103 insertions(+), 17 deletions(-) > > diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c > index 6b5286a..6092dcc 100644 > --- a/ovn/controller/chassis.c > +++ b/ovn/controller/chassis.c > @@ -72,12 +72,28 @@ get_cms_options(const struct smap *ext_ids) > return smap_get_def(ext_ids, "ovn-cms-options", ""); > } > > +static const struct sbrec_chassis_nb_cfg * > +find_chassis_nb_cfg(struct ovsdb_idl *ovnsb_idl, const char *chassis_id) > +{ > +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec; > + > +SBREC_CHASSIS_NB_CFG_FOR_EACH (chassis_nb_cfg_rec, ovnsb_idl) { > +if (!strcmp(chassis_nb_cfg_rec->chassis_name, chassis_id)) { > +break; > +} > +} > + > +return chassis_nb_cfg_rec; > +} > + > /* Returns this chassis's Chassis record, if it is available and is > currently > * amenable to a transaction. */ > const struct sbrec_chassis * > chassis_run(struct controller_ctx *ctx, const char *chassis_id, > -const struct ovsrec_bridge *br_int) > +const struct ovsrec_bridge *br_int, > +const struct sbrec_chassis_nb_cfg **nb_cfg) > { > +*nb_cfg = NULL; > if (!ctx->ovnsb_idl_txn) { > return NULL; > } > @@ -135,8 +151,17 @@ chassis_run(struct controller_ctx *ctx, const char > *chassis_id, > ds_chomp(_types, ','); > const char *iface_types_str = ds_cstr(_types); > > +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec > += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_id); > +if (!chassis_nb_cfg_rec) { > +chassis_nb_cfg_rec = sbrec_chassis_nb_cfg_insert( > ctx->ovnsb_idl_txn); > +sbrec_chassis_nb_cfg_set_chassis_name(chassis_nb_cfg_rec, > chassis_id); > +} > +*nb_cfg = chassis_nb_cfg_rec; > + > Can you please create this record after creation of chassis record i.e move "sbrec_chassis_nb_cfg_insert" after https://github.com/openvswitch/ovs/blob/master/ovn/controller/chassis.c#L223 > const struct sbrec_chassis *chassis_rec > = get_chassis(ctx->ovnsb_idl, chassis_id); > + > const char *encap_csum = smap_get_def(>external_ids, >"ovn-encap-csum", "true"); > if (chassis_rec) { > @@ -256,6 +281,14 @@ chassis_cleanup(struct controller_ctx *ctx, >"ovn-controller: unregistering chassis > '%s'", >chassis_rec->name); > sbrec_chassis_delete(chassis_rec); > +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec > += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_rec->name); > +if (chassis_nb_cfg_rec) { > +sbrec_chassis_nb_cfg_delete(chassis_nb_cfg_rec); > +} else { > +VLOG_WARN("Chassis_NB_Cfg record didn't exist for chassis > '%s'", > + chassis_rec->name); > +} > } > return false; > } > diff --git a/ovn/controller/chassis.h
[ovs-dev] [PATCH v2] ovn: Avoid nb_cfg update notification flooding
nb_cfg as a mechanism to "ping" OVN control plane is very useful in many ways. However, the current implementation will trigger update notifications flooding in the whole control plane. Each HV updates to SB the nb_cfg number and all these updates are notified to all the other HVs, which is O(n^2). Although updates are batched in fewers notifications than n^2, it still generates significant load on SB DB and also on ovn-controllers. To solve this problem and make the mechanism more useful in large scale producation deployment, this patch separates the per HV nb_cfg data in SB from the Chassis table to a separate table, so that each HV can conditionally monitor and get updates only for its own record. Test result shows great improvement: In a test environment with 1K sandbox HVs, and 10K ports created on 40 lswitches and 8 lrouters, do the sync test when the system is idle, with command: time ovn-nbctl --wait=hv sync Original result: real4m52.926s user0m0.328s sys 0m0.004s With this patch: real0m11.405s user0m0.316s sys 0m0.016s Signed-off-by: Han Zhou--- Notes: v1 -> v2: - keep the old column in chassis table to avoid breaking upgrading - avoid the attempt of adding redundant entry during startup ovn/controller/chassis.c| 35 ++- ovn/controller/chassis.h| 9 ++--- ovn/controller/ovn-controller.c | 21 - ovn/northd/ovn-northd.c | 21 ++--- ovn/ovn-sb.ovsschema| 8 +++- ovn/ovn-sb.xml | 26 ++ 6 files changed, 103 insertions(+), 17 deletions(-) diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c index 6b5286a..6092dcc 100644 --- a/ovn/controller/chassis.c +++ b/ovn/controller/chassis.c @@ -72,12 +72,28 @@ get_cms_options(const struct smap *ext_ids) return smap_get_def(ext_ids, "ovn-cms-options", ""); } +static const struct sbrec_chassis_nb_cfg * +find_chassis_nb_cfg(struct ovsdb_idl *ovnsb_idl, const char *chassis_id) +{ +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec; + +SBREC_CHASSIS_NB_CFG_FOR_EACH (chassis_nb_cfg_rec, ovnsb_idl) { +if (!strcmp(chassis_nb_cfg_rec->chassis_name, chassis_id)) { +break; +} +} + +return chassis_nb_cfg_rec; +} + /* Returns this chassis's Chassis record, if it is available and is currently * amenable to a transaction. */ const struct sbrec_chassis * chassis_run(struct controller_ctx *ctx, const char *chassis_id, -const struct ovsrec_bridge *br_int) +const struct ovsrec_bridge *br_int, +const struct sbrec_chassis_nb_cfg **nb_cfg) { +*nb_cfg = NULL; if (!ctx->ovnsb_idl_txn) { return NULL; } @@ -135,8 +151,17 @@ chassis_run(struct controller_ctx *ctx, const char *chassis_id, ds_chomp(_types, ','); const char *iface_types_str = ds_cstr(_types); +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_id); +if (!chassis_nb_cfg_rec) { +chassis_nb_cfg_rec = sbrec_chassis_nb_cfg_insert(ctx->ovnsb_idl_txn); +sbrec_chassis_nb_cfg_set_chassis_name(chassis_nb_cfg_rec, chassis_id); +} +*nb_cfg = chassis_nb_cfg_rec; + const struct sbrec_chassis *chassis_rec = get_chassis(ctx->ovnsb_idl, chassis_id); + const char *encap_csum = smap_get_def(>external_ids, "ovn-encap-csum", "true"); if (chassis_rec) { @@ -256,6 +281,14 @@ chassis_cleanup(struct controller_ctx *ctx, "ovn-controller: unregistering chassis '%s'", chassis_rec->name); sbrec_chassis_delete(chassis_rec); +const struct sbrec_chassis_nb_cfg *chassis_nb_cfg_rec += find_chassis_nb_cfg(ctx->ovnsb_idl, chassis_rec->name); +if (chassis_nb_cfg_rec) { +sbrec_chassis_nb_cfg_delete(chassis_nb_cfg_rec); +} else { +VLOG_WARN("Chassis_NB_Cfg record didn't exist for chassis '%s'", + chassis_rec->name); +} } return false; } diff --git a/ovn/controller/chassis.h b/ovn/controller/chassis.h index 2cc47fc..5ecf7f4 100644 --- a/ovn/controller/chassis.h +++ b/ovn/controller/chassis.h @@ -17,6 +17,7 @@ #define OVN_CHASSIS_H 1 #include +#include "ovn/lib/ovn-sb-idl.h" struct controller_ctx; struct ovsdb_idl; @@ -24,9 +25,11 @@ struct ovsrec_bridge; struct sbrec_chassis; void chassis_register_ovs_idl(struct ovsdb_idl *); -const struct sbrec_chassis *chassis_run(struct controller_ctx *, -const char *chassis_id, -const struct ovsrec_bridge *br_int); +const struct sbrec_chassis *chassis_run( +struct controller_ctx *, +const char *chassis_id, +