> On Thu, May 16, 2019 at 06:05:24PM +0200, Lorenzo Bianconi wrote: > > Add Controller_Event table to OVN SBDB in order to > > report CMS related event. > > Introduce event_table hashmap array and controller_event related > > structures to ovn-controller in order to track pending events > > forwarded by ovs-vswitchd. Moreover integrate event_table hashmap > > array with event_table ovn-sbdb table > > > > Signed-off-by: Mark Michelson <[email protected]> > > Co-authored-by: Mark Michelson <[email protected]> > > Signed-off-by: Lorenzo Bianconi <[email protected]>
Hi Ben, thanks for the review, few comments inline. Regards, Lorenzo > > This seems like one potentially reasonable way for ovn-controller to > report events to the CMS. I want to point out some design aspects that > might need to be considered: > > 1. Controller_Event doesn't include a column to attribute an event to a > particular Chassis. This might be important for accounting: how can > an ovn-controller tell which events it owns and should eventually > delete? Currently it looks like every ovn-controller always iterates > over every Controller_Event; is that really desirable and scalable? > Such a column would also allow update_sb_monitors() to monitor only > the rows associated with the current chassis, increasing scalability. ack, I will add it in v2 > > Keep in mind that ovn-controller probably can't just keep in memory > which events belong to it, because it might be restarted and needs to > be able to recover that information. > > 2. Are these events (or each individual event, anyway) single-consumer? > I believe that this design only works for single-consumer events > because only one consumer can mark them as 'handled'. Yes, I guess a single-consumer approach would work, at least for the architecture we were thinking about > > 3. Is there value in having 'handled', instead of just having the CMS > delete rows that it has processed? Using 'handled' requires an extra > round-trip between ovn-controller and the database. The main reason is to manage the Controller_Event rows in the controller, but I guess we can delete 'handled' rows even in the CMS. I will fix it in v2 > > 4. What is the tolerance for events that are never delivered or that are > delivered more than once? What can actually be guaranteed, given > that the database can die and that ovn-controller can die? (Also, > OVSDB transactions cannot guarantee exactly-once semantics in corner > cases unless the transactions are idempotent.) If the ovn-controller dies I think there is no too much we can do, events will be lost until the controller restarts properly. If ovn-northd or the connection to the db dies, controller_event_run() will not manage the Controller_Event table and pinctrl_handle_event() will queue the pending events in the event_table hash until the upper limit is reached. We can probably add a garbage collector for the pending events in the table. What do you think? > > 5. How is the number of per-controller rows limited? If it enough to add a limit to the controller hash size (https://github.com/LorenzoBianconi/ovs/blob/ovn_unidling/ovn/controller/pinctrl.c#L3488) or do you mean to add a limit to per-controller rows in the Controller_Event table? > > I think the workflow for these events should be clearly specified. I > guess it's something like this: > > 1. ovn-controller detects that an event has occurred and adds a > corresponding row to the Controller_Event table with false 'handled'. > > 2. CMS consumes the row and acts on it. > > 3. CMS changes 'handled' to true. > > 4. ovn-controller deletes row. > > although I think that "3. CMS deletes row." seems more straightforward. > > Did you consider the inter-version compatibility issues of making > event_type an enum? It will force an upgrade order of first ovn-northd, > then the database schema, then ovn-controller, because any other order > will make it possible that ovn-controller tries to add an invalid event > type (which ovsdb-server will reject) or that an event that ovn-northd > doesn't understand nevertheless reaches it. However, > Documentation/intro/install/ovn-upgrades.rst recommends a different > order. In any case the upgrade implications need to be considered. I guess there are no compatibility issues during upgrade, since: - if we upgrade the ovn-northd first and then the controller 'unknown' events will not be forwarded by ovn-controller - if we upgrade ovn-controller first and then ovn-northd 'unknown' events will not be forwarded since the related logical flows are missing in ovn-northd Am I missing something? > > Some typo fixes: applied, thx > > diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml > index ae5225e18fd0..a010c95b221b 100644 > --- a/ovn/ovn-sb.xml > +++ b/ovn/ovn-sb.xml > @@ -3476,8 +3476,8 @@ tcp.flags = RST; > </table> > <table name="Controller_Event" title="Controller Event table"> > <p> > - Database table used by ovn-controller to report CMS related > - events > + Database table used by <code>ovn-controller</code> to report CMS > related > + events. The workflow for event processing is: > </p> > <column name="event_type" > type='{"type": "string", "enum": ["set", > ["empty_lb_backends"]]}'> > @@ -3485,7 +3485,7 @@ tcp.flags = RST; > </column> > <column name="event_info"> > <p> > - Key-value pairs used to spcify evento info to the CMS. > + Key-value pairs used to specify event info to the CMS. > Possible values are: > </p> > <ul> > @@ -3498,7 +3498,7 @@ tcp.flags = RST; > <code>empty_lb_backends</code> event > </li> > <li> > - <code>load_balancer</code>: UUID fo the load balancer reported for > + <code>load_balancer</code>: UUID of the load balancer reported for > the <code>empty_lb_backends</code> event > </li> > </ul>
_______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
