On Tue, Sep 11, 2018 at 10:55:56AM -0700, Han Zhou wrote:
> On Tue, Sep 11, 2018 at 10:30 AM <[email protected]> wrote:
> >
> > From: Numan Siddique <[email protected]>
> >
> > The present code resets the database when it is in the state -
> > 'RPL_S_SCHEMA_REQUESTED' and repopulates the database when it
> > receives the monitor reply when it is in the state -
> > 'RPL_S_MONITOR_REQUESTED'. If however, it goes to active mode
> > before it processes the monitor reply, the whole data is lost.
> >
> > This patch alleviates the problem by resetting the database when it
> > receives the monitor reply (before processing it). So that
> > reset and repopulation of the db happens in the same state.
> >
> > This approach still has a window for data loss if the function
> > process_notification() when processing the monitor reply fails for
> > some reason or ovsdb-server crashes for some reason during
> > process_notification().
> >
> > Reported-by: Han Zhou <[email protected]>
> > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
> > Tested-by: aginwala <[email protected]>
> > Acked-by: Han Zhou <[email protected]>
> > Signed-off-by: Numan Siddique <[email protected]>
> > ---
> >  ovsdb/replication.c | 25 +++++++++++--------------
> >  1 file changed, 11 insertions(+), 14 deletions(-)
> >
> > v1 -> v2
> > --------
> >  * Updated the commit message as per Han's suggestion
> >  * Added few comments in the code where it resets the db.
> >
> > diff --git a/ovsdb/replication.c b/ovsdb/replication.c
> > index 2b9ae2f83..752b3c89c 100644
> > --- a/ovsdb/replication.c
> > +++ b/ovsdb/replication.c
> > @@ -299,19 +299,7 @@ replication_run(void)
> >                  /* After receiving schemas, reset the local databases
> that
> >                   * will be monitored and send out monitor requests for
> them. */
> >                  if (hmap_is_empty(&request_ids)) {
> > -                    struct shash_node *node, *next;
> > -
> > -                    SHASH_FOR_EACH_SAFE (node, next, replication_dbs) {
> > -                        db = node->data;
> > -                        error = reset_database(db);
> > -                        if (error) {
> > -                            const char *db_name = db->schema->name;
> > -                            shash_find_and_delete(replication_dbs,
> db_name);
> > -                            ovsdb_error_assert(error);
> > -                            VLOG_WARN("Failed to reset database, "
> > -                                      "%s not replicated.", db_name);
> > -                        }
> > -                    }
> > +                    struct shash_node *node;
> >
> >                      if (shash_is_empty(replication_dbs)) {
> >                          VLOG_WARN("Nothing to replicate.");
> > @@ -335,7 +323,16 @@ replication_run(void)
> >              case RPL_S_MONITOR_REQUESTED: {
> >                  /* Reply to monitor requests. */
> >                  struct ovsdb_error *error;
> > -                error = process_notification(msg->result, db);
> > +                VLOG_INFO("Monitor request received. Resetting the
> database");
> > +                /* Resetting the database here has few risks. If the
> > +                 * process_notification() fails, the database is
> completely
> > +                 * lost locally. In case that node becomes active, then
> > +                 * there is a chance of complete data loss in the
> active/standy
> > +                 * cluster. */
> > +                error = reset_database(db);
> > +                if (!error) {
> > +                    error = process_notification(msg->result, db);
> > +                }
> >                  if (error) {
> >                      ovsdb_error_assert(error);
> >                      state = RPL_S_ERR;
> > --
> > 2.17.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> Thanks Numan!

Thanks Numan and Han.  I applied this to master.  If you'd like it
backported, please let me know (and how far).
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to