On Tue, Sep 11, 2018 at 12:56 AM Han Zhou <[email protected]> wrote:

> On Mon, Sep 10, 2018 at 12:56 AM <[email protected]> wrote:
> >
> > From: Numan Siddique <[email protected]>
> >
> > The present code resets the database when it is in the state -
> > 'RPL_S_SCHEMA_REQUESTED' and repopulates the database when it
> > receives the monitor reply when it is in the state -
> > 'RPL_S_MONITOR_REQUESTED'. If however, it goes to active mode
> > before it processes the monitor reply, the whole data is lost.
> >
> > This patch fixes the issue by resetting the database when it
> > receives the monitor reply (before processing it). so that
> > reset and repopulation of the db happens in the same state.
> >
> > This approach still has a very small window for data loss if the
> > function process_notification() which processes the monitor
> > reply fails for some reason.
> >
> Hi Numan,
>
> Thanks for the patch. As discussed, this method alleviates the problem
> instead of fixing the problem. So I'd suggest update the title and commit
> message.
>
> I'd like to point out that it is not only in failure scenario of
> process_notification, but also normal scenarios when data size is very big
> and process_notification() for the huge data takes time. During this window
> if the process is killed for whatever reason, the data is lost on this
> node. I'd also like to emphasize that the patch really helps in such
> scenario, because the original implementation would block at receiving that
> big amount of data from master, which would leave the data wiped-out state
> for much longer.
>
>
Hi Han, Thanks for the review and the suggestions. I agree. I submitted v2
- https://patchwork.ozlabs.org/patch/968638/ with the updates in commit
message and some comments in the code.

Thanks
Numan

Other than the commit message:
> Acked-by: Han Zhou <[email protected]>
>
> > Reported-by: Han Zhou <[email protected]>
> > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
> > Tested-by: aginwala <[email protected]>
> > Signed-off-by: Numan Siddique <[email protected]>
> > ---
> >  ovsdb/replication.c | 20 ++++++--------------
> >  1 file changed, 6 insertions(+), 14 deletions(-)
> >
> > diff --git a/ovsdb/replication.c b/ovsdb/replication.c
> > index 2b9ae2f83..44428a48e 100644
> > --- a/ovsdb/replication.c
> > +++ b/ovsdb/replication.c
> > @@ -299,19 +299,7 @@ replication_run(void)
> >                  /* After receiving schemas, reset the local databases
> that
> >                   * will be monitored and send out monitor requests for
> them. */
> >                  if (hmap_is_empty(&request_ids)) {
> > -                    struct shash_node *node, *next;
> > -
> > -                    SHASH_FOR_EACH_SAFE (node, next, replication_dbs) {
> > -                        db = node->data;
> > -                        error = reset_database(db);
> > -                        if (error) {
> > -                            const char *db_name = db->schema->name;
> > -                            shash_find_and_delete(replication_dbs,
> db_name);
> > -                            ovsdb_error_assert(error);
> > -                            VLOG_WARN("Failed to reset database, "
> > -                                      "%s not replicated.", db_name);
> > -                        }
> > -                    }
> > +                    struct shash_node *node;
> >
> >                      if (shash_is_empty(replication_dbs)) {
> >                          VLOG_WARN("Nothing to replicate.");
> > @@ -335,7 +323,11 @@ replication_run(void)
> >              case RPL_S_MONITOR_REQUESTED: {
> >                  /* Reply to monitor requests. */
> >                  struct ovsdb_error *error;
> > -                error = process_notification(msg->result, db);
> > +                VLOG_INFO("Monitor request received. Resetting the
> database");
> > +                error = reset_database(db);
> > +                if (!error) {
> > +                    error = process_notification(msg->result, db);
> > +                }
> >                  if (error) {
> >                      ovsdb_error_assert(error);
> >                      state = RPL_S_ERR;
> > --
> > 2.17.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to