On 12/13/22 11:57, Ilya Maximets wrote:
> On 12/12/22 20:42, Ilya Maximets wrote:
>> On 12/8/22 16:21, Dumitru Ceara wrote:
>>> Applications request specific monitor conditions via the
>>> ovsdb_cs_set_condition() API.  This returns the condition sequence
>>> number at which the client can assume that the server acknowledged
>>> and applied the new monitor condition.
>>>
>>> A corner case is when the client's first request is to set the condition
>>> to a value that's equivalent to the default one ("<true>").  In such
>>> cases, because it's requested explicitly by the client, we now
>>> explicitly send the monitor_cond_change request to the server.  This
>>> avoids cases in which the expected condition sequence number and the
>>> acked condition sequence number get out of sync.
>>
>> Sorry, I think I'm lost again.  What exactly we're trying to fix here?
>> [True] is a default condition if no other conditions are supplied for
>> the table.  Why do we need to send it explicitly?
>>

The problem is that there's a discrepancy between the returned expected
condition seqno and what happens if a condition is explicitly set to
[True] before vs after the initial connection happened.

>> At first I thought that newly created condition from ovsdb_cs_db_get_table()
>> has to be sent out, but...  if it is the first time we're setting
>> conditions for that table, default condition should be [True], hence
>> equal to the condition we're trying to set.
>>
>>
>> The test case is a bit synthetic as it doesn't seem to test the issue
>> that exists in OVN.  The fact that python IDL passes the test without
>> changes also doesn't make a lot of sense to me.
>>

Actually it should test exactly what's happenning in OVN.  That is:

- connect to database, no explicit monitor condition set
- get initial contents
- explicitly set monitor condition to [True] and get seqno S at
  which the condition should be "acked".
- expect that ovsdb_idl_get_condition_seqno() eventually reaches S

The Python IDL behaves differently unfortunately, I'll give more details
below.

>> Could you provide some more data on what exactly is going on and what
>> we're trying to fix?  With jsonrpc logs, if possible.
> 

For the C client, in the ovn-controller case, let's focus on 2 tables
Chassis_Private and Chassis_Template_Var.  The difference between these
2 is that Chassis_Template_Var was recently added to the schema so
ovn-controller only sets its monitor condition if
sbrec_server_has_chassis_template_var_table() is true.  So that's only
after the remote schema was received.

Sorry for the wide table below but this is the best way I could find
that should explain what's happening:

      operation/event                        reconnect-state                
ovsdb-cs-state                  Chassis_Private-new_cond     
Chassis_Template_Var-new_cond  cond-seqno   expected-cond-seqno
      
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1.    set_cond(Chassis_Private, [True])
                                                 BACKOFF                   
SERVER_SCHEMA_REQUESTED                 [True]                      
non-existent                    0                 1
2.    reconnect->CONNECTING
      - ovsdb_cs_db_sync_condition(OVN_Southbound) is called and
        ack_all == true
      - this will move all ovsdb_cs table->req_cond to table->new_cond;
        ovsdb_cs only knows of the existence of the Chassis_Private
        table.
                                                 CONNECTING                
SERVER_SCHEMA_REQUESTED                   nil                       
non-existent                    0                 1

# new event loop iteration
3.    set_cond(Chassis_Private, [True])
                                                 CONNECTING                
SERVER_SCHEMA_REQUESTED                   nil                       
non-existent                    0                 0
4.    reconnect->CONNECTED
      - ovsdb_cs requests server schema
      jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="get_schema", 
params=["_Server"], id=7
                                                 CONNECTED                 
SERVER_SCHEMA_REQUESTED                   nil                       
non-existent                    0                 0

# new event loop iteration
5.    set_cond(Chassis_Private, [True])
                                                 CONNECTED                 
SERVER_SCHEMA_REQUESTED                   nil                       
non-existent                    0                 0
6.    ovsdb_cs receives server schema:
      jsonrpc|DBG|ssl:127.0.0.1:6642: received reply, 
result={...,"name":"_Server",...}
      - ovsdb_cs sends monitor_cond for the server db:
      jsonrpc|DBG|ssl:127.0.0.1:6642: send request, method="monitor_cond", 
params=["_Server",...]
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                       
non-existent                    0                 0

# new event loop iteration
7.    set_cond(Chassis_Private, [True])
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                       
non-existent                    0                 0
8.    ovsdb_cs receives database schema:
      jsonrpc|DBG|ssl:127.0.0.1:6642: received reply, 
result={"Database":{"1eaa91c6-771d-4456-93d1-23fe3e559a05":{"initial":{"schema":"{...\"name\":\"OVN_Southbound\",
      - at this point the Chassis_Template_Var table becomes "available".
      - ovsdb_cs issues a monitor_cond_since with the current condition
        (everything default, no "where" clause):
      jsonrpc|DBG|ssl:127.0.0.1:6642: send request, 
method="monitor_cond_since", params=["OVN_Southbound"...
      - ovsdb_cs issues a set_db_change_aware request:
      jsonrpc|DBG|ssl:127.0.0.1:6642: send request, 
method="set_db_change_aware", params=[true]
                                                 CONNECTED                 
DATA_MONITOR_COND_SINCE_REQUESTED         nil                       
non-existent                    0                 0

# new event loop iteration
9.    set_cond(Chassis_Private, [True])
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                       
non-existent                    0                 0
10.   set_cond(Chassis_Template_Var, [True])
      - the table didn't exist in ovsdb-cs so it's created, with new_cond
        set to [True]
      - because new_cond is the only condition set and the requested
        condition is also [True] then db->cond_changed is not touched
        so no monitor_cond_change request will be sent
      - but the expected cond seqno (returned value) is computed as 1
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                            [True] 
                    0                 1

# new event loop iteration
9.    set_cond(Chassis_Private, [True])
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                             nil   
                    0                 1
10.   set_cond(Chassis_Template_Var, [True])
      - nothing changed so expected cond seqno stays 1 and
        db->cond_changed is still false
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                            [True] 
                    0                 1

At this point we have a discrepancy between the expected cond seqno and the
actual cond seqno.  This causes an issue because ovn-controller relies on
these being equal for determining if all monitor condition changes are
up to date.  Moreover, when ovn-monitor-all is set to True, no other
condition changes will be triggered by ovn-controller.  So we stay in
this broken state for ever.

Now, for the Python IDL part, the main difference there is in step "2":

2.    reconnect->CONNECTING
      - Idl.sync_conditions() is called and ack_all == true
      - this will move all IDL table->req_cond to table->new_cond;
        This applies to ALL tables the client registered for, including
        the Chassis_Template_Var table.
                                                 CONNECTING                
SERVER_SCHEMA_REQUESTED                   nil                             nil   
                    0                 1

So steps 9 and 10 would look like:
9.    set_cond(Chassis_Private, [True])
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                             nil   
                    0                 0
10.   set_cond(Chassis_Template_Var, [True])
      - nothing really changed, we shouldn't send anything, seqnos stay 0
                                                 CONNECTED                 
SERVER_MONITOR_REQUESTED                  nil                             nil   
                    0                 0

> 
> Should we do this instead:
> 
> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
> index a6fbd290c..0fca03d72 100644
> --- a/lib/ovsdb-cs.c
> +++ b/lib/ovsdb-cs.c
> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char 
> *table)
>  
>      t = xzalloc(sizeof *t);
>      t->name = xstrdup(table);
> -    t->new_cond = json_array_create_1(json_boolean_create(true));
> +    t->ack_cond = json_array_create_1(json_boolean_create(true));
>      hmap_insert(&db->tables, &t->hmap_node, hash);
>      return t;
>  }
> ---
> 
> ?
> 
> This breaks a few tests, but it seems to be a correct change because it
> is indeed a default condition that server already has for the table.
> 
> What do you think?  Will that fix the OVN oproblem?
> 

That would fix the OVN problem I think.  I'm not sure how to rewrite the
"[track, simple idl, initially populated, strong references, conditional]"
test however.

On the other had, I had something else in mind: what if we just do the
same as the python IDL?  That is, create all ovsdb-cs tables immediately
after the cs was created.  Like that we we don't change any other IDL
behavior.  I'm also a bit worried about fundamental changes given
that we need this change in OVN before the release that should happen
this week.

My suggested change (on top of master) is a bit larger..  Let me know
what you think.

Thanks,
Dumitru

---
diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
index a6fbd290c87d..7dd07c75eebe 100644
--- a/lib/ovsdb-cs.c
+++ b/lib/ovsdb-cs.c
@@ -815,6 +815,8 @@ struct ovsdb_cs_db_table {
     struct json *ack_cond; /* Last condition acked by the server. */
     struct json *req_cond; /* Last condition requested to the server. */
     struct json *new_cond; /* Latest condition set by the IDL client. */
+    bool in_server_schema; /* Indicates if this table is in the server schema
+                            * or not. */
 };
 
 /* A kind of condition, so that we can treat equivalent JSON as equivalent. */
@@ -876,6 +878,17 @@ condition_clone(const struct json *condition)
     OVS_NOT_REACHED();
 }
 
+static struct ovsdb_cs_db_table *
+ovsdb_cs_db_add_table(struct ovsdb_cs_db *db, const char *table)
+{
+    struct ovsdb_cs_db_table *t = xzalloc(sizeof *t);
+
+    t->name = xstrdup(table);
+    t->new_cond = json_array_create_1(json_boolean_create(true));
+    hmap_insert(&db->tables, &t->hmap_node, hash_string(table, 0));
+    return t;
+}
+
 /* Returns the ovsdb_cs_db_table associated with 'table' in 'db', creating an
  * empty one if necessary. */
 static struct ovsdb_cs_db_table *
@@ -890,11 +903,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const char 
*table)
         }
     }
 
-    t = xzalloc(sizeof *t);
-    t->name = xstrdup(table);
-    t->new_cond = json_array_create_1(json_boolean_create(true));
-    hmap_insert(&db->tables, &t->hmap_node, hash);
-    return t;
+    return ovsdb_cs_db_add_table(db, table);
 }
 
 static void
@@ -953,6 +962,21 @@ ovsdb_cs_db_set_condition(struct ovsdb_cs_db *db, const 
char *table,
     }
 }
 
+void
+ovsdb_cs_set_table_in_schema(struct ovsdb_cs *cs, const char *table,
+                             bool in_schema)
+{
+    struct ovsdb_cs_db_table *t = ovsdb_cs_db_get_table(&cs->data, table);
+
+    t->in_server_schema = in_schema;
+}
+
+bool
+ovsdb_cs_get_table_in_schema(struct ovsdb_cs *cs, const char *table)
+{
+    return ovsdb_cs_db_get_table(&cs->data, table)->in_server_schema;
+}
+
 /* Sets the replication condition for 'tc' in 'cs' to 'condition' and arranges
  * to send the new condition to the database server.
  *
@@ -1004,6 +1028,10 @@ ovsdb_cs_db_compose_cond_change(struct ovsdb_cs_db *db)
     struct json *monitor_cond_change_requests = NULL;
     struct ovsdb_cs_db_table *table;
     HMAP_FOR_EACH (table, hmap_node, &db->tables) {
+        if (!table->in_server_schema) {
+            continue;
+        }
+
         /* Always use the most recent conditions set by the CS client when
          * requesting monitor_cond_change, i.e., table->new_cond.
          */
@@ -1483,6 +1511,9 @@ ovsdb_cs_send_monitor_request(struct ovsdb_cs *cs, struct 
ovsdb_cs_db *db,
     if (version > 1) {
         struct ovsdb_cs_db_table *table;
         HMAP_FOR_EACH (table, hmap_node, &db->tables) {
+            if (!table->in_server_schema) {
+                continue;
+            }
             if (table->ack_cond) {
                 struct json *mr = shash_find_data(json_object(mrs),
                                                   table->name);
diff --git a/lib/ovsdb-cs.h b/lib/ovsdb-cs.h
index 5d5b58f0a0a6..c3d9e82df5fd 100644
--- a/lib/ovsdb-cs.h
+++ b/lib/ovsdb-cs.h
@@ -133,6 +133,10 @@ int ovsdb_cs_get_last_error(const struct ovsdb_cs *);
 
 void ovsdb_cs_set_probe_interval(const struct ovsdb_cs *, int probe_interval);
 
+void ovsdb_cs_set_table_in_schema(struct ovsdb_cs *, const char *table,
+                                  bool in_schema);
+bool ovsdb_cs_get_table_in_schema(struct ovsdb_cs *, const char *table);
+
 /* Conditional monitoring (specifying that only rows matching particular
  * criteria should be monitored).
  *
diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 8d2b7d6b9140..c0e1d1f4345d 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -125,8 +125,6 @@ struct ovsdb_idl_table {
     struct hmap rows;        /* Contains "struct ovsdb_idl_row"s. */
     struct ovsdb_idl *idl;   /* Containing IDL instance. */
     unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
-    bool in_server_schema;   /* Indicates if this table is in the server schema
-                              * or not. */
     struct ovs_list indexes;    /* Contains "struct ovsdb_idl_index"s */
     struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
 };
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index dbdfe45d87ea..ead4a5857d73 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -295,8 +295,8 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class 
*class,
             = table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
             = table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
         table->idl = idl;
-        table->in_server_schema = false;
         sset_init(&table->schema_columns);
+        ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false);
     }
 
     return idl;
@@ -807,10 +807,10 @@ ovsdb_idl_compose_monitor_request(const struct json 
*schema_json, void *idl_)
                           "(database needs upgrade?)",
                           idl->class_->database, table->class_->name);
                 json_destroy(columns);
-                table->in_server_schema = false;
+                ovsdb_cs_set_table_in_schema(idl->cs, tc->name, false);
                 continue;
             } else if (schema && table_schema) {
-                table->in_server_schema = true;
+                ovsdb_cs_set_table_in_schema(idl->cs, tc->name, true);
             }
 
             monitor_request = json_object_create();
@@ -964,7 +964,7 @@ ovsdb_idl_server_has_table(const struct ovsdb_idl *idl,
     const struct ovsdb_idl_table *table =
         ovsdb_idl_table_from_class(idl, table_class);
 
-    return (table && table->in_server_schema);
+    return (table && ovsdb_cs_get_table_in_schema(idl->cs, table_class->name));
 }
 
 /* Returns 'true' if the 'idl' has seen the 'column' in the schema
@@ -984,8 +984,8 @@ ovsdb_idl_server_has_column(const struct ovsdb_idl *idl,
     const struct ovsdb_idl_table *table =
         ovsdb_idl_table_from_column(idl, column);
 
-    return (table->in_server_schema && sset_find(&table->schema_columns,
-                                                 column->name));
+    return (ovsdb_cs_get_table_in_schema(idl->cs, table->class_->name)
+            && sset_find(&table->schema_columns, column->name));
 }
 ^L
 /* A single clause within an ovsdb_idl_condition. */
---

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to