On 12/13/22 14:56, Ilya Maximets wrote:
> On 12/13/22 13:21, Dumitru Ceara wrote:
>> On 12/13/22 11:57, Ilya Maximets wrote:
>>> On 12/12/22 20:42, Ilya Maximets wrote:
>>>> On 12/8/22 16:21, Dumitru Ceara wrote:
>>>>> Applications request specific monitor conditions via the
>>>>> ovsdb_cs_set_condition() API.  This returns the condition sequence
>>>>> number at which the client can assume that the server acknowledged
>>>>> and applied the new monitor condition.
>>>>>
>>>>> A corner case is when the client's first request is to set the condition
>>>>> to a value that's equivalent to the default one ("<true>").  In such
>>>>> cases, because it's requested explicitly by the client, we now
>>>>> explicitly send the monitor_cond_change request to the server.  This
>>>>> avoids cases in which the expected condition sequence number and the
>>>>> acked condition sequence number get out of sync.
>>>>
>>>> Sorry, I think I'm lost again.  What exactly we're trying to fix here?
>>>> [True] is a default condition if no other conditions are supplied for
>>>> the table.  Why do we need to send it explicitly?
>>>>
>>
>> The problem is that there's a discrepancy between the returned expected
>> condition seqno and what happens if a condition is explicitly set to
>> [True] before vs after the initial connection happened.
>>
>>>> At first I thought that newly created condition from 
>>>> ovsdb_cs_db_get_table()
>>>> has to be sent out, but...  if it is the first time we're setting
>>>> conditions for that table, default condition should be [True], hence
>>>> equal to the condition we're trying to set.
>>>>
>>>>
>>>> The test case is a bit synthetic as it doesn't seem to test the issue
>>>> that exists in OVN.  The fact that python IDL passes the test without
>>>> changes also doesn't make a lot of sense to me.
>>>>
>>
>> Actually it should test exactly what's happenning in OVN.  That is:
>>
>> - connect to database, no explicit monitor condition set
>> - get initial contents
>> - explicitly set monitor condition to [True] and get seqno S at
>>   which the condition should be "acked".
>> - expect that ovsdb_idl_get_condition_seqno() eventually reaches S
>>
>> The Python IDL behaves differently unfortunately, I'll give more details
>> below.
>>
>>>> Could you provide some more data on what exactly is going on and what
>>>> we're trying to fix?  With jsonrpc logs, if possible.
>>>
>>
>> For the C client, in the ovn-controller case, let's focus on 2 tables
>> Chassis_Private and Chassis_Template_Var.  The difference between these
>> 2 is that Chassis_Template_Var was recently added to the schema so
>> ovn-controller only sets its monitor condition if
>> sbrec_server_has_chassis_template_var_table() is true.  So that's only
>> after the remote schema was received.
>>
>> Sorry for the wide table below but this is the best way I could find
>> that should explain what's happening:
>>
>>       operation/event                        reconnect-state                
>> ovsdb-cs-state                  Chassis_Private-new_cond     
>> Chassis_Template_Var-new_cond  cond-seqno   expected-cond-seqno
>>       
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> 1.    set_cond(Chassis_Private, [True])
>>                                                  BACKOFF                   
>> SERVER_SCHEMA_REQUESTED                 [True]                      
>> non-existent                    0                 1
> 
> 
> I think, this very first operation/event is wrong.
> The condition change is never going to be sent, but the expected
> sequence number is reported as 1.  The only reason this doesn't
> cause any problems right away is that we're going to reconnect
> here and drop the state, so the application will not rely on that
> number in the end.  But it is still an incorrect return value.
> 
> The change that I suggested should fix that fundamental issue.
> 

It should, yes.

> <snip>
> 
>>>
>>> Should we do this instead:
>>>
>>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
>>> index a6fbd290c..0fca03d72 100644
>>> --- a/lib/ovsdb-cs.c
>>> +++ b/lib/ovsdb-cs.c
>>> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const 
>>> char *table)
>>>  
>>>      t = xzalloc(sizeof *t);
>>>      t->name = xstrdup(table);
>>> -    t->new_cond = json_array_create_1(json_boolean_create(true));
>>> +    t->ack_cond = json_array_create_1(json_boolean_create(true));
>>>      hmap_insert(&db->tables, &t->hmap_node, hash);
>>>      return t;
>>>  }
>>> ---
>>>
>>> ?
>>>
>>> This breaks a few tests, but it seems to be a correct change because it
>>> is indeed a default condition that server already has for the table.
>>>
>>> What do you think?  Will that fix the OVN oproblem?
>>>
>>
>> That would fix the OVN problem I think.  I'm not sure how to rewrite the
>> "[track, simple idl, initially populated, strong references, conditional]"
>> test however.
> 
> Something like this should fix it:
> 
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 57ed0ac9d..1846cb521 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1585,10 +1585,12 @@ OVSDB_CHECK_IDL([simple idl, initially populated, 
> strong references, conditional
>        {"op": "insert",
>         "table": "simple",
>         "row": {"s": "row0_s"}}]']],
> -  [[000: change conditions
> +  [[000: simple3: conditions unchanged
> +000: simple4: conditions unchanged
> +000: simple: conditions unchanged
>  001: table simple3: name=row0_s3 uset=[] uref=[<0>] uuid=<1>
>  001: table simple4: name=row0_s4 uuid=<0>
> -002: change conditions
> +002: simple4: change conditions
>  003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>
>  004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]}
>  005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>
> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
> index 23cfd79a4..1bc5ac17a 100644
> --- a/tests/test-ovsdb.c
> +++ b/tests/test-ovsdb.c
> @@ -2628,7 +2628,7 @@ parse_link2_json_clause(struct ovsdb_idl_condition 
> *cond,
>  }
>  
>  static unsigned int
> -update_conditions(struct ovsdb_idl *idl, char *commands)
> +update_conditions(struct ovsdb_idl *idl, char *commands, int step)
>  {
>      const struct ovsdb_idl_table_class *tc;
>      unsigned int next_cond_seqno = 0;
> @@ -2683,7 +2683,10 @@ update_conditions(struct ovsdb_idl *idl, char 
> *commands)
>          unsigned int seqno = ovsdb_idl_get_condition_seqno(idl);
>          unsigned int next_seqno = ovsdb_idl_set_condition(idl, tc, &cond);
>          if (seqno == next_seqno ) {
> -            ovs_fatal(0, "condition unchanged");
> +            print_and_log("%03d: %s: conditions unchanged",
> +                          step, table_name);
> +        } else {
> +            print_and_log("%03d: %s: change conditions", step, table_name);
>          }
>          unsigned int new_next_seqno = ovsdb_idl_set_condition(idl, tc, 
> &cond);
>          if (next_seqno != new_next_seqno) {
> @@ -2740,8 +2743,7 @@ do_idl(struct ovs_cmdl_context *ctx)
>      const char cond_s[] = "condition ";
>      if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) {
>          next_cond_seqno =
> -            update_conditions(idl, ctx->argv[2] + strlen(cond_s));
> -        print_and_log("%03d: change conditions", step++);
> +            update_conditions(idl, ctx->argv[2] + strlen(cond_s), step++);
>          i = 3;
>      } else {
>          i = 2;
> @@ -2809,8 +2811,8 @@ do_idl(struct ovs_cmdl_context *ctx)
>                            arg + strlen(remote_s),
>                            ovsdb_idl_is_connected(idl) ? "true" : "false");
>          }  else if (!strncmp(arg, cond_s, strlen(cond_s))) {
> -            next_cond_seqno = update_conditions(idl, arg + strlen(cond_s));
> -            print_and_log("%03d: change conditions", step++);
> +            next_cond_seqno = update_conditions(idl, arg + strlen(cond_s),
> +                                                step++);
>          } else if (arg[0] != '[') {
>              if (!idl_set(idl, arg, step++)) {
>                  /* If idl_set() returns false, then no transaction
> ---
> 
> I.e. just don't crash if canditions didn't change, because that is expected.
> 
> 

Ok.

>>
>> On the other had, I had something else in mind: what if we just do the
>> same as the python IDL?  That is, create all ovsdb-cs tables immediately
>> after the cs was created.  Like that we we don't change any other IDL
>> behavior.  I'm also a bit worried about fundamental changes given
>> that we need this change in OVN before the release that should happen
>> this week.
>>
>> My suggested change (on top of master) is a bit larger..  Let me know
>> what you think.
> 
> I'm not sure about that.  This doesn't change the fundamental issue, IMO,
> but masks it out.
> 

I'm ok either way in the end.  Do you plan to post a formal patch with
your version or should I?

Thanks,
Dumitru


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to