On 12/13/22 14:56, Ilya Maximets wrote:
> On 12/13/22 13:21, Dumitru Ceara wrote:
>> On 12/13/22 11:57, Ilya Maximets wrote:
>>> On 12/12/22 20:42, Ilya Maximets wrote:
>>>> On 12/8/22 16:21, Dumitru Ceara wrote:
>>>>> Applications request specific monitor conditions via the
>>>>> ovsdb_cs_set_condition() API. This returns the condition sequence
>>>>> number at which the client can assume that the server acknowledged
>>>>> and applied the new monitor condition.
>>>>>
>>>>> A corner case is when the client's first request is to set the condition
>>>>> to a value that's equivalent to the default one ("<true>"). In such
>>>>> cases, because it's requested explicitly by the client, we now
>>>>> explicitly send the monitor_cond_change request to the server. This
>>>>> avoids cases in which the expected condition sequence number and the
>>>>> acked condition sequence number get out of sync.
>>>>
>>>> Sorry, I think I'm lost again. What exactly we're trying to fix here?
>>>> [True] is a default condition if no other conditions are supplied for
>>>> the table. Why do we need to send it explicitly?
>>>>
>>
>> The problem is that there's a discrepancy between the returned expected
>> condition seqno and what happens if a condition is explicitly set to
>> [True] before vs after the initial connection happened.
>>
>>>> At first I thought that newly created condition from
>>>> ovsdb_cs_db_get_table()
>>>> has to be sent out, but... if it is the first time we're setting
>>>> conditions for that table, default condition should be [True], hence
>>>> equal to the condition we're trying to set.
>>>>
>>>>
>>>> The test case is a bit synthetic as it doesn't seem to test the issue
>>>> that exists in OVN. The fact that python IDL passes the test without
>>>> changes also doesn't make a lot of sense to me.
>>>>
>>
>> Actually it should test exactly what's happenning in OVN. That is:
>>
>> - connect to database, no explicit monitor condition set
>> - get initial contents
>> - explicitly set monitor condition to [True] and get seqno S at
>> which the condition should be "acked".
>> - expect that ovsdb_idl_get_condition_seqno() eventually reaches S
>>
>> The Python IDL behaves differently unfortunately, I'll give more details
>> below.
>>
>>>> Could you provide some more data on what exactly is going on and what
>>>> we're trying to fix? With jsonrpc logs, if possible.
>>>
>>
>> For the C client, in the ovn-controller case, let's focus on 2 tables
>> Chassis_Private and Chassis_Template_Var. The difference between these
>> 2 is that Chassis_Template_Var was recently added to the schema so
>> ovn-controller only sets its monitor condition if
>> sbrec_server_has_chassis_template_var_table() is true. So that's only
>> after the remote schema was received.
>>
>> Sorry for the wide table below but this is the best way I could find
>> that should explain what's happening:
>>
>> operation/event reconnect-state
>> ovsdb-cs-state Chassis_Private-new_cond
>> Chassis_Template_Var-new_cond cond-seqno expected-cond-seqno
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> 1. set_cond(Chassis_Private, [True])
>> BACKOFF
>> SERVER_SCHEMA_REQUESTED [True]
>> non-existent 0 1
>
>
> I think, this very first operation/event is wrong.
> The condition change is never going to be sent, but the expected
> sequence number is reported as 1. The only reason this doesn't
> cause any problems right away is that we're going to reconnect
> here and drop the state, so the application will not rely on that
> number in the end. But it is still an incorrect return value.
>
> The change that I suggested should fix that fundamental issue.
>
It should, yes.
> <snip>
>
>>>
>>> Should we do this instead:
>>>
>>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
>>> index a6fbd290c..0fca03d72 100644
>>> --- a/lib/ovsdb-cs.c
>>> +++ b/lib/ovsdb-cs.c
>>> @@ -892,7 +892,7 @@ ovsdb_cs_db_get_table(struct ovsdb_cs_db *db, const
>>> char *table)
>>>
>>> t = xzalloc(sizeof *t);
>>> t->name = xstrdup(table);
>>> - t->new_cond = json_array_create_1(json_boolean_create(true));
>>> + t->ack_cond = json_array_create_1(json_boolean_create(true));
>>> hmap_insert(&db->tables, &t->hmap_node, hash);
>>> return t;
>>> }
>>> ---
>>>
>>> ?
>>>
>>> This breaks a few tests, but it seems to be a correct change because it
>>> is indeed a default condition that server already has for the table.
>>>
>>> What do you think? Will that fix the OVN oproblem?
>>>
>>
>> That would fix the OVN problem I think. I'm not sure how to rewrite the
>> "[track, simple idl, initially populated, strong references, conditional]"
>> test however.
>
> Something like this should fix it:
>
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 57ed0ac9d..1846cb521 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1585,10 +1585,12 @@ OVSDB_CHECK_IDL([simple idl, initially populated,
> strong references, conditional
> {"op": "insert",
> "table": "simple",
> "row": {"s": "row0_s"}}]']],
> - [[000: change conditions
> + [[000: simple3: conditions unchanged
> +000: simple4: conditions unchanged
> +000: simple: conditions unchanged
> 001: table simple3: name=row0_s3 uset=[] uref=[<0>] uuid=<1>
> 001: table simple4: name=row0_s4 uuid=<0>
> -002: change conditions
> +002: simple4: change conditions
> 003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>
> 004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]}
> 005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>
> diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
> index 23cfd79a4..1bc5ac17a 100644
> --- a/tests/test-ovsdb.c
> +++ b/tests/test-ovsdb.c
> @@ -2628,7 +2628,7 @@ parse_link2_json_clause(struct ovsdb_idl_condition
> *cond,
> }
>
> static unsigned int
> -update_conditions(struct ovsdb_idl *idl, char *commands)
> +update_conditions(struct ovsdb_idl *idl, char *commands, int step)
> {
> const struct ovsdb_idl_table_class *tc;
> unsigned int next_cond_seqno = 0;
> @@ -2683,7 +2683,10 @@ update_conditions(struct ovsdb_idl *idl, char
> *commands)
> unsigned int seqno = ovsdb_idl_get_condition_seqno(idl);
> unsigned int next_seqno = ovsdb_idl_set_condition(idl, tc, &cond);
> if (seqno == next_seqno ) {
> - ovs_fatal(0, "condition unchanged");
> + print_and_log("%03d: %s: conditions unchanged",
> + step, table_name);
> + } else {
> + print_and_log("%03d: %s: change conditions", step, table_name);
> }
> unsigned int new_next_seqno = ovsdb_idl_set_condition(idl, tc,
> &cond);
> if (next_seqno != new_next_seqno) {
> @@ -2740,8 +2743,7 @@ do_idl(struct ovs_cmdl_context *ctx)
> const char cond_s[] = "condition ";
> if (ctx->argc > 2 && strstr(ctx->argv[2], cond_s)) {
> next_cond_seqno =
> - update_conditions(idl, ctx->argv[2] + strlen(cond_s));
> - print_and_log("%03d: change conditions", step++);
> + update_conditions(idl, ctx->argv[2] + strlen(cond_s), step++);
> i = 3;
> } else {
> i = 2;
> @@ -2809,8 +2811,8 @@ do_idl(struct ovs_cmdl_context *ctx)
> arg + strlen(remote_s),
> ovsdb_idl_is_connected(idl) ? "true" : "false");
> } else if (!strncmp(arg, cond_s, strlen(cond_s))) {
> - next_cond_seqno = update_conditions(idl, arg + strlen(cond_s));
> - print_and_log("%03d: change conditions", step++);
> + next_cond_seqno = update_conditions(idl, arg + strlen(cond_s),
> + step++);
> } else if (arg[0] != '[') {
> if (!idl_set(idl, arg, step++)) {
> /* If idl_set() returns false, then no transaction
> ---
>
> I.e. just don't crash if canditions didn't change, because that is expected.
>
>
Ok.
>>
>> On the other had, I had something else in mind: what if we just do the
>> same as the python IDL? That is, create all ovsdb-cs tables immediately
>> after the cs was created. Like that we we don't change any other IDL
>> behavior. I'm also a bit worried about fundamental changes given
>> that we need this change in OVN before the release that should happen
>> this week.
>>
>> My suggested change (on top of master) is a bit larger.. Let me know
>> what you think.
>
> I'm not sure about that. This doesn't change the fundamental issue, IMO,
> but masks it out.
>
I'm ok either way in the end. Do you plan to post a formal patch with
your version or should I?
Thanks,
Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev