On 10/19/20 8:22 PM, Han Zhou wrote:
> 
> 
> On Mon, Oct 12, 2020 at 7:32 AM Mark Gray <[email protected] 
> <mailto:[email protected]>> wrote:
>>
>> On 10/10/2020 16:42, Mark Gray wrote:
>> > On 08/10/2020 19:57, Dumitru Ceara wrote:
>> >> On 10/2/20 1:13 AM, Han Zhou wrote:
>> >>>
>> >>>
>> >>> On Thu, Oct 1, 2020 at 2:30 AM Mark Gray <[email protected] 
>> >>> <mailto:[email protected]>
>> >>> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >>>>
>> >>>> On 30/09/2020 21:04, Han Zhou wrote:
>> >>>>> On Wed, Sep 30, 2020 at 2:21 AM Dumitru Ceara <[email protected] 
>> >>>>> <mailto:[email protected]>
>> >>> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >>>>>>
>> >>>>>> On 9/29/20 8:34 PM, Mark Gray wrote:
>> >>>>>>> Currently all functions of the type *_is_new() always return
>> >>>>>>> 'false'. This patch resolves this issue by using the
>> >>>>>>> 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the
>> >>>>>>> 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row
>> >>>>>>> is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT'
>> >>>>>>> 'change_seqno' on clear.
>> >>>>>>>
>> >>>>>>> Further to this, the code is also updated to match this
>> >>>>>>> behaviour:
>> >>>>>>>
>> >>>>>>> When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT'
>> >>>>>>> 'change_seqno' is updated to match the new database
>> >>>>>>> change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno'
>> >>>>>>> is not set for inserted rows (only for updated rows).
>> >>>>>>>
>> >>>>>>> At the end of a run, ovsdb_idl_db_track_clear() should be
>> >>>>>>> called to clear all tracking information, this includes
>> >>>>>>> resetting all row 'change_seqno' to zero. This will ensure
>> >>>>>>> that subsequent runs will not see a previously 'new' row.
>> >>>>>>>
>> >>>>>>> add_tracked_change_for_references() is updated to only
>> >>>>>>> track rows that reference the current row.
>> >>>>>>>
>> >>>>>>> Signed-off-by: Mark Gray <[email protected] 
>> >>>>>>> <mailto:[email protected]>
>> >>> <mailto:[email protected] <mailto:[email protected]>>>
>> >>>>>>> Suggested-by: Dumitru Ceara <[email protected] 
>> >>>>>>> <mailto:[email protected]>
>> >>> <mailto:[email protected] <mailto:[email protected]>>>
>> >>>>>>> Reported-at: https://bugzilla.redhat.com/1883562
>> >>>>>>
>> >>>>>> Hi Mark,
>> >>>>>>
>> >>>>>> Thanks for working on this, it definitely looks like a nasty bug to 
>> >>>>>> me!
>> >>>>>>
>> >>>>>> We were lucky to not hit it in ovn-controller before. That's just
>> >>>>>> because all the "update" operations were handled there as "delete" +
>> >>>>>> "insert" but that is not mandatory and might change.>>
>> >>>>>
>> >>>>> Agree. Thanks for this critical finding! It was my bad - commit
>> >>> ca545a787ac
>> >>>>> introduced this bug, which was fixing another bug related to is_new. I
>> >>>>> should have added test cases to avoid the miss. Apologize.
>> >>>>
>> >>>> No worries, I will reference this commit in the commit message.
>> >>>>>
>> >>>>> For this fix, I don't understand why we need to check
>> >>>>> OVSDB_IDL_CHANGE_INSERT. I think the root cause is that in
>> >>>>> add_tracked_change_for_references() it updated the
>> >>> OVSDB_IDL_CHANGE_MODIFY
>> >>>>> seqno for the initial inserted *new* row, while it shouldn't. So
>> >>> should the
>> >>>>> fix only include the changes in add_tracked_change_for_references() and
>> >>>>> ovsdb_idl_row_change__()? Why do we need to change the implementation 
>> >>>>> of
>> >>>>> _is_new() itself? Could you explain?
>> >>>>
>> >>>> In the case of a newly inserted row, the change_seqno triplet will still
>> >>>> be {X, 0, 0}. It will remain as such until the row is modified, at which
>> >>>> point it will become {X, Y, 0}. This means that the _*is_new() function
>> >>>> will continue to return true until the row is modified which is not
>> >>>> correct behaviour.
>> >>>
>> >>> Hmm, this is not a problem because if a row inserted in a previous run
>> >>> is not modified/deleted in the current run, it won't be added to
>> >>> tracking so no one would call _is_new() for the row. However, there will
>> >>> be a problem when a row is deleted (see below).
>> >>>
>> >>>> If we reset the change_seqno triplet on each run, it
>> >>>> will still not work. In the end, it is probably better to use the
>> >>>> OVSDB_IDL_CHANGE_INSERT element to track this.
>> >>>>
>> >>> Indeed, we should use OVSDB_IDL_CHANGE_INSERT to check _is_new() because
>> >>> if row is deleted without any modification after insertion, the
>> >>> OVSDB_IDL_CHANGE_MODIFY seqno would still be zero when it is being
>> >>> deleted. In that case calling the _is_new() function would return true
>> >>> for a deleted row, which is wrong. So, your change LGTM.
>> >>>
>> >>>> On a seperate point, I don't expect add_tracked_change_for_references()
>> >>>> will ever get called on insert as nothing will be referenced at that
>> >>>> stage? Am I correct in my understanding?
>> >>>>
>> >>> Yes, I think so. More accurately, it will get called once on the
>> >>> inserted row but will not trigger any recursive calls.
>> >>>
>> >>>>>>> ---
>> >>>>>>>  lib/ovsdb-idl.c     | 39 +++++++++++++++++++++++++++------------
>> >>>>>>>  ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> <http://ovsdb-idlc.in> | 
>> >>>>>>>  2 +-
>> >>>>>>>  2 files changed, 28 insertions(+), 13 deletions(-)
>> >>>>>>>
>> >>>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>> >>>>>>> index d8f221ca6..3cfd73871 100644
>> >>>>>>> --- a/lib/ovsdb-idl.c
>> >>>>>>> +++ b/lib/ovsdb-idl.c
>> >>>>>>> @@ -1959,6 +1959,11 @@ ovsdb_idl_db_track_clear(struct
>> >>> ovsdb_idl_db *db)
>> >>>>>>>                      free(row->updated);
>> >>>>>>>                      row->updated = NULL;
>> >>>>>>>                  }
>> >>>>>>> +
>> >>>>>>> +                row->change_seqno[OVSDB_IDL_CHANGE_INSERT] =
>> >>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] =
>> >>>>>>> +                    row->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
>> >>>>>>> +
>> >>>>>>>                  ovs_list_remove(&row->track_node);
>> >>>>>>>                  ovs_list_init(&row->track_node);
>> >>>>>>>                  if (ovsdb_idl_row_is_orphan(row) &&
>> >>>>> row->tracked_old_datum) {
>> >>>>>>> @@ -2684,24 +2689,27 @@ ovsdb_idl_process_update2(struct
>> >>>>> ovsdb_idl_table *table,
>> >>>>>>>      return OVSDB_IDL_UPDATE_DB_CHANGED;
>> >>>>>>>  }
>> >>>>>>>
>> >>>>>>> -/* Recursively add rows to tracked change lists for current row
>> >>>>>>> - * and the rows that reference this row. */
>> >>>>>>> +/* Recursively add rows to tracked change lists for all rows that
>> >>>>> reference
>> >>>>>>> +   'row'. */
>> >>>>>>>  static void
>> >>>>>>>  add_tracked_change_for_references(struct ovsdb_idl_row *row)
>> >>>>>>>  {
>> >>>>>>> -    if (ovs_list_is_empty(&row->track_node) &&
>> >>>>>>> -            ovsdb_idl_track_is_set(row->table)) {
>> >>>>>>> -        ovs_list_push_back(&row->table->track_list,
>> >>>>>>> -                           &row->track_node);
>> >>>>>>> -        row->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>> >>>>>>> -            = row->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>> >>>>>>> -            = row->table->db->change_seqno + 1;
>> >>>>>>> -
>> >>>>>>>          const struct ovsdb_idl_arc *arc;
>> >>>>>>>          LIST_FOR_EACH (arc, dst_node, &row->dst_arcs) {
>> >>>>>>> -            add_tracked_change_for_references(arc->src);
>> >>>>>>> +            struct ovsdb_idl_row *ref = arc->src;
>> >>>>>>> +
>> >>>>>>> +            if (ovs_list_is_empty(&ref->track_node) &&
>> >>>>>>> +                ovsdb_idl_track_is_set(ref->table)) {
>> >>>>>>> +                    ovs_list_push_back(&ref->table->track_list,
>> >>>>>>> +                                       &ref->track_node);
>> >>>>>>> +
>> >>>>>>> +                ref->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>> >>>>>>> +                    =
>> >>> ref->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>> >>>>>>> +                    = ref->table->db->change_seqno + 1;
>> >>>>>>> +
>> >>>>>>> +                add_tracked_change_for_references(ref);
>> >>>>>>> +            }
>> >>>>>>>          }
>> >>>>>>> -    }
>> >>>>>>>  }
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> @@ -2767,7 +2775,14 @@ ovsdb_idl_row_change__(struct ovsdb_idl_row
>> >>>>> *row, const struct json *row_json,
>> >>>>>>>                      row->change_seqno[change]
>> >>>>>>>                          = row->table->change_seqno[change]
>> >>>>>>>                          = row->table->db->change_seqno + 1;
>> >>>>>>> +
>> >>>>>>>                      if (table->modes[column_idx] & OVSDB_IDL_TRACK) 
>> >>>>>>> {
>> >>>>>>> +                        if (ovs_list_is_empty(&row->track_node) &&
>> >>>>>>> +                            ovsdb_idl_track_is_set(row->table)) {
>> >>>>>>> +                          
>> >>>  ovs_list_push_back(&row->table->track_list,
>> >>>>>>> +                                               &row->track_node);
>> >>>>>>> +                        }
>> >>>>>>> +
>> >>>>>>>                          add_tracked_change_for_references(row);
>> >>>>>>>                          if (!row->updated) {
>> >>>>>>>                              row->updated =
>> >>>>> bitmap_allocate(class->n_columns);
>> >>>>>>> diff --git a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> 
>> >>>>>>> <http://ovsdb-idlc.in>
>> >>> b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> <http://ovsdb-idlc.in>
>> >>>>>>> index 698fe25f3..89c3eb682 100755
>> >>>>>>> --- a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> 
>> >>>>>>> <http://ovsdb-idlc.in>
>> >>>>>>> +++ b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> 
>> >>>>>>> <http://ovsdb-idlc.in>
>> >>>>>>> @@ -282,7 +282,7 @@ const struct %(s)s
>> >>>>> *%(s)s_table_track_get_first(const struct %(s)s_table *);
>> >>>>>>>  /* Returns true if 'row' was inserted since the last change tracking
>> >>>>> reset. */
>> >>>>>>>  static inline bool %(s)s_is_new(const struct %(s)s *row)
>> >>>>>>>  {
>> >>>>>>> -    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY) == 0;
>> >>>>>>> +    return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT) > 0;
>> >>>>>>
>> >>>>>> This causes some issues with records created by the IDL client (e.g.,
>> >>>>>> ovn-controller). Those are created with ovsdb_idl_txn_insert() and we
>> >>>>>> don't set the OVSDB_IDL_CHANGE_INSERT change_seqno in that case.
>> >>>>>>
>> >>>>>> Essentially *_is_new() will never return true for rows inserted
>> >>> locally.
>> >>>>>>
>> >>>>> Hi Dumitru, I have a different understanding here. I think this is not 
>> >>>>> a
>> >>>>> problem. Whatever changes made locally by the IDL will be discarded 
>> >>>>> when
>> >>>>> committing to server and then rely on the notifications from the
>> >>> server to
>> >>>>> finally update the IDL data, which updates the change_seqno 
>> >>>>> accordingly.
>> >>>>
>> >>>> Dumitru appears to be right. There is a failure on one ovn UT with this
>> >>>> change. It gets resolved by the fix that Dumitru suggests.
>> >>>>
>> >>> I am confused here. Which UT fails and what's the suggested fix? If the
>> >>> row is added locally, I don't expect it to be tracked at all in the
>> >>> current iteration (while it will be in the iteration after the
>> >>> transaction commit). Did I miss anything?
>> >>>
>> >>> Thanks,
>> >>> Han
>> >>>
>> >>
>> >> Hi Han,
>> >>
>> >> One way to see the issue is:
>> >>
>> >> make sandbox
>> >> ovn-sbctl destroy chassis .
>> >>
>> >> At this point ovn-controller continuously tries to transact insert
>> >> Chassis_Private even though Chassis_Private was only updated:
>> >>
>> >> 2020-10-08T18:42:27.784Z|00024|jsonrpc|DBG|unix:sb1.ovsdb: send request,
>> >> method="transact",
>> >>
> params=["OVN_Southbound",{"uuid-name":"row3e499288_5f8c_4859_b82a_adcc474eb114","row":{"name":"chassis-1","chassis":["named-uuid","row6534b1df_bcbb_4b91_8eb1_5bc823f1673e"]},"op":"insert","table":"Chassis_Private"},{"uuid-name":"row9d86c159_0be5_4d13_a8d6_a3414a855511","row":{"ip":"127.0.0.1","options":["map",[["csum","true"]]],"chassis_name":"chassis-1","type":"geneve"},"op":"insert","table":"Encap"},{"uuid-name":"row6534b1df_bcbb_4b91_8eb1_5bc823f1673e","row":{"name":"chassis-1","hostname":"sandbox","other_config":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]],"encaps":["named-uuid","row9d86c159_0be5_4d13_a8d6_a3414a855511"],"external_ids":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]]},"op":"insert","table":"Chassis"},{"comment":"ovn-controller:
>> >> registering chassis 'chassis-1'","op":"comment"}], id=14
>> >>
>> >> While, if I set OVSDB_IDL_CHANGE_INSERT in ovsdb_idl_txn_insert() and
>> >> redo the same steps as above, ovn-controller doesn't try to insert
>> >> Chassis_Private but instead it updates the record.
>> >>
>> >> The only use of _is_new() for potentially locally created records is
>> >> here afaik:
>> >> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184
>> >>
> 
> Hi Dumitru and Mark,
> 
> Thanks for this information and sorry for the delayed response. As discussed 
> in the last OVN meeting, I think the original design of _is_new and 
> _is_deleted were only for evaluating against tracked changes, which always 
> come from server updates.
> 
> If we make the change so that _is_new() can be used to evaluate against local 
> (uncommitted) changes, it may work but can be confusing. If a row R is 
> inserted by current transaction locally, _is_new(R) returns true, but when 
> the transaction is committed the local copy will be discarded, and if the 
> commit is successful the IDL will finally receive a server notification and 
> the _is_new(R) evaluation will return true again. I think this could cause 
> problems for the code that relies on _is_new(R). Moreover, if we update the 
> OVSDB_IDL_CHANGE_INSERT seq for locally inserted rows, we may need to do the 
> same for locally modified/deleted rows, to be consistent, which needs more 
> careful considerations for the implications of the change.
> 
> Because of the above concerns, I'd rather suggest to stick to the original 
> design (maybe document it more clearly) that the functions used only for 
> tracked changes.

One even bigger concern from my side is what process should not take any actions
until ovsdb-server accepts the change.  Otherwise it might be very tricky to
handle rejected transactions.  Even if it might be not a likely event in OVN 
case.

> 
> For the only place the _is_new() currently used for evaluating non-tracked 
> change [0], it was introduced in [1], which we have discussed earlier that it 
> is better to avoid the complex logic for handling system-id change, but let 
> the operator to follow the steps of completely deleting and re-adding a 
> chassis [2]. So I guess this kind of usage would not be necessary at least 
> for existing use cases.
> 
> P.S. Ilya found an old patch that I submitted a year ago [3] but never merged 
> for fixing the _is_new() problem (that I've forgotten myself), which is 
> similar to v1 but didn't use the "INSERT" seqno. I think the current patch 
> (v1) from Mark is better, so @Ilya Maximets <mailto:[email protected]> 
> please ignore my old patch.

OK.  I'll mark it as superseded.

> 
> [0] 
> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184
> [1] commit dce1af31b5 ("chassis: Fix chassis_private record updates when the 
> system-id changes.")
> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/374801.html
> [3] 
> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
> 
> Thanks,
> Han
> 
>> >> I didn't dig further though.
>> >
>> > Sorry for the delay. The following test fails:
>> >
>> >  90: ovn -- ensure one gw controller restart in HA doesn't bounce the
>> > master FAILED (ovs-macros.at:220 <http://ovs-macros.at:220>)
>> >
>> > <snip>
>> > ovn.at:12213 <http://ovn.at:12213>: waiting until test 1 = `ovn-sbctl 
>> > --bare --columns=_uuid
>> > find Chassis name=gw2 | wc -l`...
>> > ovn.at:12213 <http://ovn.at:12213>: wait failed after 30 seconds
>> > ./ovs-macros.at:220 <http://ovs-macros.at:220>: hard failure
>> > 90. ovn.at:12139 <http://ovn.at:12139>: 90. ovn -- ensure one gw 
>> > controller restart in HA
>> > doesn't bounce the master (ovn.at:12139 <http://ovn.at:12139>): FAILED 
>> > (ovs-macros.at:220 <http://ovs-macros.at:220>)
>> > <snip>
>> >
>> > This is similar failure as the repoducer highlighted by Dumitru above.
>> > It is resolved by:
>> >
>> > +++ b/lib/ovsdb-idl.c
>> > @@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn *txn,
>> >      hmap_insert(&row->table->rows, &row->hmap_node, 
>> > uuid_hash(&row->uuid));
>> >      hmap_insert(&txn->txn_rows, &row->txn_node, uuid_hash(&row->uuid));
>> >      ovsdb_idl_add_to_indexes(row);
>> > +
>> > +    row->change_seqno[OVSDB_IDL_CHANGE_INSERT]
>> > +        = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT]
>> > +        = row->table->db->change_seqno + 1;
>> > +
>> >
>> >
>> >>
>> >> Regards,
>> >> Dumitru
>> >>
>> >
>>
>> v2 published at
>> https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/375962.html

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to