On 10/10/2020 16:42, Mark Gray wrote: > On 08/10/2020 19:57, Dumitru Ceara wrote: >> On 10/2/20 1:13 AM, Han Zhou wrote: >>> >>> >>> On Thu, Oct 1, 2020 at 2:30 AM Mark Gray <[email protected] >>> <mailto:[email protected]>> wrote: >>>> >>>> On 30/09/2020 21:04, Han Zhou wrote: >>>>> On Wed, Sep 30, 2020 at 2:21 AM Dumitru Ceara <[email protected] >>> <mailto:[email protected]>> wrote: >>>>>> >>>>>> On 9/29/20 8:34 PM, Mark Gray wrote: >>>>>>> Currently all functions of the type *_is_new() always return >>>>>>> 'false'. This patch resolves this issue by using the >>>>>>> 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the >>>>>>> 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row >>>>>>> is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT' >>>>>>> 'change_seqno' on clear. >>>>>>> >>>>>>> Further to this, the code is also updated to match this >>>>>>> behaviour: >>>>>>> >>>>>>> When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT' >>>>>>> 'change_seqno' is updated to match the new database >>>>>>> change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' >>>>>>> is not set for inserted rows (only for updated rows). >>>>>>> >>>>>>> At the end of a run, ovsdb_idl_db_track_clear() should be >>>>>>> called to clear all tracking information, this includes >>>>>>> resetting all row 'change_seqno' to zero. This will ensure >>>>>>> that subsequent runs will not see a previously 'new' row. >>>>>>> >>>>>>> add_tracked_change_for_references() is updated to only >>>>>>> track rows that reference the current row. >>>>>>> >>>>>>> Signed-off-by: Mark Gray <[email protected] >>> <mailto:[email protected]>> >>>>>>> Suggested-by: Dumitru Ceara <[email protected] >>> <mailto:[email protected]>> >>>>>>> Reported-at: https://bugzilla.redhat.com/1883562 >>>>>> >>>>>> Hi Mark, >>>>>> >>>>>> Thanks for working on this, it definitely looks like a nasty bug to me! >>>>>> >>>>>> We were lucky to not hit it in ovn-controller before. That's just >>>>>> because all the "update" operations were handled there as "delete" + >>>>>> "insert" but that is not mandatory and might change.>> >>>>> >>>>> Agree. Thanks for this critical finding! It was my bad - commit >>> ca545a787ac >>>>> introduced this bug, which was fixing another bug related to is_new. I >>>>> should have added test cases to avoid the miss. Apologize. >>>> >>>> No worries, I will reference this commit in the commit message. >>>>> >>>>> For this fix, I don't understand why we need to check >>>>> OVSDB_IDL_CHANGE_INSERT. I think the root cause is that in >>>>> add_tracked_change_for_references() it updated the >>> OVSDB_IDL_CHANGE_MODIFY >>>>> seqno for the initial inserted *new* row, while it shouldn't. So >>> should the >>>>> fix only include the changes in add_tracked_change_for_references() and >>>>> ovsdb_idl_row_change__()? Why do we need to change the implementation of >>>>> _is_new() itself? Could you explain? >>>> >>>> In the case of a newly inserted row, the change_seqno triplet will still >>>> be {X, 0, 0}. It will remain as such until the row is modified, at which >>>> point it will become {X, Y, 0}. This means that the _*is_new() function >>>> will continue to return true until the row is modified which is not >>>> correct behaviour. >>> >>> Hmm, this is not a problem because if a row inserted in a previous run >>> is not modified/deleted in the current run, it won't be added to >>> tracking so no one would call _is_new() for the row. However, there will >>> be a problem when a row is deleted (see below). >>> >>>> If we reset the change_seqno triplet on each run, it >>>> will still not work. In the end, it is probably better to use the >>>> OVSDB_IDL_CHANGE_INSERT element to track this. >>>> >>> Indeed, we should use OVSDB_IDL_CHANGE_INSERT to check _is_new() because >>> if row is deleted without any modification after insertion, the >>> OVSDB_IDL_CHANGE_MODIFY seqno would still be zero when it is being >>> deleted. In that case calling the _is_new() function would return true >>> for a deleted row, which is wrong. So, your change LGTM. >>> >>>> On a seperate point, I don't expect add_tracked_change_for_references() >>>> will ever get called on insert as nothing will be referenced at that >>>> stage? Am I correct in my understanding? >>>> >>> Yes, I think so. More accurately, it will get called once on the >>> inserted row but will not trigger any recursive calls. >>> >>>>>>> --- >>>>>>> lib/ovsdb-idl.c | 39 +++++++++++++++++++++++++++------------ >>>>>>> ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> | 2 +- >>>>>>> 2 files changed, 28 insertions(+), 13 deletions(-) >>>>>>> >>>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c >>>>>>> index d8f221ca6..3cfd73871 100644 >>>>>>> --- a/lib/ovsdb-idl.c >>>>>>> +++ b/lib/ovsdb-idl.c >>>>>>> @@ -1959,6 +1959,11 @@ ovsdb_idl_db_track_clear(struct >>> ovsdb_idl_db *db) >>>>>>> free(row->updated); >>>>>>> row->updated = NULL; >>>>>>> } >>>>>>> + >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_INSERT] = >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] = >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0; >>>>>>> + >>>>>>> ovs_list_remove(&row->track_node); >>>>>>> ovs_list_init(&row->track_node); >>>>>>> if (ovsdb_idl_row_is_orphan(row) && >>>>> row->tracked_old_datum) { >>>>>>> @@ -2684,24 +2689,27 @@ ovsdb_idl_process_update2(struct >>>>> ovsdb_idl_table *table, >>>>>>> return OVSDB_IDL_UPDATE_DB_CHANGED; >>>>>>> } >>>>>>> >>>>>>> -/* Recursively add rows to tracked change lists for current row >>>>>>> - * and the rows that reference this row. */ >>>>>>> +/* Recursively add rows to tracked change lists for all rows that >>>>> reference >>>>>>> + 'row'. */ >>>>>>> static void >>>>>>> add_tracked_change_for_references(struct ovsdb_idl_row *row) >>>>>>> { >>>>>>> - if (ovs_list_is_empty(&row->track_node) && >>>>>>> - ovsdb_idl_track_is_set(row->table)) { >>>>>>> - ovs_list_push_back(&row->table->track_list, >>>>>>> - &row->track_node); >>>>>>> - row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >>>>>>> - = row->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >>>>>>> - = row->table->db->change_seqno + 1; >>>>>>> - >>>>>>> const struct ovsdb_idl_arc *arc; >>>>>>> LIST_FOR_EACH (arc, dst_node, &row->dst_arcs) { >>>>>>> - add_tracked_change_for_references(arc->src); >>>>>>> + struct ovsdb_idl_row *ref = arc->src; >>>>>>> + >>>>>>> + if (ovs_list_is_empty(&ref->track_node) && >>>>>>> + ovsdb_idl_track_is_set(ref->table)) { >>>>>>> + ovs_list_push_back(&ref->table->track_list, >>>>>>> + &ref->track_node); >>>>>>> + >>>>>>> + ref->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >>>>>>> + = >>> ref->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >>>>>>> + = ref->table->db->change_seqno + 1; >>>>>>> + >>>>>>> + add_tracked_change_for_references(ref); >>>>>>> + } >>>>>>> } >>>>>>> - } >>>>>>> } >>>>>>> >>>>>>> >>>>>>> @@ -2767,7 +2775,14 @@ ovsdb_idl_row_change__(struct ovsdb_idl_row >>>>> *row, const struct json *row_json, >>>>>>> row->change_seqno[change] >>>>>>> = row->table->change_seqno[change] >>>>>>> = row->table->db->change_seqno + 1; >>>>>>> + >>>>>>> if (table->modes[column_idx] & OVSDB_IDL_TRACK) { >>>>>>> + if (ovs_list_is_empty(&row->track_node) && >>>>>>> + ovsdb_idl_track_is_set(row->table)) { >>>>>>> + >>> ovs_list_push_back(&row->table->track_list, >>>>>>> + &row->track_node); >>>>>>> + } >>>>>>> + >>>>>>> add_tracked_change_for_references(row); >>>>>>> if (!row->updated) { >>>>>>> row->updated = >>>>> bitmap_allocate(class->n_columns); >>>>>>> diff --git a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >>> b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >>>>>>> index 698fe25f3..89c3eb682 100755 >>>>>>> --- a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >>>>>>> +++ b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >>>>>>> @@ -282,7 +282,7 @@ const struct %(s)s >>>>> *%(s)s_table_track_get_first(const struct %(s)s_table *); >>>>>>> /* Returns true if 'row' was inserted since the last change tracking >>>>> reset. */ >>>>>>> static inline bool %(s)s_is_new(const struct %(s)s *row) >>>>>>> { >>>>>>> - return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY) == 0; >>>>>>> + return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT) > 0; >>>>>> >>>>>> This causes some issues with records created by the IDL client (e.g., >>>>>> ovn-controller). Those are created with ovsdb_idl_txn_insert() and we >>>>>> don't set the OVSDB_IDL_CHANGE_INSERT change_seqno in that case. >>>>>> >>>>>> Essentially *_is_new() will never return true for rows inserted >>> locally. >>>>>> >>>>> Hi Dumitru, I have a different understanding here. I think this is not a >>>>> problem. Whatever changes made locally by the IDL will be discarded when >>>>> committing to server and then rely on the notifications from the >>> server to >>>>> finally update the IDL data, which updates the change_seqno accordingly. >>>> >>>> Dumitru appears to be right. There is a failure on one ovn UT with this >>>> change. It gets resolved by the fix that Dumitru suggests. >>>> >>> I am confused here. Which UT fails and what's the suggested fix? If the >>> row is added locally, I don't expect it to be tracked at all in the >>> current iteration (while it will be in the iteration after the >>> transaction commit). Did I miss anything? >>> >>> Thanks, >>> Han >>> >> >> Hi Han, >> >> One way to see the issue is: >> >> make sandbox >> ovn-sbctl destroy chassis . >> >> At this point ovn-controller continuously tries to transact insert >> Chassis_Private even though Chassis_Private was only updated: >> >> 2020-10-08T18:42:27.784Z|00024|jsonrpc|DBG|unix:sb1.ovsdb: send request, >> method="transact", >> params=["OVN_Southbound",{"uuid-name":"row3e499288_5f8c_4859_b82a_adcc474eb114","row":{"name":"chassis-1","chassis":["named-uuid","row6534b1df_bcbb_4b91_8eb1_5bc823f1673e"]},"op":"insert","table":"Chassis_Private"},{"uuid-name":"row9d86c159_0be5_4d13_a8d6_a3414a855511","row":{"ip":"127.0.0.1","options":["map",[["csum","true"]]],"chassis_name":"chassis-1","type":"geneve"},"op":"insert","table":"Encap"},{"uuid-name":"row6534b1df_bcbb_4b91_8eb1_5bc823f1673e","row":{"name":"chassis-1","hostname":"sandbox","other_config":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]],"encaps":["named-uuid","row9d86c159_0be5_4d13_a8d6_a3414a855511"],"external_ids":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]]},"op":"insert","table":"Chassis"},{"comment":"ovn-controller: >> registering chassis 'chassis-1'","op":"comment"}], id=14 >> >> While, if I set OVSDB_IDL_CHANGE_INSERT in ovsdb_idl_txn_insert() and >> redo the same steps as above, ovn-controller doesn't try to insert >> Chassis_Private but instead it updates the record. >> >> The only use of _is_new() for potentially locally created records is >> here afaik: >> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184 >> >> I didn't dig further though. > > Sorry for the delay. The following test fails: > > 90: ovn -- ensure one gw controller restart in HA doesn't bounce the > master FAILED (ovs-macros.at:220) > > <snip> > ovn.at:12213: waiting until test 1 = `ovn-sbctl --bare --columns=_uuid > find Chassis name=gw2 | wc -l`... > ovn.at:12213: wait failed after 30 seconds > ./ovs-macros.at:220: hard failure > 90. ovn.at:12139: 90. ovn -- ensure one gw controller restart in HA > doesn't bounce the master (ovn.at:12139): FAILED (ovs-macros.at:220) > <snip> > > This is similar failure as the repoducer highlighted by Dumitru above. > It is resolved by: > > +++ b/lib/ovsdb-idl.c > @@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn *txn, > hmap_insert(&row->table->rows, &row->hmap_node, uuid_hash(&row->uuid)); > hmap_insert(&txn->txn_rows, &row->txn_node, uuid_hash(&row->uuid)); > ovsdb_idl_add_to_indexes(row); > + > + row->change_seqno[OVSDB_IDL_CHANGE_INSERT] > + = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT] > + = row->table->db->change_seqno + 1; > + > > >> >> Regards, >> Dumitru >> >
v2 published at https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/375962.html _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
