On 10/19/20 8:22 PM, Han Zhou wrote: > > > On Mon, Oct 12, 2020 at 7:32 AM Mark Gray <[email protected] > <mailto:[email protected]>> wrote: >> >> On 10/10/2020 16:42, Mark Gray wrote: >> > On 08/10/2020 19:57, Dumitru Ceara wrote: >> >> On 10/2/20 1:13 AM, Han Zhou wrote: >> >>> >> >>> >> >>> On Thu, Oct 1, 2020 at 2:30 AM Mark Gray <[email protected] >> >>> <mailto:[email protected]> >> >>> <mailto:[email protected] <mailto:[email protected]>>> wrote: >> >>>> >> >>>> On 30/09/2020 21:04, Han Zhou wrote: >> >>>>> On Wed, Sep 30, 2020 at 2:21 AM Dumitru Ceara <[email protected] >> >>>>> <mailto:[email protected]> >> >>> <mailto:[email protected] <mailto:[email protected]>>> wrote: >> >>>>>> >> >>>>>> On 9/29/20 8:34 PM, Mark Gray wrote: >> >>>>>>> Currently all functions of the type *_is_new() always return >> >>>>>>> 'false'. This patch resolves this issue by using the >> >>>>>>> 'OVSDB_IDL_CHANGE_INSERT' 'change_seqno' instead of the >> >>>>>>> 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' to determine if a row >> >>>>>>> is new and by resetting the 'OVSDB_IDL_CHANGE_INSERT' >> >>>>>>> 'change_seqno' on clear. >> >>>>>>> >> >>>>>>> Further to this, the code is also updated to match this >> >>>>>>> behaviour: >> >>>>>>> >> >>>>>>> When a row is inserted, the 'OVSDB_IDL_CHANGE_INSERT' >> >>>>>>> 'change_seqno' is updated to match the new database >> >>>>>>> change_seqno. The 'OVSDB_IDL_CHANGE_MODIFY' 'change_seqno' >> >>>>>>> is not set for inserted rows (only for updated rows). >> >>>>>>> >> >>>>>>> At the end of a run, ovsdb_idl_db_track_clear() should be >> >>>>>>> called to clear all tracking information, this includes >> >>>>>>> resetting all row 'change_seqno' to zero. This will ensure >> >>>>>>> that subsequent runs will not see a previously 'new' row. >> >>>>>>> >> >>>>>>> add_tracked_change_for_references() is updated to only >> >>>>>>> track rows that reference the current row. >> >>>>>>> >> >>>>>>> Signed-off-by: Mark Gray <[email protected] >> >>>>>>> <mailto:[email protected]> >> >>> <mailto:[email protected] <mailto:[email protected]>>> >> >>>>>>> Suggested-by: Dumitru Ceara <[email protected] >> >>>>>>> <mailto:[email protected]> >> >>> <mailto:[email protected] <mailto:[email protected]>>> >> >>>>>>> Reported-at: https://bugzilla.redhat.com/1883562 >> >>>>>> >> >>>>>> Hi Mark, >> >>>>>> >> >>>>>> Thanks for working on this, it definitely looks like a nasty bug to >> >>>>>> me! >> >>>>>> >> >>>>>> We were lucky to not hit it in ovn-controller before. That's just >> >>>>>> because all the "update" operations were handled there as "delete" + >> >>>>>> "insert" but that is not mandatory and might change.>> >> >>>>> >> >>>>> Agree. Thanks for this critical finding! It was my bad - commit >> >>> ca545a787ac >> >>>>> introduced this bug, which was fixing another bug related to is_new. I >> >>>>> should have added test cases to avoid the miss. Apologize. >> >>>> >> >>>> No worries, I will reference this commit in the commit message. >> >>>>> >> >>>>> For this fix, I don't understand why we need to check >> >>>>> OVSDB_IDL_CHANGE_INSERT. I think the root cause is that in >> >>>>> add_tracked_change_for_references() it updated the >> >>> OVSDB_IDL_CHANGE_MODIFY >> >>>>> seqno for the initial inserted *new* row, while it shouldn't. So >> >>> should the >> >>>>> fix only include the changes in add_tracked_change_for_references() and >> >>>>> ovsdb_idl_row_change__()? Why do we need to change the implementation >> >>>>> of >> >>>>> _is_new() itself? Could you explain? >> >>>> >> >>>> In the case of a newly inserted row, the change_seqno triplet will still >> >>>> be {X, 0, 0}. It will remain as such until the row is modified, at which >> >>>> point it will become {X, Y, 0}. This means that the _*is_new() function >> >>>> will continue to return true until the row is modified which is not >> >>>> correct behaviour. >> >>> >> >>> Hmm, this is not a problem because if a row inserted in a previous run >> >>> is not modified/deleted in the current run, it won't be added to >> >>> tracking so no one would call _is_new() for the row. However, there will >> >>> be a problem when a row is deleted (see below). >> >>> >> >>>> If we reset the change_seqno triplet on each run, it >> >>>> will still not work. In the end, it is probably better to use the >> >>>> OVSDB_IDL_CHANGE_INSERT element to track this. >> >>>> >> >>> Indeed, we should use OVSDB_IDL_CHANGE_INSERT to check _is_new() because >> >>> if row is deleted without any modification after insertion, the >> >>> OVSDB_IDL_CHANGE_MODIFY seqno would still be zero when it is being >> >>> deleted. In that case calling the _is_new() function would return true >> >>> for a deleted row, which is wrong. So, your change LGTM. >> >>> >> >>>> On a seperate point, I don't expect add_tracked_change_for_references() >> >>>> will ever get called on insert as nothing will be referenced at that >> >>>> stage? Am I correct in my understanding? >> >>>> >> >>> Yes, I think so. More accurately, it will get called once on the >> >>> inserted row but will not trigger any recursive calls. >> >>> >> >>>>>>> --- >> >>>>>>> lib/ovsdb-idl.c | 39 +++++++++++++++++++++++++++------------ >> >>>>>>> ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> <http://ovsdb-idlc.in> | >> >>>>>>> 2 +- >> >>>>>>> 2 files changed, 28 insertions(+), 13 deletions(-) >> >>>>>>> >> >>>>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c >> >>>>>>> index d8f221ca6..3cfd73871 100644 >> >>>>>>> --- a/lib/ovsdb-idl.c >> >>>>>>> +++ b/lib/ovsdb-idl.c >> >>>>>>> @@ -1959,6 +1959,11 @@ ovsdb_idl_db_track_clear(struct >> >>> ovsdb_idl_db *db) >> >>>>>>> free(row->updated); >> >>>>>>> row->updated = NULL; >> >>>>>>> } >> >>>>>>> + >> >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_INSERT] = >> >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] = >> >>>>>>> + row->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0; >> >>>>>>> + >> >>>>>>> ovs_list_remove(&row->track_node); >> >>>>>>> ovs_list_init(&row->track_node); >> >>>>>>> if (ovsdb_idl_row_is_orphan(row) && >> >>>>> row->tracked_old_datum) { >> >>>>>>> @@ -2684,24 +2689,27 @@ ovsdb_idl_process_update2(struct >> >>>>> ovsdb_idl_table *table, >> >>>>>>> return OVSDB_IDL_UPDATE_DB_CHANGED; >> >>>>>>> } >> >>>>>>> >> >>>>>>> -/* Recursively add rows to tracked change lists for current row >> >>>>>>> - * and the rows that reference this row. */ >> >>>>>>> +/* Recursively add rows to tracked change lists for all rows that >> >>>>> reference >> >>>>>>> + 'row'. */ >> >>>>>>> static void >> >>>>>>> add_tracked_change_for_references(struct ovsdb_idl_row *row) >> >>>>>>> { >> >>>>>>> - if (ovs_list_is_empty(&row->track_node) && >> >>>>>>> - ovsdb_idl_track_is_set(row->table)) { >> >>>>>>> - ovs_list_push_back(&row->table->track_list, >> >>>>>>> - &row->track_node); >> >>>>>>> - row->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >> >>>>>>> - = row->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >> >>>>>>> - = row->table->db->change_seqno + 1; >> >>>>>>> - >> >>>>>>> const struct ovsdb_idl_arc *arc; >> >>>>>>> LIST_FOR_EACH (arc, dst_node, &row->dst_arcs) { >> >>>>>>> - add_tracked_change_for_references(arc->src); >> >>>>>>> + struct ovsdb_idl_row *ref = arc->src; >> >>>>>>> + >> >>>>>>> + if (ovs_list_is_empty(&ref->track_node) && >> >>>>>>> + ovsdb_idl_track_is_set(ref->table)) { >> >>>>>>> + ovs_list_push_back(&ref->table->track_list, >> >>>>>>> + &ref->track_node); >> >>>>>>> + >> >>>>>>> + ref->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >> >>>>>>> + = >> >>> ref->table->change_seqno[OVSDB_IDL_CHANGE_MODIFY] >> >>>>>>> + = ref->table->db->change_seqno + 1; >> >>>>>>> + >> >>>>>>> + add_tracked_change_for_references(ref); >> >>>>>>> + } >> >>>>>>> } >> >>>>>>> - } >> >>>>>>> } >> >>>>>>> >> >>>>>>> >> >>>>>>> @@ -2767,7 +2775,14 @@ ovsdb_idl_row_change__(struct ovsdb_idl_row >> >>>>> *row, const struct json *row_json, >> >>>>>>> row->change_seqno[change] >> >>>>>>> = row->table->change_seqno[change] >> >>>>>>> = row->table->db->change_seqno + 1; >> >>>>>>> + >> >>>>>>> if (table->modes[column_idx] & OVSDB_IDL_TRACK) >> >>>>>>> { >> >>>>>>> + if (ovs_list_is_empty(&row->track_node) && >> >>>>>>> + ovsdb_idl_track_is_set(row->table)) { >> >>>>>>> + >> >>> ovs_list_push_back(&row->table->track_list, >> >>>>>>> + &row->track_node); >> >>>>>>> + } >> >>>>>>> + >> >>>>>>> add_tracked_change_for_references(row); >> >>>>>>> if (!row->updated) { >> >>>>>>> row->updated = >> >>>>> bitmap_allocate(class->n_columns); >> >>>>>>> diff --git a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >> >>>>>>> <http://ovsdb-idlc.in> >> >>> b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> <http://ovsdb-idlc.in> >> >>>>>>> index 698fe25f3..89c3eb682 100755 >> >>>>>>> --- a/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >> >>>>>>> <http://ovsdb-idlc.in> >> >>>>>>> +++ b/ovsdb/ovsdb-idlc.in <http://ovsdb-idlc.in> >> >>>>>>> <http://ovsdb-idlc.in> >> >>>>>>> @@ -282,7 +282,7 @@ const struct %(s)s >> >>>>> *%(s)s_table_track_get_first(const struct %(s)s_table *); >> >>>>>>> /* Returns true if 'row' was inserted since the last change tracking >> >>>>> reset. */ >> >>>>>>> static inline bool %(s)s_is_new(const struct %(s)s *row) >> >>>>>>> { >> >>>>>>> - return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_MODIFY) == 0; >> >>>>>>> + return %(s)s_row_get_seqno(row, OVSDB_IDL_CHANGE_INSERT) > 0; >> >>>>>> >> >>>>>> This causes some issues with records created by the IDL client (e.g., >> >>>>>> ovn-controller). Those are created with ovsdb_idl_txn_insert() and we >> >>>>>> don't set the OVSDB_IDL_CHANGE_INSERT change_seqno in that case. >> >>>>>> >> >>>>>> Essentially *_is_new() will never return true for rows inserted >> >>> locally. >> >>>>>> >> >>>>> Hi Dumitru, I have a different understanding here. I think this is not >> >>>>> a >> >>>>> problem. Whatever changes made locally by the IDL will be discarded >> >>>>> when >> >>>>> committing to server and then rely on the notifications from the >> >>> server to >> >>>>> finally update the IDL data, which updates the change_seqno >> >>>>> accordingly. >> >>>> >> >>>> Dumitru appears to be right. There is a failure on one ovn UT with this >> >>>> change. It gets resolved by the fix that Dumitru suggests. >> >>>> >> >>> I am confused here. Which UT fails and what's the suggested fix? If the >> >>> row is added locally, I don't expect it to be tracked at all in the >> >>> current iteration (while it will be in the iteration after the >> >>> transaction commit). Did I miss anything? >> >>> >> >>> Thanks, >> >>> Han >> >>> >> >> >> >> Hi Han, >> >> >> >> One way to see the issue is: >> >> >> >> make sandbox >> >> ovn-sbctl destroy chassis . >> >> >> >> At this point ovn-controller continuously tries to transact insert >> >> Chassis_Private even though Chassis_Private was only updated: >> >> >> >> 2020-10-08T18:42:27.784Z|00024|jsonrpc|DBG|unix:sb1.ovsdb: send request, >> >> method="transact", >> >> > params=["OVN_Southbound",{"uuid-name":"row3e499288_5f8c_4859_b82a_adcc474eb114","row":{"name":"chassis-1","chassis":["named-uuid","row6534b1df_bcbb_4b91_8eb1_5bc823f1673e"]},"op":"insert","table":"Chassis_Private"},{"uuid-name":"row9d86c159_0be5_4d13_a8d6_a3414a855511","row":{"ip":"127.0.0.1","options":["map",[["csum","true"]]],"chassis_name":"chassis-1","type":"geneve"},"op":"insert","table":"Encap"},{"uuid-name":"row6534b1df_bcbb_4b91_8eb1_5bc823f1673e","row":{"name":"chassis-1","hostname":"sandbox","other_config":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]],"encaps":["named-uuid","row9d86c159_0be5_4d13_a8d6_a3414a855511"],"external_ids":["map",[["datapath-type",""],["iface-types","dummy,dummy-internal,dummy-pmd,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan"],["is-interconn","false"],["ovn-bridge-mappings",""],["ovn-chassis-mac-mappings",""],["ovn-cms-options",""],["ovn-enable-lflow-cache","true"],["ovn-monitor-all","false"]]]},"op":"insert","table":"Chassis"},{"comment":"ovn-controller: >> >> registering chassis 'chassis-1'","op":"comment"}], id=14 >> >> >> >> While, if I set OVSDB_IDL_CHANGE_INSERT in ovsdb_idl_txn_insert() and >> >> redo the same steps as above, ovn-controller doesn't try to insert >> >> Chassis_Private but instead it updates the record. >> >> >> >> The only use of _is_new() for potentially locally created records is >> >> here afaik: >> >> https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184 >> >> > > Hi Dumitru and Mark, > > Thanks for this information and sorry for the delayed response. As discussed > in the last OVN meeting, I think the original design of _is_new and > _is_deleted were only for evaluating against tracked changes, which always > come from server updates. > > If we make the change so that _is_new() can be used to evaluate against local > (uncommitted) changes, it may work but can be confusing. If a row R is > inserted by current transaction locally, _is_new(R) returns true, but when > the transaction is committed the local copy will be discarded, and if the > commit is successful the IDL will finally receive a server notification and > the _is_new(R) evaluation will return true again. I think this could cause > problems for the code that relies on _is_new(R). Moreover, if we update the > OVSDB_IDL_CHANGE_INSERT seq for locally inserted rows, we may need to do the > same for locally modified/deleted rows, to be consistent, which needs more > careful considerations for the implications of the change. > > Because of the above concerns, I'd rather suggest to stick to the original > design (maybe document it more clearly) that the functions used only for > tracked changes.
One even bigger concern from my side is what process should not take any actions until ovsdb-server accepts the change. Otherwise it might be very tricky to handle rejected transactions. Even if it might be not a likely event in OVN case. > > For the only place the _is_new() currently used for evaluating non-tracked > change [0], it was introduced in [1], which we have discussed earlier that it > is better to avoid the complex logic for handling system-id change, but let > the operator to follow the steps of completely deleting and re-adding a > chassis [2]. So I guess this kind of usage would not be necessary at least > for existing use cases. > > P.S. Ilya found an old patch that I submitted a year ago [3] but never merged > for fixing the _is_new() problem (that I've forgotten myself), which is > similar to v1 but didn't use the "INSERT" seqno. I think the current patch > (v1) from Mark is better, so @Ilya Maximets <mailto:[email protected]> > please ignore my old patch. OK. I'll mark it as superseded. > > [0] > https://github.com/ovn-org/ovn/blob/9d2e8d32fb9865513b70408a665184a67564390d/controller/ovn-controller.c#L184 > [1] commit dce1af31b5 ("chassis: Fix chassis_private record updates when the > system-id changes.") > [2] https://mail.openvswitch.org/pipermail/ovs-dev/2020-September/374801.html > [3] > https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ > > Thanks, > Han > >> >> I didn't dig further though. >> > >> > Sorry for the delay. The following test fails: >> > >> > 90: ovn -- ensure one gw controller restart in HA doesn't bounce the >> > master FAILED (ovs-macros.at:220 <http://ovs-macros.at:220>) >> > >> > <snip> >> > ovn.at:12213 <http://ovn.at:12213>: waiting until test 1 = `ovn-sbctl >> > --bare --columns=_uuid >> > find Chassis name=gw2 | wc -l`... >> > ovn.at:12213 <http://ovn.at:12213>: wait failed after 30 seconds >> > ./ovs-macros.at:220 <http://ovs-macros.at:220>: hard failure >> > 90. ovn.at:12139 <http://ovn.at:12139>: 90. ovn -- ensure one gw >> > controller restart in HA >> > doesn't bounce the master (ovn.at:12139 <http://ovn.at:12139>): FAILED >> > (ovs-macros.at:220 <http://ovs-macros.at:220>) >> > <snip> >> > >> > This is similar failure as the repoducer highlighted by Dumitru above. >> > It is resolved by: >> > >> > +++ b/lib/ovsdb-idl.c >> > @@ -4858,6 +4858,11 @@ ovsdb_idl_txn_insert(struct ovsdb_idl_txn *txn, >> > hmap_insert(&row->table->rows, &row->hmap_node, >> > uuid_hash(&row->uuid)); >> > hmap_insert(&txn->txn_rows, &row->txn_node, uuid_hash(&row->uuid)); >> > ovsdb_idl_add_to_indexes(row); >> > + >> > + row->change_seqno[OVSDB_IDL_CHANGE_INSERT] >> > + = row->table->change_seqno[OVSDB_IDL_CHANGE_INSERT] >> > + = row->table->db->change_seqno + 1; >> > + >> > >> > >> >> >> >> Regards, >> >> Dumitru >> >> >> > >> >> v2 published at >> https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/375962.html _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
