On 11/5/25 2:50 PM, Ilya Maximets wrote:
> On 11/5/25 7:15 PM, Rosemarie O'Riorden wrote:
>> Hi Ales, thanks so much for the review!
>>
>> I agree with all of the changes you suggest. I thought checking the
>> transaction count was necessary because there are two cases where it
>> could be wrong, but it seems the test consistently fails at checking the
>> number of tunnels in both of those cases, so it should be fine without.
>>
>> Thanks again!
>>
>>
>> On 11/5/25 6:31 AM, Ales Musil wrote:
>>>
>>>
>>> On Wed, Nov 5, 2025 at 12:57 AM Rosemarie O'Riorden via dev <ovs-
>>> [email protected] <mailto:[email protected]>> wrote:
>>>
>>> If you set a chassis' encap ip to be the same as that of another
>>> chassis, OVN should detect it, log an error, and delete the chassis
>>> record.
>>>
>>> But this was not the case; OVN would allow the transaction. However, the
>>> commit to the southbound database would fail, because there was already
>>> an encap record with that same ip and type (from the other chassis).
>>> The OVS db on the problematic chassis would then rapidly increase in
>>> size. Essentially, the controller would get stuck in a loop of deleting
>>> and recreating the tunnels.
>>>
>>> This change makes it so when encaps are reconfigured, before being
>>> built, in chassis_build_encaps(), the controller detects if there is
>>> already an existing encap tunnel of the same ip and type. If so, and
>>> there are no good (unique) encaps configured to use, only bad ones (as
>>> in already used on another chassis), the controller logs an error,
>>> refuses to build the encaps, and deletes the associated chassis and
>>> chassis private record. If there are both good encaps IPs and bad ones
>>> configured, the good ones will be used to build encaps and the bad ones
>>> will be ignored.
>>>
>>> In chassis_build_encaps(), only if an Encap record does not already
>>> exist will a new record be built. And options will only be updated if
>>> they've changed.
>>>
>>> Another change this patch introduces is if there's currently a
>>> southbound transaction in flight, encaps_run() will not run. This is to
>>> maintain symmetry with chassis_run()'s behavior. This change prevents an
>>> inconsistency where the controller saw ports in the local OVS db but not
>>> in sb-db, and then deleted and recreated them.
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-1481 <https://
>>> issues.redhat.com/browse/FDP-1481>
>>>
>>>
>>> nit: This should be https://issues.redhat.com/browse/FDP-2013 <https://
>>> issues.redhat.com/browse/FDP-2013>
>>>
>>>
>>>
>>> Co-authored-by: Nicholas Hubbard <[email protected]
>>> <mailto:[email protected]>>
>>> Signed-off-by: Nicholas Hubbard <[email protected]
>>> <mailto:[email protected]>>
>>> Signed-off-by: Rosemarie O'Riorden <[email protected]
>>> <mailto:[email protected]>>
>>> ---
>>>
>>>
>>> Hi Rosemarie,
>>>
>>> thank you for the patch, I have a couple of comments down below.
>>>
>>>
>>> v4:
>>> - Fix bug Jacob discovered where ovn-controller got stuck in a loop.
>>> -> Change chassis_build_encaps() to not always recreate all
>>> encaps, but
>>> instead, only modify records if they've changed, and create new
>>> records if
>>> there are not existing ones.
>>> -> Document above change with comments.
>>> -> Also add a comment to explain how good IPs will be used even if bad
>>> ones are configured.
>>> -> Make encap_lookup_by_ip_and_type() not return const, so encap
>>> recs can be
>>> modified
>>> -> Add ovnsb_idl_txn != NULL as a requirement for encaps_run() to run
>>> to have symmetry with chassis_build_encaps() so controller won't
>>> unnecessarily delete and recreate tunnels.
>>> - Change test in many ways
>>> -> Update name.
>>> -> Update method of verifying success, check # of transactions
>>> with jsonrpc
>>> debug logging.
>>> -> Add the case where there are good and bad IPs.
>>> -> Check that the warning I added appears.
>>> - Update commit message to reflect all changes above.
>>>
>>> v3:
>>> - Update misleading comment
>>> - Add deletion of chassis private record
>>> - Move declaration function name to same line as return type
>>> - Delete "Transaction causes multiple rows" error from exceptions of
>>> affected
>>> test
>>> - Update my own error to be more specific in exceptions of affected test
>>> - Update test to not be racy; Check for db row counts instead of
>>> saving file
>>> size after change
>>> - Update commit message to reflect changes
>>>
>>> v2: Fix typo in title.
>>> ---
>>> controller/chassis.c | 133 +++++++++++++++++++++++++-----------
>>> controller/chassis.h | 3 +-
>>> controller/encaps.c | 3 +-
>>> controller/encaps.h | 1 +
>>> controller/ovn-controller.c | 7 +-
>>> lib/chassis-index.c | 20 ++++++
>>> lib/chassis-index.h | 3 +
>>> tests/ovn-controller.at <http://ovn-controller.at> | 65 ++++++
>>> ++++++++++--
>>> 8 files changed, 186 insertions(+), 49 deletions(-)
>>>
>>> diff --git a/controller/chassis.c b/controller/chassis.c
>>> index 07bef1c56..2b6bfc247 100644
>>> --- a/controller/chassis.c
>>> +++ b/controller/chassis.c
>>> @@ -77,6 +77,12 @@ struct ovs_chassis_cfg {
>>> bool ct_label_flush;
>>> };
>>>
>>> +enum chassis_update_status {
>>> + CHASSIS_UPDATED,
>>> + CHASSIS_NOT_UPDATED,
>>> + CHASSIS_NEED_DELETE
>>> +};
>>> +
>>> static void
>>> ovs_chassis_cfg_init(struct ovs_chassis_cfg *cfg)
>>> {
>>> @@ -699,7 +705,8 @@ chassis_build_encaps(struct ovsdb_idl_txn
>>> *ovnsb_idl_txn,
>>> const char *encap_ip_default,
>>> const char *chassis_id,
>>> const char *encap_csum,
>>> - size_t *n_encap)
>>> + size_t *n_encap,
>>> + struct ovsdb_idl_index *sbrec_encaps)
>>> {
>>> size_t tunnel_count = 0;
>>>
>>> @@ -713,20 +720,45 @@ chassis_build_encaps(struct ovsdb_idl_txn
>>> *ovnsb_idl_txn,
>>>
>>> SSET_FOR_EACH (encap_ip, encap_ip_set) {
>>> SSET_FOR_EACH (encap_type, encap_type_set) {
>>> - struct sbrec_encap *encap =
>>> sbrec_encap_insert(ovnsb_idl_txn);
>>> -
>>> - sbrec_encap_set_type(encap, encap_type);
>>> - sbrec_encap_set_ip(encap, encap_ip);
>>> - if (encap_ip_default && !strcmp(encap_ip_default,
>>> encap_ip)) {
>>> - struct smap _options;
>>> - smap_clone(&_options, &options);
>>> - smap_add(&_options, "is_default", "true");
>>> - sbrec_encap_set_options(encap, &_options);
>>> - smap_destroy(&_options);
>>> + struct sbrec_encap *encap_rec =
>>> + encap_lookup_by_ip_and_type(sbrec_encaps,
>>> + encap_ip, encap_type);
>>> + if (encap_rec && strcmp(encap_rec->chassis_name,
>>> chassis_id)) {
>>> + static struct vlog_rate_limit rl =
>>> VLOG_RATE_LIMIT_INIT(5, 1);
>>> + VLOG_WARN_RL(&rl, "'%s' already has encap ip '%s' and "
>>> + "type '%s', cannot duplicate on '%s'",
>>> + encap_rec->chassis_name, encap_rec->ip,
>>> + encap_rec->type, chassis_id);
>>> +
>>> + /* Build encaps for valid IPs even if bad IPs are
>>> + * configured. */
>>> + continue;
>>> + }
>>> +
>>> + /* Only if an Encap record does not already exist do we
>>> create a
>>> + * new record. */
>>> + struct sbrec_encap *encap;
>>>
>>>
>>> No need to add a new variable, the encap_rec can be reused.
>>>
>>>
>>> + if (!encap_rec) {
>>> + encap = sbrec_encap_insert(ovnsb_idl_txn);
>>> + sbrec_encap_set_type(encap, encap_type);
>>> + sbrec_encap_set_ip(encap, encap_ip);
>>> + sbrec_encap_set_chassis_name(encap, chassis_id);
>>> } else {
>>> - sbrec_encap_set_options(encap, &options);
>>> + encap = encap_rec;
>>> + }
>>> +
>>> + if (!smap_is_empty(&encap->options) ||
>>> + !smap_equal(&options, &encap->options)) {
>>> + if (encap_ip_default && !strcmp(encap_ip_default,
>>> encap_ip)) {
>>> + struct smap _options;
>>> + smap_clone(&_options, &options);
>>> + smap_add(&_options, "is_default", "true");
>>> + sbrec_encap_set_options(encap, &_options);
>>> + smap_destroy(&_options);
>>> + } else {
>>> + sbrec_encap_set_options(encap, &options);
>>> + }
>>> }
>>> - sbrec_encap_set_chassis_name(encap, chassis_id);
>>>
>>> encaps[tunnel_count] = encap;
>>> tunnel_count++;
>>> @@ -778,7 +810,7 @@ update_supported_sset(struct sset *supported)
>>>
>>> static void
>>> remove_unsupported_options(const struct sbrec_chassis *chassis_rec,
>>> - bool *updated)
>>> + enum chassis_update_status *update_status)
>>> {
>>> struct sset supported_options =
>>> SSET_INITIALIZER(&supported_options);
>>> update_supported_sset(&supported_options);
>>> @@ -789,7 +821,7 @@ remove_unsupported_options(const struct
>>> sbrec_chassis *chassis_rec,
>>> VLOG_WARN("Removing unsupported key \"%s\" from chassis
>>> record.",
>>> node->key);
>>> sbrec_chassis_update_other_config_delkey(chassis_rec,
>>> node->key);
>>> - *updated = true;
>>> + *update_status = CHASSIS_UPDATED;
>>> }
>>> }
>>>
>>> @@ -827,25 +859,28 @@ chassis_get_record(struct ovsdb_idl_txn
>>> *ovnsb_idl_txn,
>>> }
>>>
>>> /* Update a Chassis record based on the config in the ovs config.
>>> - * Returns true if 'chassis_rec' was updated, false otherwise.
>>> + * Returns CHASSIS_UPDATED if 'chassis_rec' was updated,
>>> CHASSIS_NEED_DELETE if
>>> + * another chassis exists with an encap with same IP and type as
>>> 'chassis', and
>>> + * returns CHASSIS_NOT_UPDATED otherwise.
>>> */
>>> -static bool
>>> +static enum chassis_update_status
>>> chassis_update(const struct sbrec_chassis *chassis_rec,
>>> struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> const struct ovs_chassis_cfg *ovs_cfg,
>>> const char *chassis_id,
>>> - const struct sset *transport_zones)
>>> + const struct sset *transport_zones,
>>> + struct ovsdb_idl_index *sbrec_encaps)
>>> {
>>> - bool updated = false;
>>> + enum chassis_update_status update_status = CHASSIS_NOT_UPDATED;
>>>
>>> if (strcmp(chassis_id, chassis_rec->name)) {
>>> sbrec_chassis_set_name(chassis_rec, chassis_id);
>>> - updated = true;
>>> + update_status = CHASSIS_UPDATED;
>>> }
>>>
>>> if (strcmp(ovs_cfg->hostname, chassis_rec->hostname)) {
>>> sbrec_chassis_set_hostname(chassis_rec, ovs_cfg->hostname);
>>> - updated = true;
>>> + update_status = CHASSIS_UPDATED;
>>> }
>>>
>>> if (chassis_other_config_changed(ovs_cfg, chassis_rec)) {
>>> @@ -856,12 +891,12 @@ chassis_update(const struct sbrec_chassis
>>> *chassis_rec,
>>> sbrec_chassis_verify_other_config(chassis_rec);
>>> sbrec_chassis_set_other_config(chassis_rec, &other_config);
>>> smap_destroy(&other_config);
>>> - updated = true;
>>> + update_status = CHASSIS_UPDATED;
>>> }
>>>
>>> update_chassis_transport_zones(transport_zones, chassis_rec);
>>>
>>> - remove_unsupported_options(chassis_rec, &updated);
>>> + remove_unsupported_options(chassis_rec, &update_status);
>>>
>>> /* If any of the encaps should change, update them. */
>>> bool tunnels_changed =
>>> @@ -871,7 +906,7 @@ chassis_update(const struct sbrec_chassis
>>> *chassis_rec,
>>> ovs_cfg->encap_csum,
>>> chassis_rec);
>>> if (!tunnels_changed) {
>>> - return updated;
>>> + return update_status;
>>> }
>>>
>>> struct sbrec_encap **encaps;
>>> @@ -881,10 +916,16 @@ chassis_update(const struct sbrec_chassis
>>> *chassis_rec,
>>> chassis_build_encaps(ovnsb_idl_txn, &ovs_cfg->encap_type_set,
>>> &ovs_cfg->encap_ip_set,
>>> ovs_cfg->encap_ip_default, chassis_id,
>>> - ovs_cfg->encap_csum, &n_encap);
>>> + ovs_cfg->encap_csum, &n_encap,
>>> + sbrec_encaps);
>>> sbrec_chassis_set_encaps(chassis_rec, encaps, n_encap);
>>> free(encaps);
>>> - return true;
>>> +
>>> + if (!n_encap) {
>>> + return CHASSIS_NEED_DELETE;
>>> + }
>>> +
>>> + return CHASSIS_UPDATED;
>>> }
>>>
>>> /* If this is a chassis_private config update after we initialized
>>> the record
>>> @@ -935,7 +976,8 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> const char *chassis_id,
>>> const struct ovsrec_bridge *br_int,
>>> const struct sset *transport_zones,
>>> - const struct sbrec_chassis_private **chassis_private)
>>> + const struct sbrec_chassis_private **chassis_private,
>>> + struct ovsdb_idl_index *sbrec_encaps)
>>> {
>>> struct ovs_chassis_cfg ovs_cfg;
>>>
>>> @@ -956,22 +998,31 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> * modified in the ovs table.
>>> */
>>> if (chassis_rec && ovnsb_idl_txn) {
>>> - bool updated = chassis_update(chassis_rec, ovnsb_idl_txn,
>>> &ovs_cfg,
>>> - chassis_id, transport_zones);
>>> -
>>> - if (!existed || updated) {
>>> - ovsdb_idl_txn_add_comment(ovnsb_idl_txn,
>>> - "ovn-controller: %s chassis
>>> '%s'",
>>> - !existed ? "registering" :
>>> "updating",
>>> - chassis_id);
>>> - }
>>> + enum chassis_update_status update_status =
>>> + chassis_update(chassis_rec, ovnsb_idl_txn, &ovs_cfg,
>>> + chassis_id,
>>> + transport_zones, sbrec_encaps);
>>> +
>>> + *chassis_private = chassis_private_get_record(ovnsb_idl_txn,
>>> + sbrec_chassis_private_by_name,
>>> chassis_id);
>>>
>>> - *chassis_private =
>>> - chassis_private_get_record(ovnsb_idl_txn,
>>> - sbrec_chassis_private_by_name,
>>> + if (update_status == CHASSIS_NEED_DELETE) {
>>> + sbrec_chassis_delete(chassis_rec);
>>> + chassis_rec = NULL;
>>> + if (*chassis_private) {
>>> + sbrec_chassis_private_delete(*chassis_private);
>>> + }
>>>
>>>
>>> nit: We can return early here instead of nesting.
>>>
>>>
>>> + } else {
>>> + if (!existed || update_status == CHASSIS_UPDATED) {
>>> + ovsdb_idl_txn_add_comment(ovnsb_idl_txn,
>>> + "ovn-controller: %s
>>> chassis '%s'",
>>> + !existed ? "registering"
>>> + : "updating", chassis_id);
>>> + }
>>> + if (*chassis_private) {
>>> + chassis_private_update(*chassis_private, chassis_rec,
>>> chassis_id);
>>> - if (*chassis_private) {
>>> - chassis_private_update(*chassis_private, chassis_rec,
>>> chassis_id);
>>> + }
>>> }
>>> }
>>>
>>> diff --git a/controller/chassis.h b/controller/chassis.h
>>> index 03cc2f906..14251bdad 100644
>>> --- a/controller/chassis.h
>>> +++ b/controller/chassis.h
>>> @@ -44,7 +44,8 @@ const struct sbrec_chassis *chassis_run(
>>> const struct ovsrec_open_vswitch_table *,
>>> const char *chassis_id, const struct ovsrec_bridge *br_int,
>>> const struct sset *transport_zones,
>>> - const struct sbrec_chassis_private **chassis_private);
>>> + const struct sbrec_chassis_private **chassis_private,
>>> + struct ovsdb_idl_index *sbrec_chassis_encaps);
>>> bool chassis_cleanup(struct ovsdb_idl_txn *ovs_idl_txn,
>>> struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> const struct ovsrec_open_vswitch_table *,
>>> diff --git a/controller/encaps.c b/controller/encaps.c
>>> index 67f8c9c9c..fd34a8ea0 100644
>>> --- a/controller/encaps.c
>>> +++ b/controller/encaps.c
>>> @@ -549,6 +549,7 @@ create_evpn_tunnels(struct tunnel_ctx *tc)
>>>
>>> void
>>> encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
>>> + struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> const struct ovsrec_bridge *br_int,
>>> const struct sbrec_chassis_table *chassis_table,
>>> const struct sbrec_chassis *this_chassis,
>>> @@ -557,7 +558,7 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
>>> const struct sset *transport_zones,
>>> const struct ovsrec_bridge_table *bridge_table)
>>> {
>>> - if (!ovs_idl_txn || !br_int) {
>>> + if (!ovs_idl_txn || !ovnsb_idl_txn || !br_int) {
>>> return;
>>> }
>>>
>>> diff --git a/controller/encaps.h b/controller/encaps.h
>>> index 6f9891ee5..3f4a82d68 100644
>>> --- a/controller/encaps.h
>>> +++ b/controller/encaps.h
>>> @@ -30,6 +30,7 @@ struct sset;
>>>
>>> void encaps_register_ovs_idl(struct ovsdb_idl *);
>>> void encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
>>> + struct ovsdb_idl_txn *ovnsb_idl_txn,
>>> const struct ovsrec_bridge *br_int,
>>> const struct sbrec_chassis_table *,
>>> const struct sbrec_chassis *,
>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>> index c2dab41c1..6ef32d6d5 100644
>>> --- a/controller/ovn-controller.c
>>> +++ b/controller/ovn-controller.c
>>> @@ -6728,6 +6728,9 @@ main(int argc, char *argv[])
>>> struct ovsdb_idl_index *sbrec_advertised_mac_binding_index_by_dp
>>> = ovsdb_idl_index_create1(ovnsb_idl_loop.idl,
>>>
>>> &sbrec_advertised_mac_binding_col_datapath);
>>> + struct ovsdb_idl_index *sbrec_encaps
>>>
>>>
>>> nit: We should use the naming convention as for other indexes:
>>> "sbrec_encaps_index_by_ip_and_type".
>>>
>>> + = ovsdb_idl_index_create2(ovnsb_idl_loop.idl,
>>> + &sbrec_encap_col_type,
>>> &sbrec_encap_col_ip);
>>>
>>> ovsdb_idl_track_add_all(ovnsb_idl_loop.idl);
>>> ovsdb_idl_omit_alert(ovnsb_idl_loop.idl,
>>> @@ -7450,7 +7453,7 @@ main(int argc, char *argv[])
>>> sbrec_chassis_private_by_name,
>>> ovs_table, chassis_id,
>>> br_int, &transport_zones,
>>> - &chassis_private);
>>> + &chassis_private, sbrec_encaps);
>>> }
>>>
>>> /* If any OVS feature support changed, force a full
>>> recompute.
>>> @@ -7489,7 +7492,7 @@ main(int argc, char *argv[])
>>> }
>>>
>>> if (chassis && ovs_feature_set_discovered()) {
>>> - encaps_run(ovs_idl_txn, br_int,
>>> + encaps_run(ovs_idl_txn, ovnsb_idl_txn, br_int,
>>>
>>> sbrec_chassis_table_get(ovnsb_idl_loop.idl),
>>> chassis,
>>>
>>> sbrec_sb_global_first(ovnsb_idl_loop.idl),
>>> diff --git a/lib/chassis-index.c b/lib/chassis-index.c
>>> index 4b38036cb..c1e0234e3 100644
>>> --- a/lib/chassis-index.c
>>> +++ b/lib/chassis-index.c
>>> @@ -115,3 +115,23 @@ ha_chassis_group_lookup_by_name(
>>>
>>> return retval;
>>> }
>>> +
>>> +/* Finds and returns the encap with the given ip and type, or NULL
>>> if no such
>>> + * encap exists. */
>>> +struct sbrec_encap *
>>> +encap_lookup_by_ip_and_type(struct ovsdb_idl_index
>>> *sbrec_chassis_encaps,
>>> + const char *ip, const char *type)
>>> +{
>>> + struct sbrec_encap *target = sbrec_encap_index_init_row(
>>> + sbrec_chassis_encaps);
>>> +
>>> + sbrec_encap_index_set_ip(target, ip);
>>> + sbrec_encap_index_set_type(target, type);
>>> +
>>> + struct sbrec_encap *retval = sbrec_encap_index_find(
>>> + sbrec_chassis_encaps, target);
>>> +
>>> + sbrec_encap_index_destroy_row(target);
>>> +
>>> + return retval;
>>> +}
>>> diff --git a/lib/chassis-index.h b/lib/chassis-index.h
>>> index bc654da13..3c88ae631 100644
>>> --- a/lib/chassis-index.h
>>> +++ b/lib/chassis-index.h
>>> @@ -35,5 +35,8 @@ chassis_private_lookup_by_name(
>>> struct ovsdb_idl_index *ha_chassis_group_index_create(struct
>>> ovsdb_idl *idl);
>>> const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
>>> struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const
>>> char *name);
>>> +struct sbrec_encap *encap_lookup_by_ip_and_type(
>>> + struct ovsdb_idl_index *sbrec_chassis_encaps,
>>> + const char *ip, const char *type);
>>>
>>> #endif /* lib/chassis-index.h */
>>> diff --git a/tests/ovn-controller.at <http://ovn-controller.at> b/
>>> tests/ovn-controller.at <http://ovn-controller.at>
>>> index bbf939d25..9842bd976 100644
>>> --- a/tests/ovn-controller.at <http://ovn-controller.at>
>>> +++ b/tests/ovn-controller.at <http://ovn-controller.at>
>>> @@ -277,7 +277,8 @@ ovs-vsctl -- set Open_vSwitch . external-
>>> ids:hostname="${new_sysid}" \
>>> -- set Open_vSwitch . external-ids:ovn-
>>> remote="${current_remote}"
>>>
>>> OVS_WAIT_UNTIL([
>>> - grep -q 'Transaction causes multiple rows in \\"Encap\\" table
>>> to have identical values' hv/ovn-controller.log
>>> + grep -q "'hv' already has encap ip '192.168.0.1' and type
>>> 'geneve', " \
>>> + "cannot duplicate on 'hv-foo'" hv/ovn-controller.log
>>> ])
Since Ilya pointed out "geneve" could also be "vxlan", should we replace
this with:
+grep -qE "'hv' already has encap ip '192.168.0.1' and type" \
+ "'(geneve|vxlan)', cannot duplicate on 'hv-foo'" \
+ hv/ovn-controller.log
This doesn't work in the other case though, so I suppose the type could
be removed entirely from the error message. I figured it was best to
include as much information as possible, but if it's up to the order in
the hashmap that doesn't seem very important...
>>>
>>> # Destroy the stale entries manually and ovn-controller should now
>>> be able
>>> @@ -305,12 +306,12 @@ check ovn-sbctl destroy chassis_private . --
>>> destroy chassis .
>>> # - could not create datapath br-int of unknown type foobar
>>> # - unknown datapath type bar
>>> # - could not create datapath br-int of unknown type bar
>>> -# - Transaction causes multiple rows ...
>>> # - failed to create bridge br-int: Address family not supported by
>>> protocol # due to previous errors.
>>> +# - 'hv' already has encap ip '192.168.0.1' and type 'geneve',
>>> cannot duplicate on 'hv-foo'
>>> OVN_CLEANUP_SBOX([hv], ["/foo/d
>>> /bar/d
>>> -/Transaction causes multiple rows/d
>>> -/failed to create bridge/d"])
>>> +/failed to create bridge/d
>>> +/already has encap ip.*cannot duplicate on/d"])
>>>
>>>
>>> nit: We shouldn't change the order.
>>>
>>>
>>>
>>> OVN_CLEANUP_VSWITCH([main])
>>> as ovn-sb
>>> @@ -3788,3 +3789,59 @@ wait_ports 0
>>> OVN_CLEANUP([hv1
>>> /Invalid VXLAN port number.*/d])
>>> AT_CLEANUP
>>> +
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([ovn-controller - two chassis set same encap ip and type])
>>> +AT_KEYWORDS([ovn])
>>> +ovn_start
>>> +net_add n1
>>> +
>>> +sim_add hv1
>>> +as hv1
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.11
>>> +
>>> +sim_add hv2
>>> +as hv2
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.12
>>> +
>>> +# Wait for tunnels to appear in OVS, encaps to appear in sb-db, and set
>>> +# controller's jsonrpc debug logging.
>>> +as hv2
>>> +OVS_WAIT_UNTIL([test 1 = $(ovs-vsctl find Interface type=geneve |
>>> grep -c _uuid)])
>>>
>>>
>>> nit: We can use OVS_WAIT_FOR_OUTPUT instead.
>>>
>>>
>>> +wait_row_count Encap 4
>>> +as hv1
>>> +OVS_WAIT_UNTIL([test 1 = $(ovs-vsctl find Interface type=geneve |
>>> grep -c _uuid)])
>>> +wait_row_count Encap 4
>>> +ovn-appctl vlog/set jsonrpc:dbg
>>> +as hv2
>>> +ovn-appctl vlog/set jsonrpc:dbg
>>> +
>>> +# Give hv2 the same encap IP as hv1 (.11) with other valid IPs.
>>> +check ovs-vsctl set Open_vSwitch . \
>>> + external-ids:ovn-encap-
>>> ip="192.168.0.10,192.168.0.11,192.168.0.12,192.168.0.13"
>>> +
>>> +as hv1 wait_row_count Encap 2 chassis_name="hv1"
>>> +
>>> +# Make sure tunnels are created for the other three valid IPs
>>> configured.
>>> +as hv2
>>> +OVS_WAIT_UNTIL([test 3 = $(ovs-vsctl find Interface type=geneve |
>>> grep -c _uuid)])
>>> +wait_row_count Encap 6 chassis_name="hv2"
>>> +OVS_WAIT_UNTIL([check test 1 = $(cat hv2/ovn-controller.log |
>>> + grep -c "'hv1' already has encap ip '192.168.0.11'
>>> and type "
>>> + "'geneve', cannot duplicate on 'hv2'")])
>
> This check fails with different compiler flags. Depending on the
> order inside the hash map, we'll get either geneve or vxlan here.
Again, either remove the type from the warning altogether, or we could
match on only the beginning of the warning.
>
>>>
>>> +
>>> +# Make sure controller logs have correct number of transactions.
>>> +OVS_WAIT_UNTIL([check test 2 = $(cat hv2/ovn-controller.log | grep
>>> transact | wc -l)])
>>> +OVS_WAIT_UNTIL([check test 1 = $(cat hv1/ovn-controller.log | grep
>>> transact | wc -l)])
>>>
>>>
>>> I don't think the transaction count is helping TBF. IMO we can
>>> completely skip it.
>
> We can't really skipt it. For exmaple, if you make the follwoing change in
> the code on top of the suggested follow up:
>
> diff --git a/controller/chassis.c b/controller/chassis.c
> index 4afb6da93..8168a9e82 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -737,12 +737,12 @@ chassis_build_encaps(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
>
> /* Only if an Encap record does not already exist do we create a
> * new record. */
> - if (!encap_rec) {
> + //if (!encap_rec) {
> encap_rec = sbrec_encap_insert(ovnsb_idl_txn);
> sbrec_encap_set_type(encap_rec, encap_type);
> sbrec_encap_set_ip(encap_rec, encap_ip);
> sbrec_encap_set_chassis_name(encap_rec, chassis_id);
> - }
> + //}
>
> if (!smap_is_empty(&encap_rec->options) ||
> !smap_equal(&options, &encap_rec->options)) {
> ---
>
> The test is still passing, but in the short time while the test is ruuning,
> ovn-controller manages to send hundreds of transactions towards Sb,
> re-creating
> all the encaps from scratch on every iteration. Which is one of the issues
> this
> patch is trying to fix, beside the transations sent towards OVS database.
Ah yes, I was trying to figure out how to make the test pass wrongly and
this is it.
I thought removing the check for !ovnsb_idl_txn from encaps_run() would
let it pass with extra transactions but it still failed at the check for
# of tunnels.
Yes we should keep the transaction count.
>
> Best regards, Ilya Maximets.
>
--
Rosemarie O'Riorden
Lowell, MA, United States
[email protected]
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev