On 9/18/24 12:31, Ilya Maximets wrote:
> On 9/18/24 12:26, Ilya Maximets wrote:
>> On 9/18/24 12:06, Dumitru Ceara wrote:
>>> On 9/17/24 19:00, Rosemarie O'Riorden wrote:
>>>> When preparing to build ECMP and static route flows, routes are sorted
>>>> into unique routes (meaning they are not part of a group) or they are
>>>> added to EMCP groups. Then, ECMP route flows are built out of the
>>>> groups, and static route flows are built out of the unique routes.
>>>> However, 'unique routes' include ones that use the
>>>> --ecmp-symmetric-reply flag, meaning that they may not be added to an
>>>> ECMP group, and thus ECMP symmetric reply would not be used for those
>>>> flows.
>>>>
>>>> For example, if one route is added and traffic is started, and then
>>>> later another route is added, the already flowing traffic might be
>>>> rerouted since it wasn't conntracked initially. This could break
>>>> symmetric reply with traffic using a different next-hop than before.
>>>>
>>>> This change makes it so that when the --ecmp-symmetric-reply flag is
>>>> used, even for unique routes, an ECMP group is created which they are
>>>> added to. Thus they are added to the ECMP route flow, rather than
>>>> static. This allows ECMP groups to persist even when there is only one
>>>> route.
>>>>
>>>> Edited documentation to support this change.
>>>> Also updated incorrect actions in documentation.
>>>>
>>>> Fixes: 4fdca656857d ("Add ECMP symmetric replies.")
>>>> Reported-at: https://issues.redhat.com/browse/FDP-786
>>>> Signed-off-by: Rosemarie O'Riorden <[email protected]>
>>>> ---
>>>
>>> Hi Rosemarie,
>>>
>>> Thanks for the fix!
>>>
>>>>  northd/northd.c         | 33 ++++++++++++++++++++++-----------
>>>>  northd/ovn-northd.8.xml | 13 ++++++++++++-
>>>>  tests/ovn-northd.at     |  5 ++++-
>>>>  3 files changed, 38 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>> index 983c464eb..8ae3a75bd 100644
>>>> --- a/northd/northd.c
>>>> +++ b/northd/northd.c
>>>> @@ -11567,21 +11567,27 @@ build_ecmp_route_flow(struct lflow_table 
>>>> *lflows, struct ovn_datapath *od,
>>>>  
>>>>      struct ds actions = DS_EMPTY_INITIALIZER;
>>>>      ds_put_format(&actions, "ip.ttl--; flags.loopback = 1; %s = %"PRIu16
>>>> -                  "; %s = select(", REG_ECMP_GROUP_ID, eg->id,
>>>> -                  REG_ECMP_MEMBER_ID);
>>>> +                  "; %s = ", REG_ECMP_GROUP_ID, eg->id, 
>>>> REG_ECMP_MEMBER_ID);
>>>>  
>>>> -    bool is_first = true;
>>>> -    LIST_FOR_EACH (er, list_node, &eg->route_list) {
>>>> -        if (is_first) {
>>>> -            is_first = false;
>>>> -        } else {
>>>> -            ds_put_cstr(&actions, ", ");
>>>> +    if (!ovs_list_is_singleton(&eg->route_list)) {
>>>> +        bool is_first = true;
>>>> +
>>>> +        ds_put_cstr(&actions, "select(");
>>>> +        LIST_FOR_EACH (er, list_node, &eg->route_list) {
>>>> +            if (is_first) {
>>>> +                is_first = false;
>>>> +            } else {
>>>> +                ds_put_cstr(&actions, ", ");
>>>> +            }
>>>> +            ds_put_format(&actions, "%"PRIu16, er->id);
>>>>          }
>>>> -        ds_put_format(&actions, "%"PRIu16, er->id);
>>>> +        ds_put_cstr(&actions, ");");
>>>> +    } else {
>>>> +        er = CONTAINER_OF(ovs_list_front(&eg->route_list),
>>>> +                          struct ecmp_route_list_node, list_node);
>>>> +        ds_put_format(&actions, "%"PRIu16"; next;", er->id);
>>>>      }
>>>>  
>>>> -    ds_put_cstr(&actions, ");");
>>>> -
>>>
>>> Isn't it simpler to just allow "select(<single-value>)" logical actions?
>>>  It doesn't feel too wrong to allow select to select a value from a set
>>> that only contains a single value.
>>>
>>> As far as I can tell, from a performance perspective this doesn't make
>>> much difference.  Also, in general, we expect "ecmp routes with a single
>>> path" to be a transient state, e.g., due to maintenance some paths are
>>> brought down for a finite duration of time.
>>>
>>> We do have to remove the restriction in `parse_select_action()` to allow
>>> ovn-controller to parse actions of form "select(<single-value>)".
>>>
>>> Then we don't need this special treatment for the single next hop case here.
>>
>> Wouldn't this require even more special treatment in a form of a
>> feature flag that ovn-controller reports so we don't generate actions
>> it doesn't understand?
>>
>> Might be painful, especially if we're going to backport this fix
>> to stable releases.

Good point.  Although our official upgrade policy still states that
ovn-controller should be upgraded first or version pinning should be
enabled (in this case ovn-controller will not try to parse the new SB
flow until it is restarted).

https://docs.ovn.org/en/latest/intro/install/ovn-upgrades.html#upgrade-procedures

This bug report came from OpenShift and there, as far as I know, there's
a guarantee that the ovn-controller version will always be >= to the
ovn-northd version.  To my knowledge they're the only users of
ecmp-symmetric-reply but of course we can't know that 100% sure.

> 
> Also, there is already a similar code in northd for router policies.  So,
> it would make more sense to change both at once outside of a bug fix.
> 

It's a bit different there as there's no concept of "equal cost" policy
so we can't really know if the intention is to add more paths in the
future or not.  While with routes, IMO, it doesn't make sense to have a
"single path equal cost multi-path" route - except when that state is
transient.

But given that we have this similar code for policies, we might as well
do it for routes too as Rosemarie's patch does.

I'd still prefer to have system tests to cover the single path case too
though.

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to