On 11/24/22 13:08, Ales Musil wrote:
> On Thu, Nov 24, 2022 at 12:44 PM Dumitru Ceara <[email protected]> wrote:
>
>> On 11/24/22 07:37, Ales Musil wrote:
>>> Improve the affinity code to reuse ds buffers as much as possible
>>> without constantly repeating some parts. Add ct.new for the LB flows
>>> so it is clear that the commit happens only when we have a new
>>> connection.
>>>
>>> Signed-off-by: Ales Musil <[email protected]>
>>> ---
>>
>> Hi Ales,
>>
>>> northd/northd.c | 162 ++++++++++++++++++----------------------
>>> northd/ovn-northd.8.xml | 15 ++--
>>> tests/ovn-northd.at | 10 +--
>>> tests/system-ovn.at | 8 +-
>>> 4 files changed, 91 insertions(+), 104 deletions(-)
>>>
>>> diff --git a/northd/northd.c b/northd/northd.c
>>> index 040f46e1a..188042bca 100644
>>> --- a/northd/northd.c
>>> +++ b/northd/northd.c
>>> @@ -243,7 +243,6 @@ enum ovn_stage {
>>> #define REGBIT_DST_NAT_IP_LOCAL "reg9[4]"
>>> #define REGBIT_KNOWN_ECMP_NH "reg9[5]"
>>> #define REGBIT_KNOWN_LB_SESSION "reg9[6]"
>>> -#define REG
>>
>> Oops, I thought I had fixed this up before applying the original patch.
>> I guess not. Thanks for spotting it!
>>
>>>
>>> /* Register to store the eth address associated to a router port for
>> packets
>>> * received in S_ROUTER_IN_ADMISSION.
>>> @@ -6963,6 +6962,7 @@ build_lb_affinity_flows(struct hmap *lflows,
>> struct ovn_northd_lb *lb,
>>> return;
>>> }
>>>
>>> + static char *aff_check = REGBIT_KNOWN_LB_SESSION" = chk_lb_aff();
>> next;";
>>> enum ovn_stage stage0 = router_pipeline ?
>>> S_ROUTER_IN_LB_AFF_CHECK : S_SWITCH_IN_LB_AFF_CHECK;
>>> struct ovn_lflow *lflow_ref_aff_check = NULL;
>>> @@ -6970,80 +6970,102 @@ build_lb_affinity_flows(struct hmap *lflows,
>> struct ovn_northd_lb *lb,
>>> * tuple and we are in affinity timeslot. */
>>> uint32_t hash_aff_check = ovn_logical_flow_hash(
>>> ovn_stage_get_table(stage0),
>> ovn_stage_get_pipeline(stage0), 100,
>>> - check_lb_match, REGBIT_KNOWN_LB_SESSION" = chk_lb_aff();
>> next;");
>>> + check_lb_match, aff_check);
>>>
>>> for (size_t i = 0; i < n_dplist; i++) {
>>> if (!ovn_dp_group_add_with_reference(lflow_ref_aff_check,
>> dplist[i])) {
>>> lflow_ref_aff_check = ovn_lflow_add_at_with_hash(
>>> lflows, dplist[i], stage0, 100, check_lb_match,
>>> - REGBIT_KNOWN_LB_SESSION" = chk_lb_aff(); next;",
>>> - NULL, NULL, &lb->nlb->header_,
>>> + aff_check, NULL, NULL, &lb->nlb->header_,
>>> OVS_SOURCE_LOCATOR, hash_aff_check);
>>> }
>>> }
>>>
>>> + const char *reg_vip;
>>> + const char *reg_backend;
>>> +
>>> + struct ds aff_action = DS_EMPTY_INITIALIZER;
>>> struct ds aff_action_learn = DS_EMPTY_INITIALIZER;
>>> - struct ds aff_match_learn = DS_EMPTY_INITIALIZER;
>>> - struct ds aff_action_lb_common = DS_EMPTY_INITIALIZER;
>>> - struct ds aff_action_lb = DS_EMPTY_INITIALIZER;
>>> struct ds aff_match = DS_EMPTY_INITIALIZER;
>>> + struct ds aff_match_learn = DS_EMPTY_INITIALIZER;
>>>
>>> bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&lb_vip->vip);
>>> - const char *reg_vip;
>>> + const char *ip_match = ipv6 ? "ip6" : "ip4";
>>> +
>>> + stage0 =
>>> + router_pipeline ? S_ROUTER_IN_LB_AFF_LEARN :
>> S_SWITCH_IN_LB_AFF_LEARN;
>>> + enum ovn_stage stage1 =
>>> + router_pipeline ? S_ROUTER_IN_DNAT : S_SWITCH_IN_LB;
>>> +
>>> if (router_pipeline) {
>>> reg_vip = ipv6 ? REG_NEXT_HOP_IPV6 : REG_NEXT_HOP_IPV4;
>>> + reg_backend =
>>> + ipv6 ? REG_LB_L3_AFF_BACKEND_IP6 : REG_LB_AFF_BACKEND_IP4;
>>> } else {
>>> reg_vip = ipv6 ? REG_ORIG_DIP_IPV6 : REG_ORIG_DIP_IPV4;
>>> + reg_backend =
>>> + ipv6 ? REG_LB_L2_AFF_BACKEND_IP6 : REG_LB_AFF_BACKEND_IP4;
>>> }
>>>
>>> - ds_put_format(&aff_action_lb_common,
>>> - REGBIT_CONNTRACK_COMMIT" = 0; %s = %s; ",
>>> + /* Prepare common part of affinity LB and affinity learn action. */
>>> + ds_put_format(&aff_action, REGBIT_CONNTRACK_COMMIT" = 0; %s = %s; ",
>>> reg_vip, lb_vip->vip_str);
>>> + ds_put_cstr(&aff_action_learn, "commit_lb_aff(vip = \"");
>>> +
>>> if (lb_vip->vip_port) {
>>> - ds_put_format(&aff_action_lb_common, REG_ORIG_TP_DPORT" = %d; ",
>>> + ds_put_format(&aff_action, REG_ORIG_TP_DPORT" = %d; ",
>>> lb_vip->vip_port);
>>> + ds_put_format(&aff_action_learn, ipv6 ? "[%s]:%d" : "%s:%d",
>>> + lb_vip->vip_str, lb_vip->vip_port);
>>> + } else {
>>> + ds_put_cstr(&aff_action_learn, lb_vip->vip_str);
>>> }
>>>
>>> if (lb_action) {
>>> - ds_put_format(&aff_action_lb_common, "%s;", lb_action);
>>> + ds_put_cstr(&aff_action, lb_action);
>>> }
>>> + ds_put_cstr(&aff_action, "ct_lb_mark(backends=");
>>> + ds_put_cstr(&aff_action_learn, "\", backend = \"");
>>> +
>>> + /* Prepare common part of affinity learn match. */
>>> + ds_put_format(&aff_match_learn, REGBIT_KNOWN_LB_SESSION" == 0 && "
>>> + "ct.new && %s && %s == %s && %s.dst == ", ip_match,
>>> + reg_vip, lb_vip->vip_str, ip_match);
>>> +
>>> + /* Prepare common part of affinity match. */
>>> + ds_put_format(&aff_match, REGBIT_KNOWN_LB_SESSION" == 1 && "
>>> + "ct.new && %s && %s == ", ip_match, reg_backend);
>>> +
>>> + /* Store the common part length. */
>>> + size_t aff_action_len = aff_action.length;
>>> + size_t aff_action_learn_len = aff_action_learn.length;
>>> + size_t aff_match_len = aff_match.length;
>>> + size_t aff_match_learn_len = aff_match_learn.length;
>>> +
>>>
>>> - stage0 = router_pipeline
>>> - ? S_ROUTER_IN_LB_AFF_LEARN : S_SWITCH_IN_LB_AFF_LEARN;
>>> - enum ovn_stage stage1 = router_pipeline
>>> - ? S_ROUTER_IN_DNAT : S_SWITCH_IN_LB;
>>> for (size_t i = 0; i < lb_vip->n_backends; i++) {
>>> struct ovn_lb_backend *backend = &lb_vip->backends[i];
>>>
>>> - /* Forward to OFTABLE_CHK_LB_AFFINITY table to store flow
>> tuple. */
>>> - ds_put_format(&aff_match_learn,
>>> - REGBIT_KNOWN_LB_SESSION" == 0 && "
>>> - "ct.new && %s && %s.dst == %s && %s == %s",
>>> - ipv6 ? "ip6" : "ip4", ipv6 ? "ip6" : "ip4",
>>> - backend->ip_str, reg_vip, lb_vip->vip_str);
>>> + ds_put_cstr(&aff_match_learn, backend->ip_str);
>>> + ds_put_cstr(&aff_match, backend->ip_str);
>>> +
>>> if (backend->port) {
>>> + ds_put_format(&aff_action, ipv6 ? "[%s]:%d" : "%s:%d",
>>> + backend->ip_str, backend->port);
>>> + ds_put_format(&aff_action_learn, ipv6 ? "[%s]:%d" : "%s:%d",
>>> + backend->ip_str, backend->port);
>>> +
>>> ds_put_format(&aff_match_learn, " && %s.dst == %d",
>>> lb->proto, backend->port);
>>> - }
>>> -
>>> - if (lb_vip->vip_port) {
>>> - ds_put_format(&aff_action_learn,
>>> - "commit_lb_aff(vip = \"%s%s%s:%d\"",
>>> - ipv6 ? "[" : "", lb_vip->vip_str, ipv6 ? "]"
>> : "",
>>> - lb_vip->vip_port);
>>> + ds_put_format(&aff_match, " && "REG_LB_AFF_MATCH_PORT" ==
>> %d",
>>> + backend->port);
>>> } else {
>>> - ds_put_format(&aff_action_learn, "commit_lb_aff(vip =
>> \"%s\"",
>>> - lb_vip->vip_str);
>>> + ds_put_cstr(&aff_action, backend->ip_str);
>>> + ds_put_cstr(&aff_action_learn, backend->ip_str);
>>> }
>>>
>>> - if (backend->port) {
>>> - ds_put_format(&aff_action_learn,", backend = \"%s%s%s:%d\"",
>>> - ipv6 ? "[" : "", backend->ip_str,
>>> - ipv6 ? "]" : "", backend->port);
>>> - } else {
>>> - ds_put_format(&aff_action_learn,", backend = \"%s\"",
>>> - backend->ip_str);
>>> - }
>>> + ds_put_cstr(&aff_action, ");");
>>> + ds_put_char(&aff_action_learn, '"');
>>>
>>> if (lb_vip->vip_port) {
>>> ds_put_format(&aff_action_learn, ", proto = %s", lb->proto);
>>> @@ -7057,41 +7079,13 @@ build_lb_affinity_flows(struct hmap *lflows,
>> struct ovn_northd_lb *lb,
>>> ovn_stage_get_table(stage0),
>> ovn_stage_get_pipeline(stage0),
>>> 100, ds_cstr(&aff_match_learn),
>> ds_cstr(&aff_action_learn));
>>>
>>> - const char *reg_backend;
>>> - if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) {
>>> - reg_backend = REG_LB_AFF_BACKEND_IP4;
>>> - } else {
>>> - reg_backend = router_pipeline
>>> - ? REG_LB_L3_AFF_BACKEND_IP6 : REG_LB_L2_AFF_BACKEND_IP6;
>>> - }
>>> -
>>> - /* Use already selected backend within affinity
>>> - * timeslot. */
>>> - if (backend->port) {
>>> - ds_put_format(&aff_match,
>>> - REGBIT_KNOWN_LB_SESSION" == 1 && %s && %s == %s "
>>> - "&& "REG_LB_AFF_MATCH_PORT" == %d",
>>> - IN6_IS_ADDR_V4MAPPED(&lb_vip->vip) ? "ip4" : "ip6",
>>> - reg_backend, backend->ip_str, backend->port);
>>> - ds_put_format(&aff_action_lb, "%s
>> ct_lb_mark(backends=%s%s%s:%d);",
>>> - ds_cstr(&aff_action_lb_common),
>>> - ipv6 ? "[" : "", backend->ip_str,
>>> - ipv6 ? "]" : "", backend->port);
>>> - } else {
>>> - ds_put_format(&aff_match,
>>> - REGBIT_KNOWN_LB_SESSION" == 1 && %s && %s == %s",
>>> - IN6_IS_ADDR_V4MAPPED(&lb_vip->vip) ? "ip4" : "ip6",
>>> - reg_backend, backend->ip_str);
>>> - ds_put_format(&aff_action_lb, "%s ct_lb_mark(backends=%s);",
>>> - ds_cstr(&aff_action_lb_common),
>> backend->ip_str);
>>> - }
>>> -
>>> struct ovn_lflow *lflow_ref_aff_lb = NULL;
>>> uint32_t hash_aff_lb = ovn_logical_flow_hash(
>>> ovn_stage_get_table(stage1),
>> ovn_stage_get_pipeline(stage1),
>>> - 150, ds_cstr(&aff_match), ds_cstr(&aff_action_lb));
>>> + 150, ds_cstr(&aff_match), ds_cstr(&aff_action));
>>>
>>> for (size_t j = 0; j < n_dplist; j++) {
>>> + /* Forward to OFTABLE_CHK_LB_AFFINITY table to store flow
>> tuple. */
>>> if (!ovn_dp_group_add_with_reference(lflow_ref_aff_learn,
>>> dplist[j])) {
>>> lflow_ref_aff_learn = ovn_lflow_add_at_with_hash(
>>> @@ -7100,35 +7094,27 @@ build_lb_affinity_flows(struct hmap *lflows,
>> struct ovn_northd_lb *lb,
>>> NULL, NULL, &lb->nlb->header_,
>> OVS_SOURCE_LOCATOR,
>>> hash_aff_learn);
>>> }
>>> - if (!ovn_dp_group_add_with_reference(lflow_ref_aff_learn,
>>> - dplist[j])) {
>>> - lflow_ref_aff_learn = ovn_lflow_add_at_with_hash(
>>> - lflows, dplist[j], stage0, 100,
>>> - ds_cstr(&aff_match_learn),
>> ds_cstr(&aff_action_learn),
>>> - NULL, NULL, &lb->nlb->header_,
>> OVS_SOURCE_LOCATOR,
>>> - hash_aff_learn);
>>> - }
>>> + /* Use already selected backend within affinity timeslot. */
>>> if (!ovn_dp_group_add_with_reference(lflow_ref_aff_lb,
>>> dplist[j])) {
>>> lflow_ref_aff_lb = ovn_lflow_add_at_with_hash(
>>> - lflows, dplist[j], stage1, 150,
>> ds_cstr(&aff_match),
>>> - ds_cstr(&aff_action_lb), NULL, NULL,
>>> - &lb->nlb->header_, OVS_SOURCE_LOCATOR,
>>> - hash_aff_lb);
>>> + lflows, dplist[j], stage1, 150, ds_cstr(&aff_match),
>>> + ds_cstr(&aff_action), NULL, NULL,
>>> + &lb->nlb->header_, OVS_SOURCE_LOCATOR,
>>> + hash_aff_lb);
>>> }
>>> }
>>>
>>> - ds_clear(&aff_action_learn);
>>> - ds_clear(&aff_match_learn);
>>> - ds_clear(&aff_action_lb);
>>> - ds_clear(&aff_match);
>>> + ds_truncate(&aff_action, aff_action_len);
>>> + ds_truncate(&aff_action_learn, aff_action_learn_len);
>>> + ds_truncate(&aff_match, aff_match_len);
>>> + ds_truncate(&aff_match_learn, aff_match_learn_len);
>>> }
>>>
>>> + ds_destroy(&aff_action);
>>> ds_destroy(&aff_action_learn);
>>> - ds_destroy(&aff_match_learn);
>>> - ds_destroy(&aff_action_lb_common);
>>> - ds_destroy(&aff_action_lb);
>>> ds_destroy(&aff_match);
>>> + ds_destroy(&aff_match_learn);
>>> }
>>
>>
>> This part is correct. I have some doubts about readability though.
>> FWIW I had doubts about readability in the original code too.
>>
>> With that in mind, I'm thinking of an alternative:
>>
>>
>> https://github.com/dceara/ovn/commit/fb9096923e98a31a44177e4446528bc1bf478e85
>>
>> This adds a bit of code duplication but, in my opinion, it shows what
>> flows we generate in a more readable way. Alternatively, I could just
>> try to merge the two comments I added and prefix
>> build_lb_affinity_flows() with that in your version.
>>
>> What do you think?
>>
>> Thanks,
>> Dumitru
>>
>>
> Hi Dumitru,
>
> separating it into two functions makes more sense. It does not feel that
> packed.
>
I've been running some tests with this version. It's a bit buggy. It
turns out yours had a bug too due to:
>>> if (lb_action) {
>>> - ds_put_format(&aff_action_lb_common, "%s;", lb_action);
>>> + ds_put_cstr(&aff_action, lb_action);
>>> }
Let's continue with a formal v2 to avoid confusion. Let's also see what
other maintainers think about accepting a refactor patch after soft
freeze. From my perspective that's OK because the feature you're
touching is new as well.
But maybe Mark, Numan, Han, others have other opinions.
Thanks,
Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev