On Thu, May 30, 2024 at 11:57 Roberto Bartzen Acosta <[email protected]> wrote:
>
> On Thu, May 2, 2024 at 10:11 Numan Siddique <[email protected]> wrote:
>
>> On Fri, Mar 22, 2024 at 7:51 AM Roberto Bartzen Acosta
>> <[email protected]> wrote:
>> >
>> > Hi Mark,
>> >
>> > Thanks for your feedback.
>> >
>> > On Mon, Mar 18, 2024 at 11:53, Mark Michelson <[email protected]> wrote:
>> > >
>> > > Hi Roberto,
>> > >
>> > > I have some concerns about this patch. Let's use the test case you
>> > > added as an example network, and bind the VMs and DGPs to hypervisors:
>> > >
>> > > * vm1 and lr1-ts1 are bound to hypervisor hv1
>> > > * vm2 and lr1-ts2 are bound to hypervisor hv2
>> > >
>> > > Now imagine a packet arrives on lr1-ts1. The packet gets load balanced
>> > > and sent to vm2. In this case, since lr1-ts1 is on hv1, the
>> > > ct_lb_mark() action runs there, creating a conntrack entry on hv1.
>> > > However, the packet is eventually tunneled to hv2 so that it can be
>> > > output to vm2.
>> > >
>> > > Now vm2 replies to the packet. Is there anything that ensures that the
>> > > reply from vm2 gets sent to hv1 for the egress pipeline of lr1? If so,
>> > > then this should work fine, since the packet will be unDNATted as
>> > > expected on hv1. However, if the packet runs all of lr1's pipelines on
>> > > hv2, then there is no conntrack entry present, and the attempt to
>> > > unDNAT will not succeed. The packet will either be dropped because of
>> > > invalid CT, or the packet will have an incorrect source IP and port.
>> > > Either way, this isn't what is desired.

Hi @Mark Michelson <[email protected]>, I solved the problem that you
pointed out about CT with multiple chassis hosting the DGP in v2 of this
patch. Can you please take a look and provide some feedback?

Thanks,
Roberto

>> >
>> > Yep, you're right! That makes sense.
>> > If the packet comes back through a chassis that does not have the LB
>> > conntrack entry corresponding to the initial request, this will trigger
>> > a miss in the pipeline.
>> >
>> > I tried to disable CT tracking for the load balancer by configuring the
>> > parameter on the router, but I still wasn't successful. E.g.:
>> > options:lb_force_snat_ip="172.16.200.201"
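>> >
>> > (For reference, a minimal sketch of how I applied that option -- the
>> > router name lr1 here is illustrative, only the IP is from my setup:)
>> >
>> >   # Sketch: force SNAT of load-balanced traffic to a fixed IP on the
>> >   # router, instead of relying on per-chassis conntrack for replies.
>> >   ovn-nbctl set logical_router lr1 options:lb_force_snat_ip="172.16.200.201"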
>> >
>> > Even changing the lb_force_snat_ip on the router, or creating a
>> > "stateless" SNAT rule, I still see this action in the output pipeline in
>> > the highest-priority table (table=1):
>> >
>> > table=1 (lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-01" && is_chassis_resident("cr-incus-net40-lr-lrp-01")), action=(ct_dnat_in_czone;)
>> > table=1 (lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-02" && is_chassis_resident("cr-incus-net40-lr-lrp-02")), action=(ct_dnat_in_czone;)
>> > table=1 (lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-03" && is_chassis_resident("cr-incus-net40-lr-lrp-03")), action=(ct_dnat_in_czone;)
>> > table=1 (lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-04" && is_chassis_resident("cr-incus-net40-lr-lrp-04")), action=(ct_dnat_in_czone;)
>> > table=1 (lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 10.195.185.2 && tcp.src == 8000) || (ip4.src == 10.195.185.3 && tcp.src == 8000) || (ip4.src == 10.195.185.4 && tcp.src == 8000) || (ip4.src == 10.195.185.5 && tcp.src == 8000) || (ip4.src == 10.195.185.6 && tcp.src == 8000)) && outport == "incus-net40-lr-lrp-ext" && is_chassis_resident("cr-incus-net40-lr-lrp-ext")), action=(ct_dnat_in_czone;)
>> >
>> > Considering that the return path when using multiple DGPs is probably
>> > associated with ECMP, any idea on how to change the rules so that the LB
>> > output rules are 'stateless' (not CT dependent) in this case with
>> > multiple DGPs? I imagine this would solve the problem and guarantee a
>> > working return path through any chassis.
>>
>> I think the only way to solve your problem is to add NAT support for a
>> router having multiple DGPs.
>> I'm not sure how easy that is or if it is even possible to support.
>> We need to explore this.
>
> Hi Numan, thanks for your feedback.
>
> I'm not convinced that adding stateful NAT support to a distributed port
> would work and/or be a good solution. It seems to me that for this use
> case, the NAT rules should be stateless, right?
> In addition, I believe that stateless rules are already supported for
> load-balancer/router configs; I just need to understand how to use them
> together with multiple DGPs (the configuration I made doesn't seem to have
> worked well).
>
> Do you think this (stateless NAT) would be a better option to use with
> ECMP? I mean, traffic that can go in/out via any chassis is usually not
> handled well by stateful NAT rules (and it doesn't make much sense in my
> opinion).
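>
> (To make that concrete, a minimal sketch of the kind of stateless rule I
> have in mind, using the --stateless flag of ovn-nbctl lr-nat-add; the VIP
> and backend addresses are hypothetical, and I have not verified this with
> multiple DGPs:)
>
>   # Sketch: stateless 1:1 NAT between an external VIP and one backend,
>   # so the translation does not depend on per-chassis conntrack state.
>   ovn-nbctl --stateless lr-nat-add lr1 dnat_and_snat 30.0.0.1 172.16.0.101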
>
> @Ilya Maximets <[email protected]>, do you have any ideas here?
>
> Kind regards,
> Roberto
>
>> Numan
>>
>> > Thanks,
>> > Roberto
>> >
>> > PS: I have a complex load-balancer scenario for the use case with
>> > multiple DGPs and 'internal' VIP addresses + OVN interconnect. I can
>> > explain the general context if you are interested :)
>> >
>> > > On 2/19/24 16:33, Roberto Bartzen Acosta wrote:
>> > > > This commit fixes the build_distr_lrouter_nat_flows_for_lb function
>> > > > to include one NAT flow entry for each DGP in use. Since we have
>> > > > added support for creating multiple gateway ports per logical
>> > > > router, it's necessary to include in the LR NAT rules pipeline a
>> > > > specific entry for each attached DGP. Otherwise, the ingress
>> > > > traffic is only redirected when the incoming LRP matches the
>> > > > chassis_resident field.
>> > > >
>> > > > Considering that DNAT rules for DGPs were implemented with the need
>> > > > to configure the DGP-related gateway-port column, the load-balancer
>> > > > NAT rule configuration can use a similar idea. In this case, we
>> > > > don't know which LRP will receive the incoming traffic, and
>> > > > therefore we must apply the load balancer's automatically created
>> > > > NAT rule on all DGPs to allow the incoming traffic.
>> > > >
>> > > > Reported-at: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2054322
>> > > > Fixes: 15348b7b806f ("ovn-northd: Multiple distributed gateway port support.")
>> > > > Signed-off-by: Roberto Bartzen Acosta <[email protected]>
>> > > > ---
>> > > >  northd/en-lr-stateful.c | 12 ------
>> > > >  northd/northd.c         | 14 ++++---
>> > > >  tests/ovn-northd.at     | 92 +++++++++++++++++++++++++++++++++++++++++
>> > > >  3 files changed, 100 insertions(+), 18 deletions(-)
>> > > >
>> > > > diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
>> > > > index 6d0192487..7ffa4a690 100644
>> > > > --- a/northd/en-lr-stateful.c
>> > > > +++ b/northd/en-lr-stateful.c
>> > > > @@ -537,18 +537,6 @@ lr_stateful_record_create(struct lr_stateful_table *table,
>> > > >
>> > > >      table->array[od->index] = lr_stateful_rec;
>> > > >
>> > > > -    /* Load balancers are not supported (yet) if a logical router has multiple
>> > > > -     * distributed gateway port. Log a warning. */
>> > > > -    if (lr_stateful_rec->has_lb_vip && lr_has_multiple_gw_ports(od)) {
>> > > > -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>> > > > -        VLOG_WARN_RL(&rl, "Load-balancers are configured on logical "
>> > > > -                     "router %s, which has %"PRIuSIZE" distributed "
>> > > > -                     "gateway ports. Load-balancer is not supported "
>> > > > -                     "yet when there is more than one distributed "
>> > > > -                     "gateway port on the router.",
>> > > > -                     od->nbr->name, od->n_l3dgw_ports);
>> > > > -    }
>> > > > -
>> > > >      return lr_stateful_rec;
>> > > >  }
>> > > >
>> > > > diff --git a/northd/northd.c b/northd/northd.c
>> > > > index 2c3560ce2..7eb943d2f 100644
>> > > > --- a/northd/northd.c
>> > > > +++ b/northd/northd.c
>> > > > @@ -10919,10 +10919,9 @@ static void
>> > > >  build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
>> > > >                                       enum lrouter_nat_lb_flow_type type,
>> > > >                                       struct ovn_datapath *od,
>> > > > -                                     struct lflow_ref *lflow_ref)
>> > > > +                                     struct lflow_ref *lflow_ref,
>> > > > +                                     struct ovn_port *dgp)
>> > > >  {
>> > > > -    struct ovn_port *dgp = od->l3dgw_ports[0];
>> > > > -
>> > > >      const char *undnat_action;
>> > > >
>> > > >      switch (type) {
>> > > > @@ -10953,7 +10952,7 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
>> > > >
>> > > >      if (ctx->lb_vip->n_backends || !ctx->lb_vip->empty_backend_rej) {
>> > > >          ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
>> > > > -                      od->l3dgw_ports[0]->cr_port->json_key);
>> > > > +                      dgp->cr_port->json_key);
>> > > >      }
>> > > >
>> > > >      ovn_lflow_add_with_hint__(ctx->lflows, od, S_ROUTER_IN_DNAT, ctx->prio,
>> > > > @@ -11164,8 +11163,11 @@ build_lrouter_nat_flows_for_lb(
>> > > >          if (!od->n_l3dgw_ports) {
>> > > >              bitmap_set1(gw_dp_bitmap[type], index);
>> > > >          } else {
>> > > > -            build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
>> > > > -                                                 lb_dps->lflow_ref);
>> > > > +            for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
>> > > > +                struct ovn_port *dgp = od->l3dgw_ports[i];
>> > > > +                build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
>> > > > +                                                     lb_dps->lflow_ref, dgp);
>> > > > +            }
>> > > >          }
>> > > >
>> > > >          if (lb->affinity_timeout) {
>> > > >
>> > > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> > > > index 6fdd761da..fa24935e1 100644
>> > > > --- a/tests/ovn-northd.at
>> > > > +++ b/tests/ovn-northd.at
>> > > > @@ -12313,3 +12313,95 @@ check_engine_stats northd recompute nocompute
>> > > >  check_engine_stats lflow recompute nocompute
>> > > >
>> > > >  AT_CLEANUP
>> > > > +
>> > > > +OVN_FOR_EACH_NORTHD_NO_HV([
>> > > > +AT_SETUP([Load balancer with Distributed Gateway Ports (DGP)])
>> > > > +ovn_start
>> > > > +
>> > > > +check ovn-nbctl ls-add public
>> > > > +check ovn-nbctl lr-add lr1
>> > > > +
>> > > > +# lr1 DGP ts1
>> > > > +check ovn-nbctl ls-add ts1
>> > > > +check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 172.16.10.1/24
>> > > > +check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-2
>> > > > +
>> > > > +# lr1 DGP ts2
>> > > > +check ovn-nbctl ls-add ts2
>> > > > +check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 172.16.20.1/24
>> > > > +check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-3
>> > > > +
>> > > > +# lr1 DGP public
>> > > > +check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 173.16.0.1/16
>> > > > +check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 172.16.0.1/24
>> > > > +check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
>> > > > +
>> > > > +check ovn-nbctl ls-add s1
>> > > > +# s1 - lr1
>> > > > +check ovn-nbctl lsp-add s1 s1_lr1
>> > > > +check ovn-nbctl lsp-set-type s1_lr1 router
>> > > > +check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 172.16.0.1"
>> > > > +check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
>> > > > +
>> > > > +# s1 - backend vm1
>> > > > +check ovn-nbctl lsp-add s1 vm1
>> > > > +check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 172.16.0.101"
>> > > > +
>> > > > +# s1 - backend vm2
>> > > > +check ovn-nbctl lsp-add s1 vm2
>> > > > +check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 172.16.0.102"
>> > > > +
>> > > > +# s1 - backend vm3
>> > > > +check ovn-nbctl lsp-add s1 vm3
>> > > > +check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 172.16.0.103"
>> > > > +
>> > > > +# Add the lr1 DGP ts1 to the public switch
>> > > > +check ovn-nbctl lsp-add public public_lr1_ts1
>> > > > +check ovn-nbctl lsp-set-type public_lr1_ts1 router
>> > > > +check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
>> > > > +check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
>> > > > +
>> > > > +# Add the lr1 DGP ts2 to the public switch
>> > > > +check ovn-nbctl lsp-add public public_lr1_ts2
>> > > > +check ovn-nbctl lsp-set-type public_lr1_ts2 router
>> > > > +check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
>> > > > +check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
>> > > > +
>> > > > +# Add the lr1 DGP public to the public switch
>> > > > +check ovn-nbctl lsp-add public public_lr1
>> > > > +check ovn-nbctl lsp-set-type public_lr1 router
>> > > > +check ovn-nbctl lsp-set-addresses public_lr1 router
>> > > > +check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
>> > > > +
>> > > > +# Create the Load Balancer lb1
>> > > > +check ovn-nbctl --wait=sb lb-add lb1 "30.0.0.1" "172.16.0.103,172.16.0.102,172.16.0.101"
>> > > > +
>> > > > +# Associate load balancer to s1
>> > > > +check ovn-nbctl ls-lb-add s1 lb1
>> > > > +check ovn-nbctl --wait=sb sync
>> > > > +
>> > > > +ovn-sbctl dump-flows s1 > s1flows
>> > > > +AT_CAPTURE_FILE([s1flows])
>> > > > +
>> > > > +AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
>> > > > +  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && ip4.dst == 30.0.0.1), action=(reg1 = 30.0.0.1; ct_lb_mark;)
>> > > > +])
>> > > > +AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
>> > > > +  table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip4.dst == 30.0.0.1), action=(reg0[[1]] = 0; ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
>> > > > +])
>> > > > +
>> > > > +# Associate load balancer to lr1 with DGP
>> > > > +check ovn-nbctl lr-lb-add lr1 lb1
>> > > > +check ovn-nbctl --wait=sb sync
>> > > > +
>> > > > +ovn-sbctl dump-flows lr1 > lr1flows
>> > > > +AT_CAPTURE_FILE([lr1flows])
>> > > > +
>> > > > +AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
>> > > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts1")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
>> > > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts2")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
>> > > > +  table=??(lr_in_dnat         ), priority=110  , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1_public")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
>> > > > +])
>> > > > +
>> > > > +AT_CLEANUP
>> > > > +])

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
