On Wed, Oct 30, 2024 at 4:11 PM Tiago Pires <[email protected]> wrote:
> Hey Ales, > > Do you know if there is a way to reproduce the huge OF message that > ovn-controller sends to ovs-vswitchd? > I have a similar issue that this patch solve but I was not be able to > reproduce it: > > #ovn-controller > 5378238|ofctrl|INFO|OpenFlow error: OFPT_ERROR (OF1.5) > (xid=0xbd25708): OFPBFC_MSG_BAD_LEN > OFPT_BUNDLE_ADD_MESSAGE (OF1.5) (xid=0xbd25708): ***decode error: > OFPBFC_MSG_BAD_LEN*** > 00000000 06 22 00 10 0b d2 57 08-00 7a d1 b1 00 00 00 03 > |."....W..z......| > 5378239|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: disconnecting > > #OVS > 1580559|connmgr|INFO|br-int<->unix#2544073: sending OFPBMC_BAD_LEN > error reply to OFPT_BUNDLE_ADD_MESSAGE message > > Thanks > > Tiago Pires > Hi Tiago, AFAIR this was happening with a large number of LBs that had "hairpin_snat_ip" set. For reproduction I was doing some manual tinkering, unfortunately I don't remember the details. I hope this helps at least a little bit. Best regards, Ales > > On Thu, Sep 14, 2023 at 5:41 PM Mark Michelson <[email protected]> > wrote: > > > > Thanks for the updates, Ales. > > > > I pushed this change to main and all branches back to 22.03. > > > > On 9/14/23 10:13, Ales Musil wrote: > > > If the flow size is bigger than UINT16_MAX it doesn't > > > fit into openflow message. Programming of such flow will > > > fail which results in disconnect of ofctrl. After reconnect > > > we program everything from scratch, in case the long flow still > > > remains the cycle continues. This causes the node to be almost > > > unusable as there will be massive traffic disruptions. > > > > > > To prevent that check if the flow is within the allowed size. > > > If not log the flow that would trigger this problem and do not program > > > it. This isn't a self-healing process, but the chance of this happening > > > are very slim. Also, any flow that is bigger than allowed size is OVN > > > bug, and it should be fixed. > > > > > > Reported-at: https://bugzilla.redhat.com/1955167 > > > Signed-off-by: Ales Musil <[email protected]> > > > --- > > > v2: Fix the formatting error. > > > v3: Rebase on top of current main. > > > Add coverage counter for the too long error. > > > --- > > > controller/ofctrl.c | 45 > +++++++++++++++++++++++++++++++++++++++++---- > > > 1 file changed, 41 insertions(+), 4 deletions(-) > > > > > > diff --git a/controller/ofctrl.c b/controller/ofctrl.c > > > index 64a444ff6..a8cda4fe8 100644 > > > --- a/controller/ofctrl.c > > > +++ b/controller/ofctrl.c > > > @@ -16,6 +16,7 @@ > > > #include <config.h> > > > #include "bitmap.h" > > > #include "byte-order.h" > > > +#include "coverage.h" > > > #include "dirs.h" > > > #include "dp-packet.h" > > > #include "flow.h" > > > @@ -55,6 +56,8 @@ > > > > > > VLOG_DEFINE_THIS_MODULE(ofctrl); > > > > > > +COVERAGE_DEFINE(ofctrl_msg_too_long); > > > + > > > /* An OpenFlow flow. */ > > > struct ovn_flow { > > > /* Key. */ > > > @@ -1769,6 +1772,18 @@ ovn_flow_log(const struct ovn_flow *f, const > char *action) > > > } > > > } > > > > > > +static void > > > +ovn_flow_log_size_err(const struct ovn_flow *f) > > > +{ > > > + COVERAGE_INC(ofctrl_msg_too_long); > > > + > > > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); > > > + > > > + char *s = ovn_flow_to_string(f); > > > + VLOG_ERR_RL(&rl, "The FLOW_MOD message is too big: %s", s); > > > + free(s); > > > +} > > > + > > > static void > > > ovn_flow_uninit(struct ovn_flow *f) > > > { > > > @@ -1888,15 +1903,27 @@ encode_bundle_add(struct ofpbuf *msg, struct > ofputil_bundle_ctrl_msg *bc) > > > return ofputil_encode_bundle_add(OFP15_VERSION, &bam); > > > } > > > > > > -static void > > > +static bool > > > add_flow_mod(struct ofputil_flow_mod *fm, > > > struct ofputil_bundle_ctrl_msg *bc, > > > struct ovs_list *msgs) > > > { > > > struct ofpbuf *msg = encode_flow_mod(fm); > > > struct ofpbuf *bundle_msg = encode_bundle_add(msg, bc); > > > + > > > + uint32_t flow_mod_len = msg->size; > > > + uint32_t bundle_len = bundle_msg->size; > > > + > > > ofpbuf_delete(msg); > > > + > > > + if (flow_mod_len > UINT16_MAX || bundle_len > UINT16_MAX) { > > > + ofpbuf_delete(bundle_msg); > > > + > > > + return false; > > > + } > > > + > > > ovs_list_push_back(msgs, &bundle_msg->list_node); > > > + return true; > > > } > > > > > > /* group_table. */ > > > @@ -2235,7 +2262,10 @@ installed_flow_add(struct ovn_flow *d, > > > .new_cookie = htonll(d->cookie), > > > .command = OFPFC_ADD, > > > }; > > > - add_flow_mod(&fm, bc, msgs); > > > + > > > + if (!add_flow_mod(&fm, bc, msgs)) { > > > + ovn_flow_log_size_err(d); > > > + } > > > } > > > > > > static void > > > @@ -2259,7 +2289,7 @@ installed_flow_mod(struct ovn_flow *i, struct > ovn_flow *d, > > > /* Use OFPFC_ADD so that cookie can be updated. */ > > > fm.command = OFPFC_ADD; > > > } > > > - add_flow_mod(&fm, bc, msgs); > > > + bool result = add_flow_mod(&fm, bc, msgs); > > > > > > /* Replace 'i''s actions and cookie by 'd''s. */ > > > mem_stats.installed_flow_usage -= i->ofpacts_len - > d->ofpacts_len; > > > @@ -2267,6 +2297,10 @@ installed_flow_mod(struct ovn_flow *i, struct > ovn_flow *d, > > > i->ofpacts = xmemdup(d->ofpacts, d->ofpacts_len); > > > i->ofpacts_len = d->ofpacts_len; > > > i->cookie = d->cookie; > > > + > > > + if (!result) { > > > + ovn_flow_log_size_err(i); > > > + } > > > } > > > > > > static void > > > @@ -2280,7 +2314,10 @@ installed_flow_del(struct ovn_flow *i, > > > .table_id = i->table_id, > > > .command = OFPFC_DELETE_STRICT, > > > }; > > > - add_flow_mod(&fm, bc, msgs); > > > + > > > + if (!add_flow_mod(&fm, bc, msgs)) { > > > + ovn_flow_log_size_err(i); > > > + } > > > } > > > > > > static void > > > > _______________________________________________ > > dev mailing list > > [email protected] > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > -- > > > > > _‘Esta mensagem é direcionada apenas para os endereços constantes no > cabeçalho inicial. Se você não está listado nos endereços constantes no > cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa > mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas > estão > imediatamente anuladas e proibidas’._ > > > * **‘Apesar do Magazine Luiza tomar > todas as precauções razoáveis para assegurar que nenhum vírus esteja > presente nesse e-mail, a empresa não poderá aceitar a responsabilidade por > quaisquer perdas ou danos causados por esse e-mail ou por seus anexos’.* > > > > -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
