Hi everyone,
I have been working on implementing incremental processing in OVN-IC and
encountered a design issue regarding how OVN-IC handles multi-AZ writes.
The Issue
In a scenario where multiple AZs are connected via OVN-IC, certain events
trigger all AZs to attempt writing the same data to the ISB/INB
simultaneously. This race condition leads to a constraint violation, which
causes the transaction to fail and forces a full recompute.
Example:
A clear example of this can be seen in ovn-ic.c:ts_run:
    if (ctx->ovnisb_txn) {
        /* Create ISB Datapath_Binding */
        ICNBREC_TRANSIT_SWITCH_FOR_EACH (ts, ctx->ovninb_idl) {
            const struct icsbrec_datapath_binding *isb_dp =
                shash_find_and_delete(isb_ts_dps, ts->name);
            if (!isb_dp) {
                /* Allocate tunnel key */
                int64_t dp_key = allocate_dp_key(dp_tnlids, vxlan_mode,
                                                 "transit switch datapath");
                if (!dp_key) {
                    continue;
                }
                isb_dp = icsbrec_datapath_binding_insert(ctx->ovnisb_txn);
                icsbrec_datapath_binding_set_transit_switch(isb_dp,
                                                            ts->name);
                icsbrec_datapath_binding_set_tunnel_key(isb_dp, dp_key);
            } else if (dp_key_refresh) {
                /* Refresh tunnel key since encap mode has changed. */
                int64_t dp_key = allocate_dp_key(dp_tnlids, vxlan_mode,
                                                 "transit switch datapath");
                if (dp_key) {
                    icsbrec_datapath_binding_set_tunnel_key(isb_dp, dp_key);
                }
            }
            if (!isb_dp->type) {
                icsbrec_datapath_binding_set_type(isb_dp, "transit-switch");
            }
            if (!isb_dp->nb_ic_uuid) {
                icsbrec_datapath_binding_set_nb_ic_uuid(isb_dp,
                                                        &ts->header_.uuid,
                                                        1);
            }
        }
        struct shash_node *node;
        SHASH_FOR_EACH (node, isb_ts_dps) {
            icsbrec_datapath_binding_delete(node->data);
        }
    }
When a new transit-switch is created, every AZ attempts to create the same
datapath_binding on the ISB. Only one request succeeds; the others fail
with a "constraint-violation."
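To make the failure mode concrete, here is a minimal, self-contained model of the race. It is only a sketch: `toy_table`, `toy_commit_insert` and `toy_race` are hypothetical names, not part of ovn-ic or the OVSDB IDL, and the uniqueness check is a stand-in for whatever constraint the ISB actually enforces on the row.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ROWS 16

/* Hypothetical stand-in for an ISB table that rejects duplicate keys. */
struct toy_table {
    const char *keys[TOY_MAX_ROWS];
    size_t n_rows;
};

/* Tries to commit an insert of 'key'.  Returns false when the key is
 * already present (the simulated constraint violation) or the table is
 * full. */
static bool
toy_commit_insert(struct toy_table *t, const char *key)
{
    for (size_t i = 0; i < t->n_rows; i++) {
        if (!strcmp(t->keys[i], key)) {
            return false;
        }
    }
    if (t->n_rows == TOY_MAX_ROWS) {
        return false;
    }
    t->keys[t->n_rows++] = key;
    return true;
}

/* Models 'n_azs' AZs that all built their transaction from the same
 * snapshot, in which 'key' was missing, so every one of them tries to
 * insert it.  Returns the number of commits that failed. */
static size_t
toy_race(struct toy_table *t, const char *key, size_t n_azs)
{
    size_t n_failed = 0;

    for (size_t az = 0; az < n_azs; az++) {
        if (!toy_commit_insert(t, key)) {
            n_failed++;
        }
    }
    return n_failed;
}
```

With three AZs racing on one new transit switch, `toy_race()` reports two failed commits and the table ends up with a single row, which matches the behavior described above.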
Impact:
This behavior negates the performance benefits of implementing incremental
processing, as the system falls back to a full recompute upon these
failures.
For development purposes, I am currently ignoring these errors. Ideally,
only a single AZ would handle these writes, but that would require
implementing some form of consensus protocol.
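As a rough illustration of the single-writer idea, one lighter-weight alternative to a full consensus protocol could be a deterministic rule that every AZ evaluates locally, for example "the AZ with the lexicographically smallest name writes". The helper below is purely hypothetical (`az_is_designated_writer` does not exist in ovn-ic), and it assumes each AZ can obtain the list of registered AZ names from somewhere.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ovn-ic.
 *
 * Returns true if 'own_az' appears in 'az_names' and no other name in
 * the list sorts before it, that is, if this AZ is the one designated
 * to perform the shared writes.  Every AZ that sees the same list
 * reaches the same answer without exchanging any messages. */
static bool
az_is_designated_writer(const char *own_az,
                        const char *const az_names[], size_t n_azs)
{
    bool found_self = false;

    for (size_t i = 0; i < n_azs; i++) {
        int cmp = strcmp(az_names[i], own_az);

        if (cmp < 0) {
            /* Some other AZ sorts first, so it is the writer. */
            return false;
        }
        if (cmp == 0) {
            found_self = true;
        }
    }
    return found_self;
}
```

This is not consensus: AZs that briefly see different lists can still disagree, so the constraint violation would still need to be tolerated, and nothing is written while the designated AZ is down.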
Does anyone have any advice on how we can fix this issue?
Thanks,
Tiago Matos