On Sun, Aug 21, 2022 at 12:34 AM Peng He <[email protected]> wrote: > > Hi Mike > > Intuitionally, I do not think it is a good idea.
I agree with you, that change is really a sledge hammer. I'll collect some telemetry from the Massif check to see why exactly the race condition is occurring. Cheers, M > > if src->ukey_state = DELETED, but dst is to bring it back to ALIVE (some > state), > then this is a bug, because it breaks the state machine behind. > > I think the checking we can relax here is only when dst and ukey stat is in > EVICTED and DELETED, > and this is only because specific state machine we are using there. > > transition_ukey should be as simple as possible, not related to any specific > state machine used. > > > Mike Pattrick <[email protected]>于2022年8月16日 周二22:05写道: >> >> On Tue, Aug 16, 2022 at 4:04 AM Peng He <[email protected]> wrote: >> > >> > Hi, Mike >> > >> > by changing it into the function, you mean relax the checking if the state >> > is >> > either EVICTED or DELETED? >> > >> >> In my testing code, I changed it to this: >> >> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c >> index 57f94df54..224bdfbb8 100644 >> --- a/ofproto/ofproto-dpif-upcall.c >> +++ b/ofproto/ofproto-dpif-upcall.c >> @@ -1937,6 +1937,12 @@ transition_ukey_at(struct udpif_key *ukey, enum >> ukey_state dst, >> OVS_REQUIRES(ukey->mutex) >> { >> if (dst < ukey->state) { >> + if (ukey->state >= UKEY_EVICTED) { >> + VLOG_DBG("Possible race condition on ukey transition %d->%d " >> + "(last transitioned from thread %u at %s)"ukey->state, >> + dst, ukey->state_thread, ukey->state_where); >> + return; >> + } >> VLOG_ABORT("Invalid ukey transition %d->%d (last transitioned from " >> "thread %u at %s)", ukey->state, dst, ukey->state_thread, >> ukey->state_where); >> >> >> > >> > Mike Pattrick <[email protected]> 于2022年8月15日周一 20:16写道: >> >> >> >> On Sat, Jun 4, 2022 at 11:19 AM Peng He <[email protected]> wrote: >> >> > >> >> > There is a race condition between the revalidator threads and >> >> > the handler/pmd threads. >> >> > >> >> > revalidator PMD threads >> >> > push_dp_ops deletes a key and tries >> >> > to del the dp magaflow. >> >> > does the upcall, generates a new >> >> > ukey, >> >> > and replaces the old ukey, now the >> >> > old >> >> > ukey state is UKEY_DELETED >> >> > >> >> > dp_ops succeeds, tries to change >> >> > the old ukey's state into >> >> > UKEY_EVICTED, however, the old >> >> > ukey's state is already UKEY_DELETED, >> >> > so OVS aborts. >> >> > >> >> > I did not observe this in the real environment, as it takes time for >> >> > PMDs to finish the upcall and replace the old ukeys. Normally, the >> >> > revalidator will change ukey state into UKEY_EVICTED first. >> >> > But it's better to cover this case. >> >> >> >> Hello Peng, >> >> >> >> I've run into a similar issue by running OVS under Massif, but with a >> >> different caller. >> >> >> >> Do you think it makes sense to move this change into transition_ukey() ? >> >> >> >> >> >> Cheers, >> >> M >> >> >> >> > >> >> > Signed-off-by: Peng He <[email protected]> >> >> > --- >> >> > ofproto/ofproto-dpif-upcall.c | 8 ++++++-- >> >> > 1 file changed, 6 insertions(+), 2 deletions(-) >> >> > >> >> > diff --git a/ofproto/ofproto-dpif-upcall.c >> >> > b/ofproto/ofproto-dpif-upcall.c >> >> > index 4f68f14f2..9c55e43c1 100644 >> >> > --- a/ofproto/ofproto-dpif-upcall.c >> >> > +++ b/ofproto/ofproto-dpif-upcall.c >> >> > @@ -2411,7 +2411,9 @@ push_dp_ops(struct udpif *udpif, struct ukey_op >> >> > *ops, size_t n_ops) >> >> > if (op->dop.error) { >> >> > if (op->ukey) { >> >> > ovs_mutex_lock(&op->ukey->mutex); >> >> > - transition_ukey(op->ukey, UKEY_EVICTED); >> >> > + if (op->ukey->state < UKEY_EVICTED) { >> >> > + transition_ukey(op->ukey, UKEY_EVICTED); >> >> > + } >> >> > ovs_mutex_unlock(&op->ukey->mutex); >> >> > } >> >> > /* if it's a flow_del error, 'stats' is unusable, it's ok >> >> > @@ -2432,7 +2434,9 @@ push_dp_ops(struct udpif *udpif, struct ukey_op >> >> > *ops, size_t n_ops) >> >> > >> >> > if (op->ukey) { >> >> > ovs_mutex_lock(&op->ukey->mutex); >> >> > - transition_ukey(op->ukey, UKEY_EVICTED); >> >> > + if (op->ukey->state < UKEY_EVICTED) { >> >> > + transition_ukey(op->ukey, UKEY_EVICTED); >> >> > + } >> >> > push->used = MAX(stats->used, op->ukey->stats.used); >> >> > push->tcp_flags = stats->tcp_flags | >> >> > op->ukey->stats.tcp_flags; >> >> > push->n_packets = stats->n_packets - >> >> > op->ukey->stats.n_packets; >> >> > -- >> >> > 2.25.1 >> >> > >> >> > _______________________________________________ >> >> > dev mailing list >> >> > [email protected] >> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> > >> >> >> > >> > >> > -- >> > hepeng >> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
