Ah, that's a good point I had missed. I think the logic is fine as is then. I think I'll add a comment to that affect at the beginning of the function.
Ethan On Tue, Aug 11, 2015 at 1:51 PM, Joe Stringer <joestrin...@nicira.com> wrote: > FWIW the "slice" logic in the for loop is meant to ensure that no > revalidator threads concurrently clean the same map. > > There's two cases where the main thread might call into this as well > though, one is when reconfiguring threads (which shouldn't conflict, > because it stops all revalidators first), and the ovs-appctl > revalidator/purge case, which could hit something like this, but I > don't think there is a case where we use this functionality while OVS > is under load, so the likelyhood of problems is low. > > On 11 August 2015 at 12:18, Ethan Jackson <et...@nicira.com> wrote: >> Yeah sorry this patch is not great. That second problem is solved in >> the follow on. I'd be somewhat inclined to drop this one all >> together. >> >> Ethan >> >> On Tue, Aug 11, 2015 at 11:15 AM, Jarno Rajahalme <jrajaha...@nicira.com> >> wrote: >>> >>>> On Aug 11, 2015, at 10:42 AM, Jarno Rajahalme <jrajaha...@nicira.com> >>>> wrote: >>>> >>>> >>>>> On Aug 10, 2015, at 6:46 PM, Ethan J. Jackson <et...@nicira.com> wrote: >>>>> >>>>> From: Ethan Jackson <et...@nicira.com> >>>>> >>>>> Since revalidator_sweep() doesn't hold the ukey mutex for each full >>>>> loop iteration, it's theoretically possible that two threads may >>>>> call ukey_delete() on the same ukey. If this happens, they both will >>>>> attempt to remove the ukey from he cmap, causing the loser of the race >>>>> to fail. >>>>> >>>>> Signed-off-by: Ethan J. Jackson <et...@nicira.com> >>>>> --- >>>>> ofproto/ofproto-dpif-upcall.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c >>>>> index 0f2e186..fddb535 100644 >>>>> --- a/ofproto/ofproto-dpif-upcall.c >>>>> +++ b/ofproto/ofproto-dpif-upcall.c >>>>> @@ -2076,7 +2076,6 @@ revalidator_sweep__(struct revalidator >>>>> *revalidator, bool purge) >>>>> flow_exists = ukey->flow_exists; >>>>> seq_mismatch = (ukey->dump_seq != dump_seq >>>>> && ukey->reval_seq != reval_seq); >>>>> - ovs_mutex_unlock(&ukey->mutex); >>>>> >>>>> if (flow_exists >>>>> && (purge >>>> >>>> There is a call to push_ukey_ops() between these blocks. It calls >>>> push_ukey_ops__(), which will take a lock on the ukeys, including the last >>>> one added and still locked, so this would result in double locking. >>>> >>> >>> I missed the fact that handle_missed_revalidation() in here also takes the >>> ukey lock. >>> >>> Jarno >>> >>>> This could be resolved by moving the “ if (n_ops == REVALIDATE_MAX_BATCH)” >>>> block after the (moved) ukey unlock, similarly to the remainder ops >>>> handling at the end. However, I’m not sure if this would defeat the >>>> purpose of this patch. >>>> >>>> Also, it seems that generally the umap lock is taken first and ukey locks >>>> after, so there is a possibility of a deadlock if the locks are taken in >>>> the reverse order, like in this patch. >>>> >>>> It would be really great if we could employ clang thread safety analysis >>>> more than we do in this file currently, but I’m not sure if that is >>>> possible. >>>> >>>> Jarno >>>> >>>>> @@ -2095,6 +2094,7 @@ revalidator_sweep__(struct revalidator >>>>> *revalidator, bool purge) >>>>> ukey_delete(umap, ukey); >>>>> ovs_mutex_unlock(&umap->mutex); >>>>> } >>>>> + ovs_mutex_unlock(&ukey->mutex); >>>>> } >>>>> >>>>> if (n_ops) { >>>>> -- >>>>> 2.1.0 >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> dev@openvswitch.org >>>>> http://openvswitch.org/mailman/listinfo/dev >>>> >>> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev