On Wed, Jan 24, 2024 at 10:53 PM Han Zhou <hz...@ovn.org> wrote:
>
> On Wed, Jan 24, 2024 at 4:23 AM Dumitru Ceara <dce...@redhat.com> wrote:
> >
> > On 1/24/24 06:01, Han Zhou wrote:
> > > On Fri, Jan 19, 2024 at 2:50 AM Dumitru Ceara <dce...@redhat.com> wrote:
> > >>
> > >> On 1/11/24 16:31, num...@ovn.org wrote:
> > >>> +
> > >>> +void
> > >>> +lflow_table_add_lflow(struct lflow_table *lflow_table,
> > >>> +                      const struct ovn_datapath *od,
> > >>> +                      const unsigned long *dp_bitmap, size_t
> > > dp_bitmap_len,
> > >>> +                      enum ovn_stage stage, uint16_t priority,
> > >>> +                      const char *match, const char *actions,
> > >>> +                      const char *io_port, const char *ctrl_meter,
> > >>> +                      const struct ovsdb_idl_row *stage_hint,
> > >>> +                      const char *where,
> > >>> +                      struct lflow_ref *lflow_ref)
> > >>> +    OVS_EXCLUDED(fake_hash_mutex)
> > >>> +{
> > >>> +    struct ovs_mutex *hash_lock;
> > >>> +    uint32_t hash;
> > >>> +
> > >>> +    ovs_assert(!od ||
> > >>> +               ovn_stage_to_datapath_type(stage) ==
> > > ovn_datapath_get_type(od));
> > >>> +
> > >>> +    hash = ovn_logical_flow_hash(ovn_stage_get_table(stage),
> > >>> +                                 ovn_stage_get_pipeline(stage),
> > >>> +                                 priority, match,
> > >>> +                                 actions);
> > >>> +
> > >>> +    hash_lock = lflow_hash_lock(&lflow_table->entries, hash);
> > >>> +    struct ovn_lflow *lflow =
> > >>> +        do_ovn_lflow_add(lflow_table, od, dp_bitmap,
> > >>> +                         dp_bitmap_len, hash, stage,
> > >>> +                         priority, match, actions,
> > >>> +                         io_port, ctrl_meter, stage_hint, where);
> > >>> +
> > >>> +    if (lflow_ref) {
> > >>> +        /* lflow referencing is only supported if 'od' is not NULL.
> */
> > >>> +        ovs_assert(od);
> > >>> +
> > >>> +        struct lflow_ref_node *lrn =
> > >>> +            lflow_ref_node_find(&lflow_ref->lflow_ref_nodes, lflow,
> > > hash);
> > >>> +        if (!lrn) {
> > >>> +            lrn = xzalloc(sizeof *lrn);
> > >>> +            lrn->lflow = lflow;
> > >>> +            lrn->dp_index = od->index;
> > >>> +            ovs_list_insert(&lflow_ref->lflows_ref_list,
> > >>> +                            &lrn->lflow_list_node);
> > >>> +            inc_dp_refcnt(&lflow->dp_refcnts_map, lrn->dp_index);
> > >>> +            ovs_list_insert(&lflow->referenced_by,
> > > &lrn->ref_list_node);
> > >>> +
> > >>> +            hmap_insert(&lflow_ref->lflow_ref_nodes, &lrn->ref_node,
> > > hash);
> > >>> +        }
> > >>> +
> > >>> +        lrn->linked = true;
> > >>> +    }
> > >>> +
> > >>> +    lflow_hash_unlock(hash_lock);
> > >>> +
> > >>> +}
> > >>> +
> > >>
> > >> This part is not thread safe.
> > >>
> > >> If two threads try to add logical flows that have different hashes and
> > >> lflow_ref is not NULL we're going to have a race condition when
> > >> inserting to the &lflow_ref->lflow_ref_nodes hash map because the two
> > >> threads will take different locks.
> > >>
> > >
> > > I think it is safe because a lflow_ref is always associated with an
> object,
> > > e.g. port, datapath, lb, etc., and lflow generation for a single such
> > > object is never executed in parallel, which is how the parallel lflow
> build
> > > is designed.
> > > Does it make sense?
> >
> > It happens that it's safe in this current patch set because indeed we
> > always process individual ports, datapaths, lbs, etc, in the same
> > thread.  However, this code (lflow_table_add_lflow()) is generic and
> > there's nothing (not even a comment) that would warn developers in the
> > future about the potential race if the lflow_ref is shared.
> >
> > I spoke to Numan offline a bit about this and I think the current plan
> > is to leave it as is and add proper locking as a follow up (or in v7).
> > But I think we still need a clear comment here warning users about this.
> >  Maybe we should add a comment where the lflow_ref structure is defined
> too.
> >
> > What do you think?
>
> I totally agree with you about adding comments to explain the thread safety
> considerations, and make it clear that the lflow_ref should always be
> associated with the object that is being processed by the thread.
> With regard to any follow up change for proper locking, I am not sure what
> scenario is your concern. I think if we always make sure the lflow_ref
> passed in is the one owned by the object then the current locking is
> sufficient. And this is the natural case.
>
> However, I did think about a situation where there might be a potential
> problem in the future when we need to maintain references for more than one
> input for the same lflow. For the "main" input, which is the object that
> the thread is iterating and generating lflow for, will have its lflow_ref
> passed in this function, but we might also need to maintain the reference
> of the lflow for a secondary input (or even third). In that case it is not
> just the locking issue, but firstly we need to have a proper way to pass in
> the secondary lflow_ref, which is what I had mentioned in the review
> comments for v3:
> https://mail.openvswitch.org/pipermail/ovs-dev/2023-December/410269.html.
> (the last paragraph). Regardless of that, it is not a problem for now, and
> I hope there is no need to add references for more inputs for the I-P that
> matters for production use cases.
>

I'll update the next version with the proper documentation.

@Han Regarding your comment that we may in the future need to add a
logical flow to two or more lflow_refs, I doubt we will have such a
requirement or scenario.

Because calling lflow_table_add_lflow(..., lflow_ref1, lflow_ref2)
is the same as calling lflow_table_add_lflow() twice, i.e.

lflow_table_add_lflow(..., lflow_ref1)
lflow_table_add_lflow(..., lflow_ref2)

In the second call, since the ovn_lflow already exists in the lflow
table, it will just be referenced in lflow_ref2.  It would certainly be
a problem if these lflow_refs (ref1 and ref2) belong to different
objects.
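To illustrate the point, the find-or-insert behavior can be sketched
with simplified stand-in structures.  This is a hypothetical sketch: it
uses a plain linked list instead of OVN's hmap, and ref_add_lflow() and
the struct layouts below are illustrative only, not the actual OVN
implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for ovn_lflow references: an lflow_ref holds one
 * ref_node per distinct logical flow it references. */
struct ref_node {
    const void *lflow;          /* Identity of the logical flow. */
    struct ref_node *next;
};

struct lflow_ref {
    struct ref_node *head;
    size_t n_nodes;
};

/* Mirrors the find-or-insert logic in lflow_table_add_lflow(): adding
 * the same lflow to the same ref a second time finds the existing node
 * and is effectively a no-op. */
static void
ref_add_lflow(struct lflow_ref *ref, const void *lflow)
{
    for (struct ref_node *n = ref->head; n; n = n->next) {
        if (n->lflow == lflow) {
            return;             /* Already referenced: nothing to do. */
        }
    }
    struct ref_node *n = calloc(1, sizeof *n);
    n->lflow = lflow;
    n->next = ref->head;
    ref->head = n;
    ref->n_nodes++;
}
```

So two sequential calls with different refs simply end up with one
ref_node in each ref, pointing at the same flow.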

However, I do see scenarios where lflow_table_add_lflow() could be
called with a different object's lflow_ref.

A few scenarios I can think of:

1.  While generating logical flows for a logical switch port (of type
router) 'P1', a contributor may (most likely by mistake) call something
like

lflow_table_add_lflow(....,  P1->peer->lflow_ref)

Even in this case, we don't generate logical router port flows in
parallel with logical switch port flows, so P1->peer->lflow_ref would
not be modified by multiple threads.  But it's a possibility, and I
think Dumitru is perhaps concerned about such scenarios.

2.  While generating load balancer flows, we may also generate logical
flows for the routers and store them in od->lflow_ref (if we happen to
support I-P for router ports), i.e.

     HMAP_FOR_EACH_PARALLEL (lb, lb_maps) {
         generate_lflows_for_lb(lb, lb->lflow_ref);
         FOR_EACH_ROUTER_OF_LB(lr, lb) {
             generate_lb_lflows_for_router(lr, lr->lflow_ref);
         }
     }

     In this case, it is possible that a logical router 'lr' is
associated with multiple lbs, and concurrent updates from multiple
threads may corrupt lr->lflow_ref.


We have 2 choices (after this patch series is accepted :))

  a.  Don't do anything.  The documentation added by this patch series
(which I'll update in the next version) is good enough, and whenever
such scenarios arise, the contributor will add the thread-safe code or
make sure not to use the lflow_ref of other objects while generating
lflows.  In scenario 2 above, the contributor should not generate
lflows for the routers in the lb_maps loop; instead, they should be
generated in the lr_datapaths loop or lr_stateful loop.

  b.  As a follow-up patch, make lflow_table_add_lflow() thread-safe
for lflow_ref so that we are covered for future scenarios.  The
downside is that we would have to maintain hash locks for each
lflow_ref and reconcile the lflow_ref hmaps after all the threads
finish lflow generation.  That incurs some cost (both memory and CPU
wise), but only during recomputes, which should not be too frequent.
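For reference, the hash-lock scheme in choice (b) could look roughly
like the sketch below, loosely modeled on the existing
LFLOW_HASH_LOCK_MASK buckets in the lflow table.  All names here
(REF_HASH_LOCK_MASK, ref_hash_lock(), ref_insert()) and the bucket
count are hypothetical, not actual OVN code; the per-bucket counter
just stands in for insertion into lflow_ref->lflow_ref_nodes.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket count for per-lflow_ref hash locks. */
#define REF_HASH_LOCK_MASK 0x3f   /* 64 buckets. */

static pthread_mutex_t ref_hash_locks[REF_HASH_LOCK_MASK + 1];
static int bucket_counts[REF_HASH_LOCK_MASK + 1];

static void
ref_hash_locks_init(void)
{
    for (size_t i = 0; i <= REF_HASH_LOCK_MASK; i++) {
        pthread_mutex_init(&ref_hash_locks[i], NULL);
    }
}

/* Pick the lock for 'hash' and take it; the same hash (and any hash
 * that collides into the same bucket) always maps to the same mutex,
 * so concurrent inserts for the same bucket serialize. */
static pthread_mutex_t *
ref_hash_lock(uint32_t hash)
{
    pthread_mutex_t *lock = &ref_hash_locks[hash & REF_HASH_LOCK_MASK];
    pthread_mutex_lock(lock);
    return lock;
}

static void
ref_hash_unlock(pthread_mutex_t *lock)
{
    pthread_mutex_unlock(lock);
}

/* Stand-in for inserting a ref node: the bucket counter update is the
 * critical section that the hash lock protects. */
static void
ref_insert(uint32_t hash)
{
    pthread_mutex_t *lock = ref_hash_lock(hash);
    bucket_counts[hash & REF_HASH_LOCK_MASK]++;
    ref_hash_unlock(lock);
}
```

The reconciliation pass mentioned above would then merge the per-bucket
state back into the lflow_ref hmap once all threads are done.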


Let me know what you think.  I'm fine with both, but personally prefer
(a) as I don't see such scenarios in the near future.


Thanks
Numan




> Thanks,
> Han
>
> >
> > Regards,
> > Dumitru
> >
> > >
> > > Thanks,
> > > Han
> > >
> > >> That might corrupt the map.
> > >>
> > >> I guess, if we don't want to cause more performance degradation we
> > >> should maintain as many lflow_ref instances as we do hash_locks, i.e.,
> > >> LFLOW_HASH_LOCK_MASK + 1.  Will that even be possible?
> > >>
> > >> Wdyt?
> > >>
> > >> Regards,
> > >> Dumitru
> > >>
> > >> _______________________________________________
> > >> dev mailing list
> > >> d...@openvswitch.org
> > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> >