On Fri, Feb 10, 2023 at 12:30:58PM +0100, Adrian Moreno wrote:
> 
> 
> On 1/25/23 16:35, Simon Horman wrote:
> > On Tue, Jan 24, 2023 at 08:21:28PM +0100, Adrián Moreno wrote:
> > > From: Adrian Moreno <[email protected]>
> > > 
> > > IPFIX templates have to be sent for each Observation Domain ID.
> > > Currently, a timer is kept at each dpif_ipfix_exporter to send them.
> > > This works fine for per-bridge sampling where there is only one
> > > Observation Domain ID per exporter. However, this does not work for
> > > per-flow sampling, where more than one Observation Domain ID can be
> > > specified by the controller. In this case, ovs-vswitchd will only send
> > > template information for one (arbitrary) Domain ID.
> > > 
> > > Fix per-flow sampling by using an hmap to keep a timer for each
> > > Observation Domain ID.
> > > 
> > > Signed-off-by: Adrian Moreno <[email protected]>
> > > 
> > > ---
> > > - v2:
> > >    - Fixed a potential race condition in unit test.
> > > - v1:
> > >    - Added unit test.
> > >    - Drop domain_id from "struct dpif_ipfix_domain" since the hashmap
> > >    already uses it as index.
> > >    - Added OVS_REQUIRES(mutex) to dpif_ipfix_exporter_find_domain.
> > > ---
> > >   ofproto/ofproto-dpif-ipfix.c | 127 ++++++++++++++++++++++++++++-------
> > >   tests/system-traffic.at      |  52 ++++++++++++++
> > >   2 files changed, 156 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
> > > index 742eed399..fa71fda0f 100644
> > > --- a/ofproto/ofproto-dpif-ipfix.c
> > > +++ b/ofproto/ofproto-dpif-ipfix.c
> > > @@ -124,11 +124,18 @@ struct dpif_ipfix_port {
> > >       uint32_t ifindex;
> > >   };
> > > +struct dpif_ipfix_domain {
> > > +    struct hmap_node hmap_node; /* In struct dpif_ipfix_exporter's domains. */
> > > +    time_t last_template_set_time;
> > > +};
> > > +
> > >   struct dpif_ipfix_exporter {
> > >       uint32_t exporter_id; /* Exporting Process identifier */
> > >       struct collectors *collectors;
> > >       uint32_t seq_number;
> > > -    time_t last_template_set_time;
> > > +    struct hmap domains; /* Contains struct dpif_ipfix_domain indexed by
> > > +                            observation domain id. */
> > > +    time_t last_stats_sent_time;
> > >       struct hmap cache_flow_key_map;  /* ipfix_flow_cache_entry. */
> > >       struct ovs_list cache_flow_start_timestamp_list;  /* ipfix_flow_cache_entry. */
> > >       uint32_t cache_active_timeout;  /* In seconds. */
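[Editor's note: the hunk above only shows the new struct; as a minimal, self-contained sketch of the per-domain timer idea, the following uses a plain linked list and hypothetical names (`exporter_get_domain`, `domain_template_due`) in place of the real OVS hmap-based code, so it is an illustration of the technique, not the patch itself.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* One timer per Observation Domain ID, so templates are re-sent for
 * every domain instead of sharing a single exporter-wide timer. */
struct ipfix_domain {
    uint32_t domain_id;
    time_t last_template_set_time;
    struct ipfix_domain *next;
};

struct exporter {
    struct ipfix_domain *domains;   /* Singly linked list for illustration;
                                     * the patch uses a struct hmap. */
};

static struct ipfix_domain *
exporter_find_domain(struct exporter *e, uint32_t domain_id)
{
    for (struct ipfix_domain *d = e->domains; d; d = d->next) {
        if (d->domain_id == domain_id) {
            return d;
        }
    }
    return NULL;
}

/* Returns the entry for 'domain_id', creating it on first use. */
static struct ipfix_domain *
exporter_get_domain(struct exporter *e, uint32_t domain_id)
{
    struct ipfix_domain *d = exporter_find_domain(e, domain_id);
    if (!d) {
        d = calloc(1, sizeof *d);
        d->domain_id = domain_id;
        d->next = e->domains;
        e->domains = d;
    }
    return d;
}

/* Decide whether templates must be (re)sent for this domain, updating
 * the per-domain timestamp when they are. */
static bool
domain_template_due(struct ipfix_domain *d, time_t now, time_t interval)
{
    if (now >= d->last_template_set_time + interval) {
        d->last_template_set_time = now;
        return true;
    }
    return false;
}
```

The key point is that each domain's `last_template_set_time` advances independently, so a quiet domain still gets its templates refreshed on schedule.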
> > 
> > I'm not sure if it is important, but this change bumps the
> > cache_flow_key_map from the 1st to 2nd cacheline (on x86_64).
> > 
> > This can be avoided by moving seq_number above collectors,
> > as there is a 32-bit hole there.
> > 
> 
> Thanks very much for looking at this Simon.
> 
> Your proposal makes total sense. We should not degrade cache efficiency when
> it's so easily avoidable. I'll add it to the patch.
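[Editor's note: the hole Simon refers to can be shown with a self-contained sketch; `void *` stands in for `struct collectors *`, and the commented offsets assume LP64/x86_64.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Current ordering: a 32-bit member followed by an 8-byte-aligned
 * pointer leaves a 4-byte hole before the pointer. */
struct exporter_before {
    uint32_t exporter_id;   /* offset 0; 4-byte hole follows */
    void *collectors;       /* offset 8 */
    uint32_t seq_number;    /* offset 16; tail padding to 24 */
};

/* Suggested ordering: seq_number moves up and fills the hole, so the
 * members that follow start one word earlier. */
struct exporter_after {
    uint32_t exporter_id;   /* offset 0 */
    uint32_t seq_number;    /* offset 4, filling the hole */
    void *collectors;       /* offset 8 */
};
```

This is how moving `seq_number` above `collectors` keeps `cache_flow_key_map` on the first cacheline: the 8 bytes saved shift every later member up.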

Likewise, thanks.

> > Possibly there is further scope for making this structure more
> > cache-friendly. But again, I'm not sure if it is important.
> > 
> Looking at further optimizing this structure, I don't think there's much
> room for improvement.

That may well be the case.
TBH, I didn't look for further optimisations particularly closely.

> The operation we'd like to optimize is ipfix_cache_update(), which also
> uses "ofproto_stats"; that spans over 2 cache lines.
> I think the best we can do is:
> 
> struct dpif_ipfix_exporter {
>         struct hmap                cache_flow_key_map;   /*     0    32 */
>         struct ovs_list            cache_flow_start_timestamp_list; /*   32    16 */
>         uint32_t                   cache_max_flows;      /*    48     4 */
>         uint32_t                   cache_active_timeout; /*    52     4 */
>         struct hmap                domains;              /*    56    32 */
>         /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
>         time_t                     last_stats_sent_time; /*    88     8 */
>         char *                     virtual_obs_id;       /*    96     8 */
>         struct collectors *        collectors;           /*   104     8 */
>         uint32_t                   exporter_id;          /*   112     4 */
>         uint32_t                   seq_number;           /*   116     4 */
>         uint8_t                    virtual_obs_len;      /*   120     1 */
> 
>         /* XXX 7 bytes hole, try to pack */
> 
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         ofproto_ipfix_stats        ofproto_stats;        /*   128    88 */
>         /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
>         struct dpif_ipfix_global_stats ipfix_global_stats; /*   216   152 */
> 
>         /* size: 368, cachelines: 6, members: 13 */
>         /* sum members: 361, holes: 1, sum holes: 7 */
>         /* last cacheline: 48 bytes */
> };
> 
> If we're lucky, we can avoid touching cacheline 2 during this operation. Not
> sure that'll give us any significant benefit.
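[Editor's note: one way to pin down a layout like the pahole dump above is a compile-time guard. This sketch uses simplified stand-in types whose sizes match the real `struct hmap` (32 bytes) and `struct ovs_list` (16 bytes) on LP64; it is an illustration, not OVS code.]

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define CACHELINE_SIZE 64

/* Stand-ins sized like the real OVS types on LP64. */
struct hmap_ish { void *buckets, *one; size_t mask, n; };   /* 32 bytes */
struct list_ish { void *prev, *next; };                     /* 16 bytes */

/* The proposed ordering from the pahole dump; offsets in comments
 * assume x86_64. */
struct exporter_layout {
    struct hmap_ish cache_flow_key_map;              /*  0  32 */
    struct list_ish cache_flow_start_timestamp_list; /* 32  16 */
    uint32_t cache_max_flows;                        /* 48   4 */
    uint32_t cache_active_timeout;                   /* 52   4 */
    struct hmap_ish domains;                         /* 56  32 */
    time_t last_stats_sent_time;                     /* 88   8 */
};

/* Compile-time guard: the members read on every cache update start
 * within the first cacheline, so a later edit that reorders fields
 * fails the build instead of silently regressing. */
_Static_assert(offsetof(struct exporter_layout, cache_active_timeout)
               + sizeof(uint32_t) <= CACHELINE_SIZE,
               "hot members spill past the first cacheline");
```

A guard like this would make the data-driven decision stick once the performance testing settles on a layout.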
> 
> My plan is to do some performance testing of the IPFIX collection. I will
> take this into account.

Yes, good plan. I agree it is a good idea to make such a change data driven.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
