Re: [ovs-dev] [ovs-dev v7 2/3] ipf: micro-optimize ipf_ctx_eq
Hi,

Thanks for the review.

On Wed, Jan 12, 2022, at 05:37, Aaron Conole wrote:
> Peng He writes:
>
> > Using ipf_list's key instead of the first fragment's metadata reduces
> > cache accesses considerably, because by the time ipf_ctx_eq is called,
> > ipf_list is already cache-hot.
> >
> > Signed-off-by: Peng He
> > ---
>
> Is there a reason not to just fold this into 1/3?  It's strange to
> introduce something and immediately re-write it in the same series.
> I don't see a reason not to fold this into 1/3.  The performance
> optimization isn't difficult to understand, and doesn't seem awkward.

Will merge it into the first patch.

> >  lib/ipf.c | 18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/ipf.c b/lib/ipf.c
> > index 1fd3d8d30..26de7bbcf 100644
> > --- a/lib/ipf.c
> > +++ b/lib/ipf.c
> > @@ -1047,14 +1047,17 @@ ipf_send_frags_in_list(struct ipf *ipf, struct ipf_list *ipf_list,
> >  }
> >
> >  static bool
> > -ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx)
> > +ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx,
> > +           struct dp_packet *pkt)
> >  {
> > -    struct dp_packet *pkt =
> > -        ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt;
> > -
> > -    if (pkt->md.recirc_id != ctx->recirc_id ||
> > +    /* NOTE: not using the first pkt's metadata, use ipf_list instead,
> > +     * using pkt's metadata causes too much cache miss,
> > +     * using ipf_list instead drops ipf_postprocessing cpu usage
> > +     * from 4 percent to 2 percent. ^_^.
> > +     */
>
> Maybe this note is better in the commit log.  We can't actually be sure
> about each system's underlying memory architecture (for example, does
> this hold true even on ARM?).

Ok, will put it into the commit log.

> > +    if (ipf_list->key.recirc_id != ctx->recirc_id ||
> >          pkt->md.in_port.odp_port != ctx->in_port.odp_port ||
> > -        pkt->md.ct_zone != ctx->zone) {
> > +        ipf_list->key.zone != ctx->zone) {
> >          return false;
> >      }
> >      return true;
> > @@ -1077,7 +1080,8 @@ ipf_send_completed_frags(struct ipf *ipf, struct dp_packet_batch *pb,
> >
> >      LIST_FOR_EACH_SAFE (ipf_list, next, list_node,
> >                          &ipf->frag_complete_list) {
> >
> > -        if (ctx && !ipf_ctx_eq(ipf_list, ctx)) {
> > +        if (ctx && !ipf_ctx_eq(ipf_list, ctx, \
> > +                ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt)) {
> >              continue;
> >          }

--
hepeng

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [ovs-dev v7 2/3] ipf: micro-optimize ipf_ctx_eq
Peng He writes:

> Using ipf_list's key instead of the first fragment's metadata reduces
> cache accesses considerably, because by the time ipf_ctx_eq is called,
> ipf_list is already cache-hot.
>
> Signed-off-by: Peng He
> ---

Is there a reason not to just fold this into 1/3?  It's strange to
introduce something and immediately re-write it in the same series.
I don't see a reason not to fold this into 1/3.  The performance
optimization isn't difficult to understand, and doesn't seem awkward.

>  lib/ipf.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/lib/ipf.c b/lib/ipf.c
> index 1fd3d8d30..26de7bbcf 100644
> --- a/lib/ipf.c
> +++ b/lib/ipf.c
> @@ -1047,14 +1047,17 @@ ipf_send_frags_in_list(struct ipf *ipf, struct ipf_list *ipf_list,
>  }
>
>  static bool
> -ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx)
> +ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx,
> +           struct dp_packet *pkt)
>  {
> -    struct dp_packet *pkt =
> -        ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt;
> -
> -    if (pkt->md.recirc_id != ctx->recirc_id ||
> +    /* NOTE: not using the first pkt's metadata, use ipf_list instead,
> +     * using pkt's metadata causes too much cache miss,
> +     * using ipf_list instead drops ipf_postprocessing cpu usage
> +     * from 4 percent to 2 percent. ^_^.
> +     */

Maybe this note is better in the commit log.  We can't actually be sure
about each system's underlying memory architecture (for example, does
this hold true even on ARM?).

> +    if (ipf_list->key.recirc_id != ctx->recirc_id ||
>          pkt->md.in_port.odp_port != ctx->in_port.odp_port ||
> -        pkt->md.ct_zone != ctx->zone) {
> +        ipf_list->key.zone != ctx->zone) {
>          return false;
>      }
>      return true;
> @@ -1077,7 +1080,8 @@ ipf_send_completed_frags(struct ipf *ipf, struct dp_packet_batch *pb,
>
>      LIST_FOR_EACH_SAFE (ipf_list, next, list_node,
>                          &ipf->frag_complete_list) {
>
> -        if (ctx && !ipf_ctx_eq(ipf_list, ctx)) {
> +        if (ctx && !ipf_ctx_eq(ipf_list, ctx, \
> +                ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt)) {
>              continue;
>          }
Re: [ovs-dev] [ovs-dev v7 2/3] ipf: micro-optimize ipf_ctx_eq
Bleep bloop.  Greetings Peng He, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.

checkpatch:
ERROR: Author Peng He needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Peng He
Lines checked: 56, Warnings: 1, Errors: 1

Please check this out.  If you feel there has been an error, please email
acon...@redhat.com

Thanks,
0-day Robot
[ovs-dev] [ovs-dev v7 2/3] ipf: micro-optimize ipf_ctx_eq
Using ipf_list's key instead of the first fragment's metadata reduces
cache accesses considerably, because by the time ipf_ctx_eq is called,
ipf_list is already cache-hot.

Signed-off-by: Peng He
---
 lib/ipf.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/ipf.c b/lib/ipf.c
index 1fd3d8d30..26de7bbcf 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -1047,14 +1047,17 @@ ipf_send_frags_in_list(struct ipf *ipf, struct ipf_list *ipf_list,
 }

 static bool
-ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx)
+ipf_ctx_eq(struct ipf_list *ipf_list, struct ipf_ctx *ctx,
+           struct dp_packet *pkt)
 {
-    struct dp_packet *pkt =
-        ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt;
-
-    if (pkt->md.recirc_id != ctx->recirc_id ||
+    /* NOTE: not using the first pkt's metadata, use ipf_list instead,
+     * using pkt's metadata causes too much cache miss,
+     * using ipf_list instead drops ipf_postprocessing cpu usage
+     * from 4 percent to 2 percent. ^_^.
+     */
+    if (ipf_list->key.recirc_id != ctx->recirc_id ||
         pkt->md.in_port.odp_port != ctx->in_port.odp_port ||
-        pkt->md.ct_zone != ctx->zone) {
+        ipf_list->key.zone != ctx->zone) {
         return false;
     }
     return true;
@@ -1077,7 +1080,8 @@ ipf_send_completed_frags(struct ipf *ipf, struct dp_packet_batch *pb,

     LIST_FOR_EACH_SAFE (ipf_list, next, list_node,
                         &ipf->frag_complete_list) {

-        if (ctx && !ipf_ctx_eq(ipf_list, ctx)) {
+        if (ctx && !ipf_ctx_eq(ipf_list, ctx, \
+                ipf_list->frag_list[ipf_list->last_sent_idx + 1].pkt)) {
             continue;
         }
--
2.25.1
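[Editor's note: the optimization above can be sketched outside of OVS with simplified stand-ins for the real structures. The struct names and fields below are hypothetical reductions of what lib/ipf.c uses, not the actual OVS definitions; the point is only the access pattern: after the patch, `recirc_id` and `zone` are read from the key embedded in the list node (cache-hot after the list walk), and the packet's metadata is dereferenced only for `in_port`, which the key does not carry.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the OVS structures. */
struct pkt_metadata_s {
    uint32_t recirc_id;
    uint32_t in_port;
    uint16_t ct_zone;
};

struct dp_packet_s {
    struct pkt_metadata_s md;   /* packet metadata, likely cache-cold */
};

struct ipf_ctx_s {
    uint32_t recirc_id;
    uint32_t in_port;
    uint16_t zone;
};

struct ipf_list_s {
    /* The key is embedded in the list node itself, so it is already in
     * cache once the node has been visited during the list walk. */
    struct {
        uint32_t recirc_id;
        uint16_t zone;
    } key;
};

/* Before the patch: every field is read through the packet's metadata,
 * which requires pulling a (likely cold) cache line per packet. */
static bool
ipf_ctx_eq_old_sketch(const struct ipf_ctx_s *ctx,
                      const struct dp_packet_s *pkt)
{
    return pkt->md.recirc_id == ctx->recirc_id
        && pkt->md.in_port == ctx->in_port
        && pkt->md.ct_zone == ctx->zone;
}

/* After the patch: recirc_id and zone come from the hot list key; the
 * packet is touched only for in_port, which the key lacks. */
static bool
ipf_ctx_eq_new_sketch(const struct ipf_list_s *list,
                      const struct ipf_ctx_s *ctx,
                      const struct dp_packet_s *pkt)
{
    return list->key.recirc_id == ctx->recirc_id
        && pkt->md.in_port == ctx->in_port
        && list->key.zone == ctx->zone;
}
```

The two variants agree whenever the list key mirrors the first fragment's metadata, which is the invariant the fragmentation code maintains (all fragments in a list share the same recirc_id and conntrack zone), so the rewrite changes only which cache lines are read, not the result.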