On Fri, Sep 28, 2018 at 05:15:06PM +0100, Tiago Lam wrote: > When a dp_packet is from a DPDK source, and it contains multi-segment > mbufs, the data_len is not equal to the packet size, pkt_len. Instead, > the data_len of each mbuf in the chain should be considered while > distributing the new (provided) size. > > To account for the above dp_packet_set_size() has been changed so that, > in the multi-segment mbufs case, only the data_len on the last mbuf of > the chain and the total size of the packet, pkt_len, are changed. The > data_len on the intermediate mbufs preceeding the last mbuf is not > changed by dp_packet_set_size(). Furthermore, in some cases > dp_packet_set_size() may be used to set a smaller size than the current > packet size, thus effectively trimming the end of the packet. In the > multi-segment mbufs case this may lead to lingering mbufs that may need > freeing. > > __dp_packet_set_data() now also updates an mbufs' data_len after setting > the data offset. This is so that both fields are always in sync for each > mbuf in a chain.
I think we need an assert(source) to make sure dp_packet_shift() will never be called for a multi-seg. > Co-authored-by: Michael Qiu <[email protected]> > Co-authored-by: Mark Kavanagh <[email protected]> > Co-authored-by: Przemyslaw Lal <[email protected]> > Co-authored-by: Marcin Ksiadz <[email protected]> > Co-authored-by: Yuanhan Liu <[email protected]> > > Signed-off-by: Michael Qiu <[email protected]> > Signed-off-by: Mark Kavanagh <[email protected]> > Signed-off-by: Przemyslaw Lal <[email protected]> > Signed-off-by: Marcin Ksiadz <[email protected]> > Signed-off-by: Yuanhan Liu <[email protected]> > Signed-off-by: Tiago Lam <[email protected]> > Acked-by: Eelco Chaudron <[email protected]> > --- > lib/dp-packet.h | 84 > ++++++++++++++++++++++++++++++++++++++++++++++++--------- > 1 file changed, 72 insertions(+), 12 deletions(-) > > diff --git a/lib/dp-packet.h b/lib/dp-packet.h > index 6376039..4545041 100644 > --- a/lib/dp-packet.h > +++ b/lib/dp-packet.h > @@ -426,20 +426,60 @@ dp_packet_size(const struct dp_packet *b) > return b->mbuf.pkt_len; > } > > +/* Sets the size of the packet 'b' to 'v'. For non-DPDK packets this only > means > + * setting b->size_, but if used in a DPDK packet it means adjusting the > first > + * mbuf pkt_len and last mbuf data_len, to reflect the real size, which can > + * lead to free'ing tail mbufs that are no longer used. > + * > + * This function should be used for setting the size only, and if there's an > + * assumption that the tail end of 'b' will be trimmed. For adjustng the head > + * 'end' of 'b', dp_packet_pull() should be used instead. */ > static inline void > dp_packet_set_size(struct dp_packet *b, uint32_t v) > { > - /* netdev-dpdk does not currently support segmentation; consequently, for > - * all intents and purposes, 'data_len' (16 bit) and 'pkt_len' (32 bit) > may > - * be used interchangably. > - * > - * On the datapath, it is expected that the size of packets > - * (and thus 'v') will always be <= UINT16_MAX; this means that there is > no > - * loss of accuracy in assigning 'v' to 'data_len'. > - */ > - b->mbuf.data_len = (uint16_t)v; /* Current seg length. */ > - b->mbuf.pkt_len = v; /* Total length of all segments linked > to > - * this segment. */ > + if (b->source == DPBUF_DPDK) { > + struct rte_mbuf *mbuf = &b->mbuf; > + uint16_t new_len = v; > + uint16_t data_len; > + uint16_t nb_segs = 0; > + uint16_t pkt_len = 0; > + > + /* Trim 'v' length bytes from the end of the chained buffers, freeing > + any buffers that may be left floating */ > + while (mbuf) { > + data_len = MIN(new_len, mbuf->data_len); > + mbuf->data_len = data_len; > + > + if (new_len - data_len <= 0) { > + /* Free the rest of chained mbufs */ > + free_dpdk_buf(CONTAINER_OF(mbuf->next, struct dp_packet, > + mbuf)); > + mbuf->next = NULL; > + } else if (!mbuf->next) { > + /* Don't assign more than what we have available */ > + mbuf->data_len = MIN(new_len, > + mbuf->buf_len - mbuf->data_off); > + } > + > + new_len -= data_len; > + nb_segs += 1; > + pkt_len += mbuf->data_len; > + mbuf = mbuf->next; > + } > + > + /* pkt_len != v would effectively mean that pkt_len < than 'v' (as > + * being bigger is logically impossible). Being < than 'v' would mean > + * the 'v' provided was bigger than the available room, which is the > + * responsibility of the caller to make sure there is enough room */ > + ovs_assert(pkt_len == v); > + > + b->mbuf.nb_segs = nb_segs; > + b->mbuf.pkt_len = pkt_len; > + } else { > + b->mbuf.data_len = v; > + /* Total length of all segments linked to this segment. */ > + b->mbuf.pkt_len = v; > + } > } > > static inline uint16_t > @@ -451,7 +491,27 @@ __packet_data(const struct dp_packet *b) > static inline void > __packet_set_data(struct dp_packet *b, uint16_t v) > { > - b->mbuf.data_off = v; > + if (b->source == DPBUF_DPDK) { > + /* Moving data_off away from the first mbuf in the chain is not a > + * possibility using DPBUF_DPDK dp_packets */ > + ovs_assert(v == UINT16_MAX || v <= b->mbuf.buf_len); > + > + uint16_t prev_ofs = b->mbuf.data_off; > + b->mbuf.data_off = v; > + int16_t ofs_diff = prev_ofs - b->mbuf.data_off; > + > + /* When dealing with DPDK mbufs, keep data_off and data_len in sync. > + * Thus, update data_len if the length changes with the move of > + * data_off. However, if data_len is 0, there's no data to move and > + * data_len should remain 0. */ > + > + if (b->mbuf.data_len != 0) { > + b->mbuf.data_len = MIN(b->mbuf.data_len + ofs_diff, > + b->mbuf.buf_len - b->mbuf.data_off); if 'v' is <= mbuf.buf_len, then why we need the above? Why not b->mbuf.data_len += ofs_diff ? fbl > + } > + } else { > + b->mbuf.data_off = v; > + } > } > > static inline uint16_t > -- > 2.7.4 > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev -- Flavio _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
