On Tue, Jul 24, 2018 at 12:06 AM, Lam, Tiago <[email protected]> wrote:

> On 23/07/2018 23:55, Darrell Ball wrote:
> >
> >
> > On Fri, Jul 20, 2018 at 10:09 AM, Lam, Tiago <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     On 18/07/2018 23:53, Darrell Ball wrote:
> >     > sorry, several distractions delayed response.
> >     >
> >     > On Mon, Jul 16, 2018 at 1:37 AM, Lam, Tiago <[email protected]
> <mailto:[email protected]>
> >     > <mailto:[email protected] <mailto:[email protected]>>> wrote:
> >     >
> >     >     On 13/07/2018 18:54, Darrell Ball wrote:
> >     >     > Thanks for the patch.
> >     >     >
> >     >     > A few queries inline.
> >     >     >
> >     >
> >     >     Hi Darrell,
> >     >
> >     >     Thanks for your inputs. I've replied in-line as well.
> >     >
> >     >     > On Wed, Jul 11, 2018 at 11:23 AM, Tiago Lam <
> [email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>
> >     >     > <mailto:[email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>>> wrote:
> >     >     >
> >     >     >     When enabled with DPDK OvS relies on mbufs allocated by
> mempools to
> >     >     >     receive and output data on DPDK ports. Until now, each
> OvS dp_packet has
> >     >     >     had only one mbuf associated, which is allocated with
> the maximum
> >     >     >     possible size, taking the MTU into account. This
> approach, however,
> >     >     >     doesn't allow us to increase the allocated size in an
> mbuf, if needed,
> >     >     >     since an mbuf is allocated and initialised upon mempool
> creation. Thus,
> >     >     >     in the current implementatin this is dealt with by
> calling
> >     >     >     OVS_NOT_REACHED() and terminating OvS.
> >     >     >
> >     >     >     To avoid this, and allow the (already) allocated space
> to be better
> >     >     >     used, dp_packet_resize__() now tries to use the
> available room, both the
> >     >     >     tailroom and the headroom, to make enough space for the
> new data. Since
> >     >     >     this happens for packets of source DPBUF_DPDK, the
> single-segment mbuf
> >     >     >     case mentioned above is also covered by this new aproach
> in resize__().
> >     >     >
> >     >     >     Signed-off-by: Tiago Lam <[email protected] <mailto:
> [email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>
> >     >     >     <mailto:[email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>>>
> >     >     >     Acked-by: Eelco Chaudron <[email protected] <mailto:
> [email protected]>
> >     >     <mailto:[email protected] <mailto:[email protected]>>
> >     >     >     <mailto:[email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>>>
> >     >     >     ---
> >     >     >      lib/dp-packet.c | 48
> >     >     ++++++++++++++++++++++++++++++++++++++++++++++--
> >     >     >      1 file changed, 46 insertions(+), 2 deletions(-)
> >     >     >
> >     >     >     diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> >     >     >     index d6e19eb..87af459 100644
> >     >     >     --- a/lib/dp-packet.c
> >     >     >     +++ b/lib/dp-packet.c
> >     >     >     @@ -237,9 +237,51 @@ dp_packet_resize__(struct dp_packet
> *b,
> >     >     size_t
> >     >     >     new_headroom, size_t new_tailroom
> >     >     >          new_allocated = new_headroom + dp_packet_size(b) +
> >     >     new_tailroom;
> >     >     >
> >     >     >          switch (b->source) {
> >     >     >     +    /* When resizing mbufs, both a single mbuf and
> >     multi-segment
> >     >     >     mbufs (where
> >     >     >     +     * data is not contigously held in memory), both
> >     the headroom
> >     >     >     and the
> >     >     >     +     * tailroom available will be used to make more
> >     space for
> >     >     where
> >     >     >     data needs
> >     >     >     +     * to be inserted. I.e if there's not enough
> headroom,
> >     >     data may
> >     >     >     be shifted
> >     >     >     +     * right if there's enough tailroom.
> >     >     >     +     * However, this is not bulletproof and in some
> >     cases the
> >     >     space
> >     >     >     available
> >     >     >     +     * won't be enough - in those cases, an error
> should be
> >     >     >     returned and the
> >     >     >     +     * packet dropped. */
> >     >     >          case DPBUF_DPDK:
> >     >     >     -        OVS_NOT_REACHED();
> >     >     >
> >     >     >
> >     >     > Previously, it was a coding error to call this function for
> >     a DPDK
> >     >     mbuf
> >     >     > case, which is pretty
> >     >     > clear. But with this patch, presumably that is not longer
> >     the case and
> >     >     > the calling the API is
> >     >     > now ok for DPDK mbufs.
> >     >     >
> >     >
> >     >     As it stands, it will still be an error to call
> >     dp_packet_resize__() for
> >     >     any DPDK packet, or by extension any of the other functions
> >     that call
> >     >     it, such as dp_packet_prealloc_tailroom() and
> >     >     dp_packet_prealloc_headroom().
> >     >
> >     >
> >     >
> >     > yep, the existing code fails in a very clear way; i.e. whenever it
> is
> >     > called for a dpdk packet.
> >     > So the user would need to handle in some other way, which is not
> being
> >     > done today, I know.
> >     >
> >     >
> >     >
> >     >     This patch only tries to alleviate that
> >     >     by accommodating space from the headroom or tailroom, if
> >     possible, and
> >     >     create just enough space for the new data.
> >     >
> >     >
> >     > The new code will fail is some yet undefined way, occasionally
> working
> >     > and failing
> >     > in the "other" cases.
> >     >
> >     >
> >     >
> >     >     My preferred approach would
> >     >     be to return an error if not possible, but since the API
> >     doesn't deal
> >     >     with errors as is, the previous behavior of manually asserting
> >     was left
> >     >     as is. As reported in [1] (I comment more on that below), the
> >     behavior
> >     >
> >     >     of manually asserting can lead to undesired behavior in some
> >     use cases.
> >     >
> >     >
> >     >
> >     > I am familiar with the issue.
> >     > As part of the change,  dp_packet_put_uninit()
> >     > and dp_packet_push_uninit() could be modified to return NULL
> >     > and that could be percolated and checked for.
> >     >
> >     > Those APIs could simply check (by calling a helper API) if they
> would
> >     > fail a priori to trigger returning
> >     > NULL for dpdk buf cases.
> >     >
> >     >
> >
> >     Ok, I'm glad we're on the same page here. That's what [2] is doing.
> I'm
> >     planning to bring that forward. If you could have a look as well,
> that
> >     would be great.
> >
> >     >
> >     >
> >     >     >
> >     >     >
> >     >     >     +    {
> >     >     >     +        size_t miss_len;
> >     >     >     +
> >     >     >     +        if (new_headroom == dp_packet_headroom(b)) {
> >     >     >     +            /* This is a tailroom adjustment. Since
> >     there's no
> >     >     >     tailroom space
> >     >     >     +             * left, try and shift data towards the
> >     head to free up
> >     >     >     tail space,
> >     >     >     +             * if there's enough headroom */
> >     >     >     +
> >     >     >     +            miss_len = new_tailroom -
> >     dp_packet_tailroom(b);
> >     >     >     +
> >     >     >     +            if (miss_len <= new_headroom) {
> >     >     >     +                dp_packet_shift(b, -miss_len);
> >     >     >     +            } else {
> >     >     >     +                /* XXX: Handle error case and report
> >     error to caller */
> >     >     >     +                OVS_NOT_REACHED();
> >     >     >
> >     >     >
> >     >     >
> >     >     > This will not just drop the packet, it will fail the daemon,
> >     because a
> >     >     > packet cannot be resized.
> >     >     > If the system is completely depleted of memory, that may be
> >     ok, but in
> >     >     > the case, no such
> >     >     > assumption is implied.
> >     >     >
> >     >     > Also, why is XXX still left in the patch?
> >     >     >
> >     >
> >     >     Because there's still work to do in that regard. The whole
> process
> >     >     shouldn't be brought down if there's not enough space to put
> >     some data
> >     >     in one single packet. However, this was intentionally left out
> >     of this
> >     >     series or otherwise it would increase its complexity
> >     considerably.
> >     >
> >     >
> >     > It seems unnecessary to add a bunch of code to a series that tries
> >     to handle
> >     > 'resize', but handles it partially in practical cases. It also
> seems
> >     > undefined when
> >     > it works and when it does not from a API caller POV.
> >
> >     Granted that the dp_packet_resize__() supports more operations than
> what
> >     this implementation provides, but dp_packet API callers won't notice
> a
> >     difference since the only callers are
> >     dp_packet_prealloc_tailroom() and dp_packet_prealloc_headroom(). And
> the
> >     implementation covers for those functions, which only either modify
> the
> >     tailroom or the headroom (and not both).
> >
> >     > I think patch 7 is also there to only support this patch 8.
> >     >
> >
> >     That's a wrong assumption. Patch 7/14 was added to support the
> >     dp_packet_shift() operation.
> >
> >     In this patch 8/14, because of the noncontiguous nature of chained
> >     mbufs, there's no possibility of using general functions like
> `memmove`,
> >     and re-using the shift operations was just easier.
> >
> >
> > I guess we may be talking about 2 different things.
> > My point is simply that greping for dp_packet_mbuf_shift() does not show
> > any users with resize__()
> > removed; I think it would be best to remove unused code from the patchset
> > dp_packet_mbuf_shift() is defined in patch 7.
> >
> >
>
> Again, it is used in `dp_packet_shift()` for DPDK packets.
>


sorry, I missed these responses.

Presently, the usage of the dp_packet_shift() api is limited to pcap usage.
I did not check carefully, but possibly there is no valid dpdk use at the
moment.
But, on the other hand, I suppose dp_packet_shift() is a general api, of
sorts.




>
> >
> >     > Ideally, It would seem like a modified patch 7 and patch 8 would
> >     belong
> >     > with the rest of the
> >     > fix for the dpdk packet memory preallocation constraint issue.
> >     >
> >     > Also, ideally, 'XXX' is removed from patches.
> >     >
> >     >
> >     >
> >     >
> >     >     As others have pointed out in [1], this is not a simple
> >     change, which
> >     >     would have to be propagated to higher levels in other parts of
> >     the code
> >     >     base. I've proposed an alternative (vs refactoring the whole
> >     dp_packet
> >     >     API to handle and return errors) in [2], but that seems to
> >     have gone
> >     >     stale. Going forward I see that approach merging with this new
> >     piece in
> >     >     dp_packet_resize__(), where an error can be returned to the
> >     caller if
> >     >     there's not enough space.
> >     >
> >     >
> >     > The full change is outside the scope of this series.
> >     >
> >     >
> >     >
> >     >
> >     >     > Also, the preexisting API handles two cases:
> >     >     > 1/ Tailroom only adjustment
> >     >     > 2/ headroom and/or tailroom adjustment
> >     >     >
> >     >     > meaning it handles all cases.
> >     >     >
> >     >     > The new DPDK addition (part of the same API) defines 2 cases
> >     >     >
> >     >     > 1/ tailroom only adjustment
> >     >     > 2/ headroom only adjustment
> >     >     >
> >     >     > So, it looks like a different API, that also does not handle
> >     all cases.
> >     >     >
> >     >     >
> >     >
> >     >     You have a point there, support for point 2/ "headroom and
> >     tailroom
> >     >     adjustment" is missed. It doesn't seem to be used anywhere at
> the
> >     >     moment, the only callers being dp_packet_prealloc_tailroom()
> and
> >     >     dp_packet_prealloc_headroom(), but I'll submit an incremental
> >     patch to
> >     >     deal with this. Thanks for pointing it out.
> >     >
> >     >     Tiago.
> >     >
> >     >     [1]
> >     >
> >      https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html
> >     <https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html
> >
> >     >
> >      <https://mail.openvswitch.org/pipermail/ovs-dev/2018-
> May/346649.html <https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/
> 346649.html>>
> >     >     [2]
> >     >
> >      https://mail.openvswitch.org/pipermail/ovs-dev/2018-July/
> 348908.html <https://mail.openvswitch.org/pipermail/ovs-dev/2018-July/
> 348908.html>
> >     >
> >      <https://mail.openvswitch.org/pipermail/ovs-dev/2018-
> July/348908.html
> >     <https://mail.openvswitch.org/pipermail/ovs-dev/2018-July/
> 348908.html>>
> >     >
> >     >     >
> >     >     >     +            }
> >     >     >     +        } else {
> >     >     >     +            /* Otherwise, this is a headroom
> >     adjustment. Try to
> >     >     >     shift data
> >     >     >     +             * towards the tail to free up head space,
> >     if there's
> >     >     >     enough
> >     >     >     +             * tailroom */
> >     >     >     +
> >     >     >     +            miss_len = new_headroom -
> >     dp_packet_headroom(b);
> >     >     >
> >     >     >     +
> >     >     >     +            if (miss_len <= new_tailroom) {
> >     >     >     +                dp_packet_shift(b, miss_len);
> >     >     >     +            } else {
> >     >     >     +                /* XXX: Handle error case and report
> >     error to
> >     >     caller */
> >     >     >     +                OVS_NOT_REACHED();
> >     >     >
> >     >     >
> >     >     >
> >     >     > same comments as above.
> >     >     >
> >     >     >
> >     >     >
> >     >     >     +            }
> >     >     >     +        }
> >     >     >     +
> >     >     >     +        new_base = dp_packet_base(b);
> >     >     >     +
> >     >     >     +        break;
> >     >     >     +    }
> >     >     >          case DPBUF_MALLOC:
> >     >     >              if (new_headroom == dp_packet_headroom(b)) {
> >     >     >                  new_base = xrealloc(dp_packet_base(b),
> >     >     new_allocated);
> >     >     >     @@ -263,7 +305,9 @@ dp_packet_resize__(struct dp_packet
> >     *b, size_t
> >     >     >     new_headroom, size_t new_tailroom
> >     >     >              OVS_NOT_REACHED();
> >     >     >          }
> >     >     >
> >     >     >     -    dp_packet_set_allocated(b, new_allocated);
> >     >     >     +    if (b->source != DPBUF_DPDK) {
> >     >     >     +        dp_packet_set_allocated(b, new_allocated);
> >     >     >     +    }
> >     >     >          dp_packet_set_base(b, new_base);
> >     >     >
> >     >     >          new_data = (char *) new_base + new_headroom;
> >     >     >     --
> >     >     >     2.7.4
> >     >     >
> >     >     >     _______________________________________________
> >     >     >     dev mailing list
> >     >     >     [email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>
> >     >     <mailto:[email protected] <mailto:[email protected]>
> >     <mailto:[email protected] <mailto:[email protected]>>>
> >     >     >     https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> >     >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>>
> >     >     >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> >     >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >     <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>>>
> >     >     >
> >     >     >
> >     >
> >     >
> >
> >
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to