On 29/05/15 15:21, Ola Liljedahl wrote:
On 29 May 2015 at 13:55, Zoltan Kiss <[email protected]> wrote:
On 28/05/15 17:40, Ola Liljedahl wrote:
On 28 May 2015 at 17:23, Zoltan Kiss <[email protected]> wrote:
On 28/05/15 16:00, Ola Liljedahl wrote:
I disapprove of this solution. TX completion processing (cleaning TX descriptor rings after transmission has completed) is an implementation (hardware) aspect and should be hidden from the application.
Unfortunately you can't, if you want your pktio application to work with poll mode drivers. In that case the TX completion interrupt is (or can be) disabled and the application has to take care of completion as well. In the case of DPDK you just call the send function (with 0 packets, if you don't have anything to send at the time).
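Roughly like this, in ODP terms (just a sketch; it assumes the implementation accepts a NULL packet table when len is 0, and pktio_dst is the handle from the l2fwd example further down):

    /* Nothing to transmit right now, but poke the TX path anyway so the
     * poll mode driver gets a chance to retire already-transmitted buffers. */
    odp_pktio_send(pktio_dst, NULL, 0);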
Why do you have to retire transmitted packets if you are not transmitting new packets (and don't need those descriptors in the TX ring)?
Because otherwise they are a memory leak.
They are not leaked! They are still in the TX ring, just waiting to get retired.
Indeed, leak is not the right word because they are still referenced. But they still can't be released currently, only as a side effect of something else which might not happen.
Those buffers might be needed somewhere else. If they are only released the next time you send or receive packets, you are in trouble, because that might never happen. Especially when that event is blocked because your TX ring is full of unreleased packets.
Having too few buffers is always a problem. You don't want to have too large RX/TX rings because that just increases buffering and latency (the "buffer bloat" problem).
That's the generic principle, yes, but we need a bulletproof way here so that applications can't get blocked just because they forgot to call the magic function which would otherwise release the TX buffers.
Does the application have too few packets in the pool, so that reception will suffer?
Let me approach the problem from a different angle: the current workaround is that you have to allocate a pool with _loooads_ of buffers, so you have a good chance of never running out of free buffers. Probably. Because it still doesn't guarantee that there will ever be another send/receive event on that interface to release the packets.
There isn't any corresponding call that refills the RX descriptor rings with fresh buffers.
You can do that in the receive function; I think that's how drivers generally do it.
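For DPDK at least, the receive path can look roughly like this (a sketch of hypothetical ODP-DPDK internals, not the real code; pkt_dpdk_t and mbuf_to_odp_packet() are made-up names for illustration):

    #include <rte_ethdev.h>

    /* Hypothetical inner loop of odp_pktio_recv() on top of DPDK. */
    static int recv_pkts(pkt_dpdk_t *pkt_dpdk, odp_packet_t pkt_table[], int len)
    {
            struct rte_mbuf *mbufs[len];
            uint16_t i, nb_rx;

            /* rte_eth_rx_burst() returns received mbufs and, inside the PMD,
             * replenishes the freed RX descriptors with fresh buffers from the
             * mempool, so no separate RX refill call is needed. */
            nb_rx = rte_eth_rx_burst(pkt_dpdk->port_id, pkt_dpdk->queue_id,
                                     mbufs, (uint16_t)len);
            for (i = 0; i < nb_rx; i++)
                    pkt_table[i] = mbuf_to_odp_packet(mbufs[i]); /* made-up helper */
            return nb_rx;
    }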
The completion processing can be performed from any ODP call, not necessarily odp_pktio_send().
I think "any" is not specific enough. Which one?
odp_pktio_recv, odp_schedule. Wherever the application blocks or busy-waits for more packets.
We already do that in odp_pktio_recv. It doesn't help, because you can only release the buffers held in the current interface's TX ring. You can't do anything about other interfaces.
Why not?
There is no guarantee that the application thread calling odp_pktio_recv() on one interface is the only one transmitting on that specific egress interface. In the general case, all threads may be using all pktio interfaces for both reception and transmission.
I mean, you could trigger TX completion on every interface every time you receive on one, but that would be a scalability nightmare.
Maybe not every time. I expect a more creative solution than this. Perhaps when you run out of buffers in the pool?
That might be better, I will think about it.
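Just to make that idea concrete, it would be something like this (a sketch only; alloc_with_reclaim() is a made-up helper shown at the application level for illustration, while the real fix would live inside the implementation's allocation path, and it again assumes a zero-length odp_pktio_send() acts as the completion hook):

    /* Made-up helper: if the pool is exhausted, poke TX completion on every
     * interface the application has opened and retry the allocation once. */
    static odp_packet_t alloc_with_reclaim(odp_pool_t pool, uint32_t len,
                                           odp_pktio_t pktios[], int num_pktios)
    {
            odp_packet_t pkt = odp_packet_alloc(pool, len);
            int i;

            if (pkt != ODP_PACKET_INVALID)
                    return pkt;
            for (i = 0; i < num_pktios; i++)
                    odp_pktio_send(pktios[i], NULL, 0); /* reclaim completed TX buffers */
            return odp_packet_alloc(pool, len);
    }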
Can you provide a vague draft of how you would fix the l2fwd example below?
I don't think anything needs fixing on the application level.
Wrong. odp_l2fwd uses one packet pool, receives from pktio_src and then, if anything was received, sends it out on pktio_dst.
This specific application has this specific behavior. Are you sure this is a general solution? I am not.
Let's say the pool has 576 elements, and the interfaces use 256 RX and 256 TX descriptors. You start with 2*256 buffers kept in the two RX rings. Let's say you receive the first 64 packets; you refill the RX ring immediately, so now you're out of buffers. You can send out those 64, but in the next iteration odp_pktio_recv() will return 0 because it can't refill the RX descriptors (and the driver won't give you back any buffer unless you can refill it). And now you are in an infinite loop: recv will always return 0, because you never release the packets.
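Spelling out the buffer accounting in that scenario:

    pool size:                          576 buffers
    held in the two RX rings at init:   2 * 256 = 512 buffers
    free in the pool:                   576 - 512 = 64 buffers
    after receiving 64 and refilling:   0 free buffers
    after sending those 64:             64 buffers parked in pktio_dst's TX ring,
                                        waiting for a completion that never runs
    next odp_pktio_recv():              can't refill the RX descriptors, returns 0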
The size of the pool should somehow be correlated with the size of the RX and TX rings for "best performance" (whatever this means). But I also think that the system should function regardless of RX/TX ring sizes and pool size, "function" meaning not deadlock.
Yes, I wrote about that in my reply to Petri: coming up with such a number could be quite hard. And not just because at the moment we don't expose the ring descriptor counts.
There are several ways to fix this:
- tell the application writer that if you see deadlocks, increase the number of elements in the buffer pool. I doubt anyone would ever use ODP for anything serious after seeing such a thing.
- you can't really give anything more specific than the previous point, because details such as RX/TX descriptor counts are abstracted away, intentionally. And the platform can't autotune them, because it doesn't know how many elements the pool used for TX has. In fact, it could be more than just one pool.
- make sure you call odp_pktio_send() even if pkts == 0. In the case of ODP-DPDK that helps, because it actually triggers TX completion. Actually, we could make odp_pktio_send_complete() == odp_pktio_send(len=0), so we don't have to introduce a new function. But that doesn't change the fact that we have to call TX completion periodically to make sure nothing gets blocked.
So why doesn't the ODP-for-DPDK implementation call TX completion "periodically" or at some other suitable times?
From where?
- or we can just do what I proposed in the patch, which is very similar to the previous point but articulates the importance of TX completion more clearly.
Which is a platform-specific problem and exactly the kind of thing that the ODP API should hide and not expose.
Disabling the TX completion interrupt in order to achieve better performance in polling applications is not very platform specific. I think we should be able to cope with that.
-- Ola
On 28 May 2015 at 16:38, Zoltan Kiss <[email protected]> wrote:
A pktio interface can be used with poll mode drivers, where TX completion often has to be done manually. This turned up as a problem with ODP-DPDK and odp_l2fwd:
    while (!exit_threads) {
            pkts = odp_pktio_recv(pktio_src, ...);
            if (pkts <= 0)
                    continue;
            ...
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
            ...
    }
In this example we never call odp_pktio_send() on pktio_dst if no new packets were received on pktio_src. DPDK needs manual TX completion. The above example should have an odp_pktio_send_complete(pktio_dst) call right at the beginning of the loop.
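I.e. roughly like this (a sketch of the fixed loop, using the odp_pktio_send_complete() call added by this patch; a zero-length odp_pktio_send() would work the same way):

    while (!exit_threads) {
            odp_pktio_send_complete(pktio_dst); /* release already-sent packets */
            pkts = odp_pktio_recv(pktio_src, ...);
            if (pkts <= 0)
                    continue;
            ...
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
            ...
    }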
Signed-off-by: Zoltan Kiss <[email protected]>
---
include/odp/api/packet_io.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/odp/api/packet_io.h b/include/odp/api/packet_io.h
index b97b2b8..3a4054c 100644
--- a/include/odp/api/packet_io.h
+++ b/include/odp/api/packet_io.h
@@ -119,6 +119,22 @@ int odp_pktio_recv(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
 int odp_pktio_send(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
 
 /**
+ * Release sent packets
+ *
+ * This function should be called after sending on a pktio. If the platform
+ * doesn't implement send completion in other ways, this function should call
+ * odp_packet_free() on packets where transmission is already completed. It can
+ * be a no-op if the platform guarantees that the packets will be released upon
+ * completion, but the application must call it periodically after send to make
+ * sure packets are released.
+ *
+ * @param pktio ODP packet IO handle
+ *
+ * @retval <0 on failure
+ */
+int odp_pktio_send_complete(odp_pktio_t pktio);
+
+/**
  * Set the default input queue to be associated with a pktio handle
  *
  * @param pktio ODP packet IO handle
--
1.9.1
_______________________________________________
lng-odp mailing list
[email protected]
https://lists.linaro.org/mailman/listinfo/lng-odp