On 29/05/15 15:21, Ola Liljedahl wrote:
On 29 May 2015 at 13:55, Zoltan Kiss <[email protected]> wrote:
On 28/05/15 17:40, Ola Liljedahl wrote:
On 28 May 2015 at 17:23, Zoltan Kiss <[email protected]> wrote:
On 28/05/15 16:00, Ola Liljedahl wrote:
I disapprove of this solution. TX completion processing (cleaning TX descriptor rings after transmission has completed) is an implementation (hardware) aspect and should be hidden from the application.
Unfortunately you can't, if you want your pktio application to work with poll mode drivers. In that case the TX completion interrupt is (or can be) disabled and the application has to take care of completion as well. In the case of DPDK you just call the send function (with 0 packets, if you don't have anything to send at the time).
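Roughly like this, in ODP terms (just a sketch; it assumes the implementation accepts a NULL packet table when len is 0, and pktio_dst is the handle from the l2fwd example further down):

    /* Nothing to transmit right now, but poke the TX path anyway so the
     * poll mode driver gets a chance to retire already-transmitted buffers. */
    odp_pktio_send(pktio_dst, NULL, 0);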
Why do you have to retire transmitted packets if you are not transmitting new packets (and don't need those descriptors in the TX ring)?
Because otherwise they are a memory leak.
They are not leaked! They are still in the TX ring, just waiting to get retired.
Indeed, leak is not the right word because they are still referenced. But they still can't be released currently, only as a side effect of something else which might not happen.
Those buffers might be needed somewhere else. If they are only released the next time you send or receive packets, you are in trouble, because that might never happen. Especially when that event is blocked because your TX ring is full of unreleased packets.
Having too few buffers is always a problem. You don't want to have too large RX/TX rings because that just increases buffering and latency (the "buffer bloat" problem).
That's the generic principle, yes, but we need a bulletproof way here so that applications can't get blocked just because they forgot to call the magic function which would otherwise release the TX buffers.
Does the application have too few packets in the pool, so that reception will suffer?
Let me approach the problem from a different angle: the current workaround is that you have to allocate a pool with _loooads_ of buffers, so you have a good chance of never running out of free buffers. Probably. Because it still doesn't guarantee that there will ever be another send/receive event on that interface to release the packets.
There isn't any corresponding call that refills the RX descriptor rings with fresh buffers.
You can do that in the receive function; I think that's how drivers generally do it.
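For DPDK at least, the receive path can look roughly like this (a sketch of hypothetical ODP-DPDK internals, not the real code; pkt_dpdk_t and mbuf_to_odp_packet() are made-up names for illustration):

    #include <rte_ethdev.h>

    /* Hypothetical inner loop of odp_pktio_recv() on top of DPDK. */
    static int recv_pkts(pkt_dpdk_t *pkt_dpdk, odp_packet_t pkt_table[], int len)
    {
            struct rte_mbuf *mbufs[len];
            uint16_t i, nb_rx;

            /* rte_eth_rx_burst() returns received mbufs and, inside the PMD,
             * replenishes the freed RX descriptors with fresh buffers from the
             * mempool, so no separate RX refill call is needed. */
            nb_rx = rte_eth_rx_burst(pkt_dpdk->port_id, pkt_dpdk->queue_id,
                                     mbufs, (uint16_t)len);
            for (i = 0; i < nb_rx; i++)
                    pkt_table[i] = mbuf_to_odp_packet(mbufs[i]); /* made-up helper */
            return nb_rx;
    }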
The completion processing can be performed from any ODP call, not necessarily odp_pktio_send().
I think "any" is not specific enough. Which one?
odp_pktio_recv, odp_schedule. Wherever the application blocks or busy-waits for more packets.
We already do that in odp_pktio_recv. It doesn't help, because you can only release the buffers held in the current interface's TX ring. You can't do anything about other interfaces.
Why not?
There is no guarantee that the application thread calling odp_pktio_recv() on one interface is the only one transmitting on that specific egress interface. In the general case, all threads may be using all pktio interfaces for both reception and transmission.
I mean, you could trigger TX completion on every interface every time you receive on one, but that would be a scalability nightmare.
Maybe not every time. I expect a more creative solution than this. Perhaps when you run out of buffers in the pool?
That might be better, I will think about it.
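Just to make that idea concrete, it would be something like this (a sketch only; alloc_with_reclaim() is a made-up helper shown at the application level for illustration, while the real fix would live inside the implementation's allocation path, and it again assumes a zero-length odp_pktio_send() acts as the completion hook):

    /* Made-up helper: if the pool is exhausted, poke TX completion on every
     * interface the application has opened and retry the allocation once. */
    static odp_packet_t alloc_with_reclaim(odp_pool_t pool, uint32_t len,
                                           odp_pktio_t pktios[], int num_pktios)
    {
            odp_packet_t pkt = odp_packet_alloc(pool, len);
            int i;

            if (pkt != ODP_PACKET_INVALID)
                    return pkt;
            for (i = 0; i < num_pktios; i++)
                    odp_pktio_send(pktios[i], NULL, 0); /* reclaim completed TX buffers */
            return odp_packet_alloc(pool, len);
    }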
Can you provide a vague draft of how you would fix the l2fwd example below?
I don't think anything needs fixing on the application level.
Wrong. odp_l2fwd uses one packet pool, receives from pktio_src and then, if anything was received, sends it out on pktio_dst.
This specific application has this specific behavior. Are you sure this is a general solution? I am not.
Let's say the pool has 576 elements, and the interfaces use 256 RX and 256 TX descriptors. You start with 2*256 buffers kept in the two RX rings. Let's say you receive the first 64 packets; you refill the RX ring immediately, so now you're out of buffers. You can send out those 64, but in the next iteration odp_pktio_recv() will return 0 because it can't refill the RX descriptors (and the driver won't give you back any buffer unless you can refill it). And now you are in an infinite loop: recv will always return 0, because you never release the packets.
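Spelling out the buffer accounting in that scenario:

    pool size:                          576 buffers
    held in the two RX rings at init:   2 * 256 = 512 buffers
    free in the pool:                   576 - 512 = 64 buffers
    after receiving 64 and refilling:   0 free buffers
    after sending those 64:             64 buffers parked in pktio_dst's TX ring,
                                        waiting for a completion that never runs
    next odp_pktio_recv():              can't refill the RX descriptors, returns 0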
The size of the pool should somehow be correlated with the size of the RX and TX rings for "best performance" (whatever this means). But I also think that the system should function regardless of RX/TX ring sizes and pool size, "function" meaning not deadlock.
Yes, I wrote about that in my reply to Petri: coming up with such a number could be quite hard. And not just because at the moment we don't expose the ring descriptor counts.
There are several ways to fix this:
- tell the application writer that if you see deadlocks, increase the number of elements in the buffer pool. I doubt anyone would ever use ODP for anything serious after seeing such a thing.
- you can't really give anything more specific than the previous point, because details such as RX/TX descriptor counts are abstracted away, intentionally. And the platform can't autotune them, because it doesn't know how many elements the pool used for TX has. In fact, it could be more than just one pool.
- make sure you call odp_pktio_send() even if pkts == 0. In the case of ODP-DPDK that helps, because it actually triggers TX completion. Actually, we could make odp_pktio_send_complete() == odp_pktio_send(len=0), so we don't have to introduce a new function. But that doesn't change the fact that we have to call TX completion periodically to make sure nothing gets blocked.
So why doesn't the ODP-for-DPDK implementation call TX completion "periodically" or at some other suitable times?
From where?
- or we can just do what I proposed in the patch, which is very similar to the previous point but articulates the importance of TX completion more clearly.
Which is a platform-specific problem and exactly the kind of thing that the ODP API should hide and not expose.
Disabling the TX completion interrupt in order to achieve better performance in polling applications is not very platform specific. I think we should be able to cope with that.
-- Ola
On 28 May 2015 at 16:38, Zoltan Kiss <[email protected]> wrote:
A pktio interface can be used with poll mode drivers, where TX completion often has to be done manually. This turned up as a problem with ODP-DPDK and odp_l2fwd:
    while (!exit_threads) {
            pkts = odp_pktio_recv(pktio_src, ...);
            if (pkts <= 0)
                    continue;
            ...
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
            ...
    }
In this example we never call odp_pktio_send() on pktio_dst if no new packets were received on pktio_src. DPDK needs manual TX completion. The above example should have an odp_pktio_send_complete(pktio_dst) call right at the beginning of the loop.
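I.e. roughly like this (a sketch of the fixed loop, using the odp_pktio_send_complete() call added by this patch; a zero-length odp_pktio_send() would work the same way):

    while (!exit_threads) {
            odp_pktio_send_complete(pktio_dst); /* release already-sent packets */
            pkts = odp_pktio_recv(pktio_src, ...);
            if (pkts <= 0)
                    continue;
            ...
            if (pkts_ok > 0)
                    odp_pktio_send(pktio_dst, pkt_tbl, pkts_ok);
            ...
    }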
Signed-off-by: Zoltan Kiss <[email protected]>
---
include/odp/api/packet_io.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/odp/api/packet_io.h b/include/odp/api/packet_io.h
index b97b2b8..3a4054c 100644
--- a/include/odp/api/packet_io.h
+++ b/include/odp/api/packet_io.h
@@ -119,6 +119,22 @@ int odp_pktio_recv(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
 int odp_pktio_send(odp_pktio_t pktio, odp_packet_t pkt_table[], int len);
 
 /**
+ * Release sent packets
+ *
+ * This function should be called after sending on a pktio. If the platform
+ * doesn't implement send completion in other ways, this function should call
+ * odp_packet_free() on packets where transmission is already completed. It can
+ * be a no-op if the platform guarantees that the packets will be released upon
+ * completion, but the application must call it periodically after send to make
+ * sure packets are released.
+ *
+ * @param pktio ODP packet IO handle
+ *
+ * @retval <0 on failure
+ */
+int odp_pktio_send_complete(odp_pktio_t pktio);
+
+/**
  * Set the default input queue to be associated with a pktio handle
  *
  * @param pktio ODP packet IO handle
--
1.9.1
_______________________________________________
lng-odp mailing list
[email protected]
https://lists.linaro.org/mailman/listinfo/lng-odp