On Mon, Nov 28, 2016 at 12:03:06PM +0100, Thomas Monjalon wrote:
> We need attention of every PMD developers on this thread.
I've been following this thread from the beginning while working on rte_flow
and wanted to see where it was headed before replying. (I know, v11 was
submitted about 1 month ago but still.)
> Reminder of what Konstantin suggested:
> - if the PMD supports TX offloads AND
> - if to be able use any of these offloads the upper layer SW would have to:
> * modify the contents of the packet OR
> * obey HW specific restrictions
> then it is a PMD developer's responsibility to provide tx_prep() that would
> perform the expected modifications of the packet contents and restriction
> checks.
> Otherwise, tx_prep() implementation is not required and can be safely set to
> NULL.
> I also copy/paste my previous conclusion:
> Before txprep, there is only one API: the application must prepare the
> packets checksum itself (get_psd_sum in testpmd).
> With txprep, the application has 2 choices: keep doing the job itself
> or call txprep, which calls a PMD-specific function.
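To make the "doing the job itself" option concrete, here is a minimal,
self-contained sketch of the IPv4 pseudo-header checksum an application has
to seed into the L4 header by hand today; it is what testpmd's get_psd_sum()
and DPDK's rte_ipv4_phdr_cksum() boil down to. The function name and
host-byte-order inputs are illustrative simplifications, not the DPDK API;
note also that for TSO, DPDK omits the L4 length from the sum since the
hardware rewrites it per segment:

```c
#include <stdint.h>

/* Illustrative sketch (not DPDK code): one's-complement sum over the
 * IPv4 pseudo-header (src addr, dst addr, protocol, L4 length), which
 * TSO/checksum-offload NICs typically expect pre-seeded in the L4
 * checksum field. */
static uint16_t
psd_cksum_ipv4(uint32_t src_addr, uint32_t dst_addr,
	       uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum += (src_addr >> 16) + (src_addr & 0xffff);
	sum += (dst_addr >> 16) + (dst_addr & 0xffff);
	sum += proto;
	sum += l4_len;
	/* Fold carries back into 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

The point of the debate is precisely who calls this: the application, a
generic tx_prep(), or the TX burst function itself.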
Something is definitely needed here, and only PMDs can provide it. I think
applications should not have to clear checksum fields or initialize them to
some magic value, same goes for any other offload or hardware limitation
that needs to be worked around.
tx_prep() is one possible answer to this issue, however as mentioned in the
original patch it can be very expensive if exposed by the PMD.
Another issue I'm more concerned about is the way limitations are managed
(struct rte_eth_desc_lim). While not officially tied to tx_prep(), this
structure contains new fields that are only relevant to a few devices, and I
fear it will keep growing with each new hardware quirk to manage, breaking
ABIs in the process.
What are applications supposed to do, check each of them regardless before
attempting to send a burst?
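For illustration, checking "each of them" by hand would look roughly like
the toy sketch below. The struct and field layout are simplified stand-ins
(only nb_mtu_seg_max and nb_seg_max mirror real struct rte_eth_desc_lim
fields), and the per-packet loop is exactly the redundant work being
questioned:

```c
#include <stdint.h>

/* Toy stand-ins for struct rte_eth_desc_lim and struct rte_mbuf. */
struct desc_lim { uint16_t nb_seg_max; uint16_t nb_mtu_seg_max; };
struct pkt { uint16_t nb_segs; int tso; };

/* Application-side validation against device descriptor limits:
 * every packet of every burst must be checked before tx_burst(). */
static int
burst_fits_limits(const struct pkt *pkts, uint16_t nb,
		  const struct desc_lim *lim)
{
	uint16_t i;

	for (i = 0; i < nb; i++) {
		/* Non-TSO packets are bound by nb_mtu_seg_max,
		 * TSO packets by nb_seg_max. */
		uint16_t max = pkts[i].tso ?
			lim->nb_seg_max : lim->nb_mtu_seg_max;

		if (pkts[i].nb_segs > max)
			return 0; /* HW would reject or mangle it */
	}
	return 1;
}
```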
I understand tx_prep() automates this process, however I'm wondering why the
TX burst function isn't doing that itself. Using nb_mtu_seg_max as an
example, tx_prep() has an extra check in case of TSO that the TX burst
function does not perform. This ends up being much more expensive to
applications due to the additional loop doing redundant testing on each
mbuf.
If, say as a performance improvement, we decided to leave the validation
part to the TX burst function; what remains in tx_prep() is basically heavy
"preparation" requiring mbuf changes (i.e. erasing checksums, for now).
Following the same logic, why can't such a thing be made part of the TX
burst function as well (through a direct call to rte_phdr_cksum_fix()
whenever necessary)? From an application standpoint, what are the advantages
of having to:

 if (tx_prep()) // iterate and update mbufs as needed
     tx_burst(); // iterate and send

Compared to:

 tx_burst(); // iterate, update as needed and send
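The second call pattern could be sketched as below. This is a toy model,
not real PMD code: the mbuf struct, flag value and helper are hypothetical
stand-ins, with phdr_cksum_fix() playing the role of rte_phdr_cksum_fix().
The idea is simply that the single per-packet loop the burst function
already runs can absorb the "preparation" step when a flag asks for it:

```c
#include <stdint.h>

#define TX_TCP_SEG (1u << 0)	/* stand-in for the PKT_TX_TCP_SEG flag */

struct mbuf {			/* toy stand-in for struct rte_mbuf */
	uint32_t ol_flags;
	int cksum_fixed;	/* records that the phdr cksum was patched */
};

/* stand-in for rte_phdr_cksum_fix() */
static void phdr_cksum_fix(struct mbuf *m) { m->cksum_fixed = 1; }

/* TX burst that prepares mbufs inline: one loop iterates, patches the
 * pseudo-header checksum where TSO requires it, and "sends". */
static uint16_t
tx_burst_with_prep(struct mbuf **pkts, uint16_t nb)
{
	uint16_t i;

	for (i = 0; i < nb; i++) {
		if (pkts[i]->ol_flags & TX_TCP_SEG)
			phdr_cksum_fix(pkts[i]);
		/* ...fill and enqueue TX descriptor for pkts[i]... */
	}
	return nb;
}
```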
Note that PMDs could still provide different TX callbacks depending on the
set of enabled offloads so performance is not unnecessarily impacted.
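Callback selection of this kind is already common practice in PMDs at
configure time; a hedged sketch (names and the offload flag are invented
for illustration) of how the prepare-capable path would only be installed
when the relevant offload is enabled:

```c
#include <stdint.h>

typedef uint16_t (*tx_burst_t)(void **pkts, uint16_t nb);

#define OFFLOAD_TSO (1u << 0)	/* hypothetical offload flag */

/* Fast path: no per-packet flag checks at all. */
static uint16_t tx_burst_simple(void **p, uint16_t n) { (void)p; return n; }
/* TSO path: patches pseudo-header checksums inline as needed. */
static uint16_t tx_burst_tso(void **p, uint16_t n) { (void)p; return n; }

/* Chosen once at dev_configure()/tx_queue_setup() time, so applications
 * that never enable TSO never pay for the extra checks. */
static tx_burst_t
select_tx_burst(uint32_t txq_offloads)
{
	if (txq_offloads & OFFLOAD_TSO)
		return tx_burst_tso;
	return tx_burst_simple;
}
```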
In my opinion the second approach is both faster to applications and more
friendly from a usability perspective, am I missing something obvious?
> The question is: do non-Intel drivers need a checksum preparation for TSO?
> Will it behave well if txprep does nothing in these drivers?
> When looking at the code, most of drivers handle the TSO flags.
> But it is hard to know whether they rely on the pseudo checksum or not.
> git grep -l 'PKT_TX_UDP_CKSUM\|PKT_TX_TCP_CKSUM\|PKT_TX_TCP_SEG' drivers/net/
> Please, we need a comment for each driver saying either
> "it is OK, we do not need any checksum preparation for TSO"
> or
> "yes we have to implement tx_prepare or TSO will not work in this mode"
For both mlx4 and mlx5 then,
"it is OK, we do not need any checksum preparation for TSO".
Actually I do not think we'll ever need tx_prep() unless we add our own
quirks to struct rte_eth_desc_lim (and friends) which are currently quietly
handled by TX burst functions.