> From: Bruce Richardson [mailto:bruce.richard...@intel.com] > Sent: Wednesday, October 14, 2020 1:39 PM > > On Wed, Oct 14, 2020 at 01:14:13PM +0200, Morten Brørup wrote: > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > > Sent: Wednesday, October 14, 2020 11:30 AM > > > > > > On Wed, Oct 14, 2020 at 10:53:24AM +0200, Thomas Monjalon wrote: > > > > 14/10/2020 10:26, Morten Brørup: > > > > > From: Ferruh Yigit > > > > > > On 9/14/2020 1:42 PM, Morten Brørup wrote: > > > > > > > From: Bruce Richardson > > > > > > >> On Mon, Sep 14, 2020 at 01:05:11PM +0200, Morten Brørup wrote: > > > > > > >>> Updated description of rte_eth_rx_burst() to reflect what > > > drivers, > > > > > > >>> when using vector instructions, expect from nb_pkts. > > > > > > >>> > > > > > > >>> Also discussed on the mailing list here: > > > > > > >>> > > > > > > >> > > > http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35C61257@smarts > > > > > > >> erver.smartshare.dk/ > > > > > > >>> > > > > > > >>> --- a/lib/librte_ethdev/rte_ethdev.h > > > > > > >>> +++ b/lib/librte_ethdev/rte_ethdev.h > > > > > > >>> @@ -4469,6 +4469,10 @@ int > > > > > > >> rte_eth_dev_hairpin_capability_get(uint16_t port_id, > > > > > > >>> * burst-oriented optimizations in both synchronous and > > > asynchronous > > > > > > >>> * packet processing environments with no overhead in both > > > cases. > > > > > > >>> * > > > > > > >>> + * @note > > > > > > >>> + * Some drivers using vector instructions require that > > > *nb_pkts* > > > > > > >> is > > > > > > >>> + * divisible by 4 or 8, depending on the driver > > > implementation. > > > > > > >>> + * > > > > > > >> > > > > > > >> Not technically true, in that the drivers will round the value > > > down to > > > > > > >> the > > > > > > >> nearest multiple of 4 or 8. So how about rewording as: > > > > > > >> > > > > > > >> "Some drivers using vector instructions may round the > *nb_pkts* > > > driver > > > > > > >> to > > > > > > >> a multiple of 4 or 8 depending upon the driver > implementation." > > > > > > >> > > > > > > > > > > > > > > You are correct about the driver behavior. > > > > > > > > > > > > > > However, if you pass nb_pkts=9, the driver will return 8 > packets, > > > > > > > and thus it does not conform to the API behavior of returning > > > nb_pkts > > > > > > > if they are there. > > > > > > > > > > > > > > This is why the description in this patch differs from the > > > description we > > > > > > reached in the RFC discussion. > > > > > > > > > > > > > > > > > > > Hi Morten, Bruce, > > > > > > > > > > > > +1 to document the this behavior. > > > > > > > > > > > > But in the patch the wording is more strict: > > > > > > "... require that *nb_pkts* is divisible by 4 or 8 ..." > > > > > > "... The value must be divisible by 8 in order to work with any > > > driver." > > > > > > > > > > > > I am not sure the requirement is that strict. Application still > > > provide any > > > > > > value for 'nb_pkts', so the value doesn't "have to" be divisible > 8/4. > > > > > > > > > > > > But for vector PMD case it will return number of packets round > down > > > to 8/4. > > > > > > Perhaps can add for vector PMD it must be at least 4/8? > > > > > > > > > > > > Bruce's explanation sound more accurate to me, what do you think? > > > > > > > > > > > > > > > > I aim to keep the explanation in the documentation relatively > simple. > > > Keep the parameter description short, and add the details about vector > > > driver behavior as a note to the function. > > > > > > > > > > The reason for all this is the existing documentation describing > how to > > > use the rte_eth_rx_burst() function at high level: > > > > > > > > > > The rte_eth_rx_burst() function returns the number of packets > actually > > > retrieved [...]. A return value equal to nb_pkts indicates [...] that > other > > > received packets remain in the input queue. Applications implementing a > > > "retrieve as much received packets as possible" policy can check this > > > specific case and keep invoking the rte_eth_rx_burst() function until a > > > value less than nb_pkts is returned. > > > > > > > > > > As an alternative to my proposed solution, we could add that vector > > > drivers round down to 4 or 8, and the application's comparison of the > > > nb_pkts and return value must consider this. But that would make the > above > > > description strangely complex, rather than just requiring that nb_pkts > for > > > vector drivers must be divisible by 4 or 8. > > > > > > > > > > And as a minor detail, keeping my proposed restriction would also > > > eliminate the vector drivers' need to round down. > > > > > > > > > > I don't see a need to be able to call rte_eth_rx_burst() with a > value > > > not divisible by 4 or 8 for a vector driver, so my proposed restriction > is > > > a tradeoff favoring simplicity over unnecessary flexibility. > > > > > > > > It makes sense to me. > > > > > > > > > > That sounds reasonable for what we have now. We just need to > standardize on > > > either 4 or 8 as the required factor of the input size. I would suggest > > > having it as 4, and look to put in fallback paths for the few drivers > which > > > don't support less than 8. I think that 8 is too large a min burst size > to > > > support, for any apps that want small bursts for lower latency. > > > > > > > 8 works with all drivers today, so I prefer 8. Also, it seems more future > proof for even higher bandwidths. If you need low latency, don't use vector > drivers. Also, DPDK generally seems to prefer throughput over latency, so 8 > seems like the better choice to me. > > > > Bruce, if you have a very strong preference for standardizing on 4 and > you can convince the affected driver developers to modify their drivers to > support 4, we can go ahead with that. > > > Let's go with 8 for now, and we can reduce in future if we can.
Great. If y'all agree now, can I get some Acks, then? -Morten