On 06/02/2024 20:33, Ferruh Yigit wrote: > On 2/1/2024 10:08 AM, Kevin Traynor wrote: >> On 01/02/2024 08:43, David Marchand wrote: >>> As described in a recent bugzilla opened against the net/iavf driver, >>> a driver may call a event callback from other calls of the ethdev API. >>> >>> Nothing guarantees in the ethdev API against such behavior. >>> >>> Add a notice against using locks in those callbacks. >>> >>> Bugzilla ID: 1337 >>> >>> Signed-off-by: David Marchand <david.march...@redhat.com> >>> --- >>> lib/ethdev/rte_ethdev.h | 14 +++++++++++++- >>> 1 file changed, 13 insertions(+), 1 deletion(-) >>> >>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h >>> index 21e3a21903..5c6b104fb4 100644 >>> --- a/lib/ethdev/rte_ethdev.h >>> +++ b/lib/ethdev/rte_ethdev.h >>> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type { >>> RTE_ETH_EVENT_MAX /**< max value of this enum */ >>> }; >>> >>> -/** User application callback to be registered for interrupts. */ >>> +/** >>> + * User application callback to be registered for interrupts. >>> + * >>> + * Note: there is no guarantee in the DPDK drivers that a callback won't be >>> + * called in the middle of other parts of the ethdev API. For >>> example, >>> + * imagine that thread A calls rte_eth_dev_start() and as part of >>> this >>> + * call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the >>> + * associated callback is ran on thread A. In that example, if the >>> + * application protects its internal data using locks before calling >>> + * rte_eth_dev_start(), and the callback takes a same lock, a >>> deadlock >>> + * occurs. Because of this, it is highly recommended NOT to take >>> locks in >>> + * those callbacks. >>> + */ >> >> That is a good practical recommendation for an application developer. >> >> I wonder if it should taken further so that the API formally states the >> callback MUST be non-blocking? >> > > Application still can manage the locks in a safe way, but needs to be > aware of above condition and possible deadlock. >
Just to explain a bit more, if you look at the original issue in the Bugzilla [0], I think there was an assumption that rte_eth_dev_configure() would not block or deadlock with the eal-intr-thread. So then it was assumed that waiting for the lock in the callback was ok, because rte_eth_dev_configure() would return and callback would obtain the lock. So i'm showing that in this example the lack of a guarantee or clarity or bad assumption about the behavior of rte_eth_dev_configure() made it difficult for an app developer to know if their locks were safe or not. That's why I was thinking about something more formal. > I think above note is sufficient instead of forbidding locks in > callbacks completely. > In the end the difference between "highly recommended NOT to" and "must not" is not much and either way is probably enough to scare someone enough to avoid them. [0] https://bugs.dpdk.org/show_bug.cgi?id=1337#c0