> From: Eads, Gage
> Sent: Monday, March 26, 2018 10:59 PM
> To: Van Haaren, Harry <harry.van.haa...@intel.com>; dev@dpdk.org
> Cc: jerin.ja...@caviumnetworks.com; hemant.agra...@nxp.com; Richardson, Bruce
> <bruce.richard...@intel.com>; santosh.shu...@caviumnetworks.com;
> nipun.gu...@nxp.com
> Subject: RE: [PATCH v4 1/2] eventdev: add device stop flush callback
>
> > -----Original Message-----
> > From: Van Haaren, Harry
> > Sent: Friday, March 23, 2018 11:57 AM
> > To: Eads, Gage <gage.e...@intel.com>; dev@dpdk.org
> > Cc: jerin.ja...@caviumnetworks.com; hemant.agra...@nxp.com; Richardson,
> > Bruce <bruce.richard...@intel.com>; santosh.shu...@caviumnetworks.com;
> > nipun.gu...@nxp.com
> > Subject: RE: [PATCH v4 1/2] eventdev: add device stop flush callback
> >
> > > From: Eads, Gage
> > > Sent: Tuesday, March 20, 2018 2:13 PM
> > > To: dev@dpdk.org
> > > Cc: jerin.ja...@caviumnetworks.com; Van Haaren, Harry
> > > <harry.van.haa...@intel.com>; hemant.agra...@nxp.com; Richardson,
> > > Bruce <bruce.richard...@intel.com>; santosh.shu...@caviumnetworks.com;
> > > nipun.gu...@nxp.com
> > > Subject: [PATCH v4 1/2] eventdev: add device stop flush callback
> > >
> > > When an event device is stopped, it drains all event queues. These
> > > events may contain pointers, so to prevent memory leaks eventdev now
> > > supports a user-provided flush callback that is called during the queue
> > > drain process.
> > > This callback is stored in process memory, so the callback must be
> > > registered by any process that may call rte_event_dev_stop().
> > >
> > > This commit also clarifies the behavior of rte_event_dev_stop().
> > >
> > > This follows this mailing list discussion:
> > > http://dpdk.org/ml/archives/dev/2018-January/087484.html
> > >
> > > Signed-off-by: Gage Eads <gage.e...@intel.com>
> >
> > <snip most of the code - looks good!>
> >
> > >  /**
> > > - * Stop an event device. The device can be restarted with a call to
> > > - * rte_event_dev_start()
> > > + * Stop an event device.
> > > + *
> > > + * This function causes all queued events to be drained. While draining events
> > > + * out of the device, this function calls the user-provided flush callback
> > > + * (if one was registered) once per event.
> > > + *
> > > + * This function does not drain events from event ports; the application is
> > > + * responsible for flushing events from all ports before stopping the device.
> >
> > Question about how an application is expected to correctly clean up all the
> > events here. Note in particular the last part: "application is responsible for
> > flushing events from all ports **BEFORE** stopping the device".
> >
> > Given the event device is still running, how can the application be sure it has
> > flushed all the events (from the dequeue side in particular)?
>
> Appreciate the feedback -- good points all around.
>
> I was expecting that the application would unlink queues from the ports, and
> then dequeue until each port has no events. However, there are PMDs for which
> runtime port link/unlink is not supported, so I see that this is not a viable
> approach. Plus, this adds the application burden that you describe below.

+1.
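
[For reference, registering the proposed flush callback would look roughly like
the sketch below. The register-function name is my assumption of how this patch
series exposes it, and freeing an mbuf is only an illustrative cleanup action;
what an event's pointer refers to is application-defined.]

    #include <rte_common.h>
    #include <rte_eventdev.h>
    #include <rte_mbuf.h>

    /* Example flush callback: free the mbuf carried by each drained event. */
    static void
    flush_cb(uint8_t dev_id, struct rte_event event, void *arg)
    {
            RTE_SET_USED(dev_id);
            RTE_SET_USED(arg);
            rte_pktmbuf_free(event.mbuf);
    }

    /* The callback pointer lives in process memory, so every process that may
     * call rte_event_dev_stop() must register it. */
    rte_event_dev_stop_flush_callback_register(dev_id, flush_cb, NULL);

    /* ... run the datapath ... */

    rte_event_dev_stop(dev_id);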
> > In order to drain all events from the ports, I was expecting the following:
> >
> >     // stop scheduling new events to worker cores
> >     rte_event_dev_stop()
> >     ---> callback gets called for each event
> >
> >     // to dequeue events from each port, and app cleans them up?
> >     FOR_EACH_PORT( rte_event_dev_dequeue(..., port_id, ...) )
> >
> > I'd like to avoid the dequeue-each-port() approach in application, as it adds
> > extra burden to clean up correctly...
>
> Agreed, but for a different reason: that approach means we'd have to change
> the documented eventdev behavior. rte_eventdev.h states that the "schedule,
> enqueue and dequeue functions should not be invoked when the device is
> stopped," and this patch reiterates that in the rte_event_dev_stop()
> documentation ("Threads that continue to enqueue/dequeue while the device is
> stopped, or being stopped, will result in undefined behavior"). Since a PMD's
> stop cleanup code could just be repeated calls to a PMD's dequeue code,
> allowing applications to dequeue simultaneously could be troublesome.

All +1 too, good point about the header stating it is undefined behavior.

> > What if we say that dequeue() returns zero after stop() (leaving events
> > possibly in the port-dequeue side SW buffers), and these events which were
> > about to be dequeued by the worker core are also passed to the
> > dev_stop_flush callback?
>
> I'd prefer to have dequeue-while-stopped be unsupported, so we don't need an
> additional check or synchronization in the datapath, but passing the events in
> a port to the callback should work (for the sw PMD, at least). How does that
> sound?

That's fine with me, both from design point of view, and SW PMD.

@HW PMD maintainers, would the above approach work for you?
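
[To make the semantics discussed above concrete, here is a rough shutdown
sketch; "quiesce" and handle_event() are application-defined placeholders, not
part of the patch:]

    /* Worker lcores quiesce first, since enqueue/dequeue on a stopped (or
     * stopping) device is undefined behavior. */
    while (!quiesce) {
            struct rte_event ev;

            if (rte_event_dequeue_burst(dev_id, port_id, &ev, 1, 0))
                    handle_event(&ev);
    }

    /* Main lcore, once all workers have quiesced: events still sitting in the
     * device's queues -- and, per the proposal above, in the ports' SW
     * buffers -- are handed to the registered dev_stop_flush callback. */
    rte_event_dev_stop(dev_id);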