[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 01:56:27PM +, Bruce Richardson wrote: > On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote: > > On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote: > > > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote: > > > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote: > > > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > > > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > > > > How about making default as "mixed" and let application > > > > > > > configures what > > > > > > > is not required?. That way application responsibility is clear. > > > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, > > > > > > > ETH_TXQ_FLAGS_NOREFCOUNT > > > > > > > with default. > > > > > > > > > > > > > I suppose it could work, but why bother doing that? If an app knows > > > > > > it's > > > > > > only going to use one traffic type, why not let it just state what > > > > > > it > > > > > > will do rather than try to specify what it won't do. If mixed is > > > > > > needed, > > > > > > > > > > My thought was more inline with ethdev spec, like, ref-count is > > > > > default, > > > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But > > > > > it is OK, if > > > > > you need other way. > > > > > > > > > > > then it's easy enough to specify - and we can make it the > > > > > > zero/default > > > > > > value too. > > > > > > > > > > OK. Then we will make MIX as zero/default and add > > > > > "allowed_event_types" in > > > > > event queue config. > > > > > > > > > > > > > Bruce, > > > > > > > > I have tried to make it as "allowed_event_types" in event queue config. > > > > However, rte_event_queue_default_conf_get() can also take NULL for > > > > default > > > > configuration. 
So I think, It makes sense to go with negation approach > > > > like ethdev to define the default to avoid confusion on the default. So > > > > I am thinking like below now, > > > > > > > > ? [master][libeventdev] $ git diff > > > > diff --git a/rte_eventdev.h b/rte_eventdev.h > > > > index cf22b0e..cac4642 100644 > > > > --- a/rte_eventdev.h > > > > +++ b/rte_eventdev.h > > > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct > > > > rte_event_dev_config *config); > > > > * > > > > * \see rte_event_port_setup(), rte_event_port_link() > > > > */ > > > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE (1ULL << 1) > > > > +/**< Skip configuring atomic schedule type resources */ > > > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2) > > > > +/**< Skip configuring ordered schedule type resources */ > > > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3) > > > > +/**< Skip configuring parallel schedule type resources */ > > > > > > > > /** Event queue configuration structure */ > > > > struct rte_event_queue_conf { > > > > > > > > Thoughts? > > > > > > > > > > I'm ok with the default as being all types, in the case where NULL is > > > specified for the parameter. It does make the most sense. > > > > Yes. That case I need to explicitly mention in the documentation about what > > is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite > > understood what is default. Not adding up? :-) > > > > Would below not work? DEFAULT explicitly stated, and can be commented to > say all types allowed. All I was trying to avoid explicitly stating the default state. 
Not worth going back and forth on slow-path configuration; I will keep it as positive logic as you suggested :-), inspired by PKT_TX_L4_MASK:

#define RTE_EVENT_QUEUE_CFG_TYPE_MASK       (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES       (0ULL << 0) /**< Enable all types */
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY     (1ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY    (2ULL << 0)
#define RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY   (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER (1ULL << 2)

>
> #define RTE_EVENT_QUEUE_CFG_DEFAULT 0
> #define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT
> #define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0)
> #define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1)
>
> /Bruce
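As a quick sanity check of the mask-based scheme above, a minimal compilable sketch; the flag values follow the proposal, but the helper function is made up for illustration and is not part of any API:

```c
#include <stdint.h>

/* Flag values as proposed above: the low two bits select the schedule
 * type, with 0 meaning "all types" as the default. */
#define RTE_EVENT_QUEUE_CFG_TYPE_MASK       (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES       (0ULL << 0) /**< Enable all types */
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY     (1ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY    (2ULL << 0)
#define RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY   (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER (1ULL << 2)

/* Hypothetical helper (illustration only): nonzero when the queue accepts
 * all schedule types, i.e. the type field is left at its zero default. */
static int queue_allows_all_types(uint64_t cfg_flags)
{
    return (cfg_flags & RTE_EVENT_QUEUE_CFG_TYPE_MASK) ==
           RTE_EVENT_QUEUE_CFG_ALL_TYPES;
}
```

Because the type field is a two-bit value rather than independent bits, a zeroed config naturally means "all types", and SINGLE_CONSUMER can be OR-ed in without disturbing the type selection.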
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote: > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote: > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote: > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > > How about making default as "mixed" and let application configures > > > > > what > > > > > is not required?. That way application responsibility is clear. > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, > > > > > ETH_TXQ_FLAGS_NOREFCOUNT > > > > > with default. > > > > > > > > > I suppose it could work, but why bother doing that? If an app knows it's > > > > only going to use one traffic type, why not let it just state what it > > > > will do rather than try to specify what it won't do. If mixed is needed, > > > > > > My thought was more inline with ethdev spec, like, ref-count is default, > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it > > > is OK, if > > > you need other way. > > > > > > > then it's easy enough to specify - and we can make it the zero/default > > > > value too. > > > > > > OK. Then we will make MIX as zero/default and add "allowed_event_types" in > > > event queue config. > > > > > > > Bruce, > > > > I have tried to make it as "allowed_event_types" in event queue config. > > However, rte_event_queue_default_conf_get() can also take NULL for default > > configuration. So I think, It makes sense to go with negation approach > > like ethdev to define the default to avoid confusion on the default. So > > I am thinking like below now, > > > > ? 
[master][libeventdev] $ git diff > > diff --git a/rte_eventdev.h b/rte_eventdev.h > > index cf22b0e..cac4642 100644 > > --- a/rte_eventdev.h > > +++ b/rte_eventdev.h > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct > > rte_event_dev_config *config); > > * > > * \see rte_event_port_setup(), rte_event_port_link() > > */ > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE (1ULL << 1) > > +/**< Skip configuring atomic schedule type resources */ > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2) > > +/**< Skip configuring ordered schedule type resources */ > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3) > > +/**< Skip configuring parallel schedule type resources */ > > > > /** Event queue configuration structure */ > > struct rte_event_queue_conf { > > > > Thoughts? > > > > I'm ok with the default as being all types, in the case where NULL is > specified for the parameter. It does make the most sense. Yes. That case I need to explicitly mention in the documentation about what is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite understood what is default. Not adding up? :-) > > However, for the cases where the user does specify what they want, I > think it does make more sense, and is easier on the user for things to > be specified in a positive, rather than negative sense. For a user who > wants to just use atomic events, having to specify that as "not-reordered > and not-unordered" just isn't as clear! :-) > > /Bruce >
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 11:48:37AM +, Bruce Richardson wrote: > On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote: > > On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote: > > > > -Original Message- > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > > Sent: Tuesday, October 25, 2016 6:49 PM > > > > > > > > > > > Hi Community, > > > > > > > > So far, I have received constructive feedback from Intel, NXP and > > > > Linaro folks. > > > > Let me know, if anyone else interested in contributing to the > > > > definition of eventdev? > > > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > > work on > > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > > in next version). > > > > > > > > > Hi All, > > > > > > I've been looking at the eventdev API from a use-case point of view, and > > > I'm unclear on a how the API caters for two uses. I have simplified these > > > as much as possible, think of them as a theoretical unit-test for the API > > > :) > > > > > > > > > Fragmentation: > > > 1. Dequeue 8 packets > > > 2. Process 2 packets > > > 3. Processing 3rd, this packet needs fragmentation into two packets > > > 4. Process remaining 5 packets as normal > > > > > > What function calls does the application make to achieve this? > > > In particular, I'm referring to how can the scheduler know that the 3rd > > > packet is the one being fragmented, and how to keep packet order valid. > > > > > > > OK. I will try to share my views on IP fragmentation on event _HW_ > > models(at least on Cavium HW) then we can see, how we can converge. 
> >
> > First, The fragmentation specific logic should be decoupled from the event
> > model as it is specific to packet and the L3 layer (not specific to generic events)
>
> I would view fragmentation as just one example of a workload like this,
> multicast and broadcast may be two other cases. Yes, they all apply to
> packet, but the general feature support is just how to provide support
> for one event generating multiple further events which should be linked
> together for reordering. [I think this only really applies in the

AFAIK, there are two different schemes to "maintain ordering". The first is
based on "reorder buffers", i.e. a list data structure used to hold the
events first and then, when the order-correcting point comes
(ORDERED->ATOMIC), correct the order based on the previous "reorder
buffers". But some HW implementations use a "port"-state-based reordering
scheme (i.e. no external reorder buffer to keep track of the order).

So I think, to have a portable application workflow for the use case where
multiple events are generated from one event, the generated events need to
be stored in the parent event and, downstream, processed as required, like
the fragmentation example in
http://dpdk.org/ml/archives/dev/2016-November/049707.html

The above scheme should be OK in your implementation. Right?

> reordered case - which leads to another question: in your experience
> do you see other event types other than packet being handled in a
> "reordered" manner?]

We use both timer events and crypto completion events etc. in ORDERED
type. But not a one-event-creates-N-events scheme on those.

>
> /Bruce
>
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 11:45:07AM +, Bruce Richardson wrote: > On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote: > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > -Original Message- > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > > > > > > So far, I have received constructive feedback from Intel, NXP and > > > > Linaro folks. > > > > Let me know, if anyone else interested in contributing to the > > > > definition of eventdev? > > > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > > work on > > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > > in next version). > > > > > > > Hi All, > > > > Two queries, > > > > 1) In SW implementation, Is their any connection between "struct > > rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ? > > i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ? > > Thought of adding the common checks in common layer. > > I think this is probably best left to the driver layers to enforce. For > us, such a restriction doesn't really make sense, though in many cases > that would be the usual setup. For accurate load balancing, the dequeue > queue depth would be small, and the burst size would probably equal the > queue depth, meaning the enqueue depth needs to be at least as big. > However, for better throughput, or in cases where all traffic is being > coalesced to a single core e.g. for transmit out a network port, there > is no need to keep the dequeue queue shallow and so it can be many times > the burst size, while the enqueue queue can be kept to 1-2 times the > burst size. > OK > > > > 2)Any comments on follow item(section under ) that needs improvement. 
> > --- > > Abstract the differences in event QoS management with different > > priority schemes available in different HW or SW implementations with > > portable > > application workflow. > > > > Based on the feedback, there three different kinds of QoS support > > available in > > three different HW or SW implementations. > > 1) Priority associated with the event queue > > 2) Priority associated with each event enqueue > > (Same flow can have two different priority on two separate enqueue) > > 3) Priority associated with the flow(each flow has unique priority) > > > > In v2, The differences abstracted based on device capability > > (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme, > > RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme). > > This scheme would call for different application workflow for > > nontrivial QoS-enabled applications. > > --- > > After thinking a while, I think, RTE_EVENT_DEV_CAP_EVENT_QOS is a > > super-set.if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be > > implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e We may not need two > > flags, Just one flag RTE_EVENT_DEV_CAP_EVENT_QOS is enough to fix > > portability issue with basic QoS enabled applications. > > > > i.e Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as config option in device > > configure stage if application needs fine granularity on QoS per event > > enqueue.For trivial applications, configured > > rte_event_queue_conf->priority can be used as rte_event_enqueue(struct > > rte_event.priority) > > > So all implementations should support the concept of priority among > queues, and then there is optional support for event or flow based > prioritization. Is that a correct interpretation of what you propose? Yes. If you _can_ implement it and if possible in the system. > > /Bruce >
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote: > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > > > > -Original Message- > > > > rte_event_queue_conf, with possible values: > > > > * atomic > > > > * ordered > > > > * parallel > > > > * mixed - allowing all 3 types. I think allowing 2 of three types might > > > > make things too complicated. > > > > > > > > An open question would then be how to behave when the queue type and > > > > requested event type conflict. We can either throw an error, or just > > > > ignore the event type and always treat enqueued events as being of the > > > > queue type. I prefer the latter, because it's faster not having to > > > > error-check, and it pushes the responsibility on the app to know what > > > > it's doing. > > > > > > How about making default as "mixed" and let application configures what > > > is not required?. That way application responsibility is clear. > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT > > > with default. > > > > > I suppose it could work, but why bother doing that? If an app knows it's > > only going to use one traffic type, why not let it just state what it > > will do rather than try to specify what it won't do. If mixed is needed, > > My thought was more inline with ethdev spec, like, ref-count is default, > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is > OK, if > you need other way. > > > then it's easy enough to specify - and we can make it the zero/default > > value too. > > OK. Then we will make MIX as zero/default and add "allowed_event_types" in > event queue config. 
Bruce,

I have tried to make it "allowed_event_types" in the event queue config.
However, rte_event_queue_default_conf_get() can also take NULL for the
default configuration. So I think it makes sense to go with a negation
approach like ethdev to define the default, to avoid confusion about the
default. So I am thinking like below now:

[master][libeventdev] $ git diff
diff --git a/rte_eventdev.h b/rte_eventdev.h
index cf22b0e..cac4642 100644
--- a/rte_eventdev.h
+++ b/rte_eventdev.h
@@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct rte_event_dev_config *config);
  *
  * \see rte_event_port_setup(), rte_event_port_link()
  */
+#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE   (1ULL << 1)
+/**< Skip configuring atomic schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE  (1ULL << 2)
+/**< Skip configuring ordered schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
+/**< Skip configuring parallel schedule type resources */

 /** Event queue configuration structure */
 struct rte_event_queue_conf {

Thoughts?

/Jerin

> > Our software implementation for now, only supports one type per queue -
> > which we suspect should meet a lot of use-cases. We'll have to see about
> > adding in mixed types in future.
> >
> > /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> >
> > So far, I have received constructive feedback from Intel, NXP and Linaro folks.
> > Let me know, if anyone else interested in contributing to the definition of eventdev?
> >
> > If there are no major issues in proposed spec, then Cavium would like work on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
>

Hi All,

Two queries,

1) In the SW implementation, is there any connection between "struct
rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth?
i.e. it should be enqueue_queue_depth >= dequeue_queue_depth. Right?
Thought of adding the common checks in the common layer.

2) Any comments on the following item (section under ) that needs
improvement?

---
Abstract the differences in event QoS management with the different
priority schemes available in different HW or SW implementations, with a
portable application workflow.

Based on the feedback, there are three different kinds of QoS support
available in three different HW or SW implementations:
1) Priority associated with the event queue
2) Priority associated with each event enqueue
(the same flow can have two different priorities on two separate enqueues)
3) Priority associated with the flow (each flow has a unique priority)

In v2, the differences are abstracted based on device capability
(RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
This scheme would call for different application workflows for
nontrivial QoS-enabled applications.
---

After thinking a while, I think RTE_EVENT_DEV_CAP_EVENT_QOS is a
super-set; if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be
implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e. we may not need two
flags; just one flag, RTE_EVENT_DEV_CAP_EVENT_QOS, is enough to fix the
portability issue for basic QoS-enabled applications.

i.e. introduce RTE_EVENT_DEV_CAP_EVENT_QOS as a config option at the
device configure stage if the application needs fine granularity of QoS
per event enqueue. For trivial applications, the configured
rte_event_queue_conf->priority can be used as rte_event_enqueue(struct
rte_event.priority).

Thoughts?

/Jerin
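The fallback described above (per-enqueue priority when the device advertises EVENT_QOS, queue priority otherwise) can be sketched like this; the capability bit value and the helper are assumptions for illustration, not the spec:

```c
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define RTE_EVENT_DEV_CAP_EVENT_QOS (1U << 0)

/* Hypothetical helper: resolve the priority the scheduler should honour.
 * With EVENT_QOS, the per-enqueue rte_event.priority wins; otherwise the
 * configured rte_event_queue_conf->priority applies. */
static uint8_t effective_priority(uint32_t dev_caps,
                                  uint8_t event_prio, uint8_t queue_prio)
{
    return (dev_caps & RTE_EVENT_DEV_CAP_EVENT_QOS) ? event_prio : queue_prio;
}
```

This makes the super-set relationship concrete: a QUEUE_QOS-only device is just the EVENT_QOS model with the event priority pinned to the queue priority.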
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 03:16:18PM +0100, Bruce Richardson wrote: > On Fri, Oct 28, 2016 at 02:48:57PM +0100, Van Haaren, Harry wrote: > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > Sent: Tuesday, October 25, 2016 6:49 PM > > > > > > > > Hi Community, > > > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > > folks. > > > Let me know, if anyone else interested in contributing to the definition > > > of eventdev? > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > work on > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > in next version). > > > > > > Hi All, > > > > I've been looking at the eventdev API from a use-case point of view, and > > I'm unclear on a how the API caters for two uses. I have simplified these > > as much as possible, think of them as a theoretical unit-test for the API :) > > > > > > Fragmentation: > > 1. Dequeue 8 packets > > 2. Process 2 packets > > 3. Processing 3rd, this packet needs fragmentation into two packets > > 4. Process remaining 5 packets as normal > > > > What function calls does the application make to achieve this? > > In particular, I'm referring to how can the scheduler know that the 3rd > > packet is the one being fragmented, and how to keep packet order valid. > > > > > > Dropping packets: > > 1. Dequeue 8 packets > > 2. Process 2 packets > > 3. Processing 3rd, this packet needs to be dropped > > 4. Process remaining 5 packets as normal > > > > What function calls does the application make to achieve this? > > Again, in particular how does the scheduler know that the 3rd packet is > > being dropped. 
> >
> > Regards, -Harry
>
> Hi,
>
> these questions apply particularly to reordered which has a lot more
> complications than the other types in terms of sending packets back into
> the scheduler. However, atomic types will still suffer from problems
> with things the way they are - again if we assume a burst of 8 packets,
> then to forward those packets, we need to re-enqueue them again to the
> scheduler, and also then send 8 releases to the scheduler as well, to
> release the atomic locks for those packets.
> This means that for each packet we have to send two messages to a
> scheduler core, something that is really inefficient.
>
> This number of messages is critical for any software implementation, as
> the cost of moving items core-to-core is going to be a big bottleneck
> (perhaps the biggest bottleneck) in the system. It's for this reason we
> need to use burst APIs - as with rte_rings.

I agree. That is the reason why we have rte_event_*_burst().

> How we have solved this in our implementation, is to allow there to be
> an event operation type. The four operations we implemented are as below
> (using packet as a synonym for event here, since these would mostly
> apply to packets flowing through a system):
>
> * NEW - just a regular enqueue of a packet, without any previous context

Makes sense. I was trying to derive it. Makes sense for the application
to request it.

> * FORWARD - enqueue a packet, and mark the flow processing for the
>             equivalent packet that was dequeued as completed, i.e.
>             release any atomic locks, or reorder this packet with
>             respect to any other outstanding packets from the event queue.

Default case.

> * DROP - this is roughly equivalent to the existing "release" API call,
>          except that having it as an enqueue type allows us to
>          release multiple items in a single call, and also to mix
>          releases with new packets and forwarded packets

Yes. Maps to rte_event_release(); with the index parameter, it is kind of
doing the job. But it makes sense as a flag to enable burst, though that
calls for removing the index parameter. It looks like the index parameter
has an issue in the Intel implementation. If so, maybe we (Cavium) can
fill the index in the dequeue as implementation-specific bits, like Harry
suggested, and use it in the enqueue.
http://dpdk.org/ml/archives/dev/2016-October/049459.html

Any thoughts from NXP?

> * PARTIAL - this indicates that the packet being enqueued should be
>             treated according to the context of the current packet, but
>             that that context should not be released/completed by the
>             enqueue of this packet. This only really applies for
>             reordered events, and is needed to do fragmentation and or
>             multicast of packets with reordering.

I believe PARTIAL is something HW implementations will have trouble with.
I have outlined another way to fix this without coupling the fragmentation
logic into the scheduler.
http://dpdk.org/ml/archives/dev/2016-November/049707.html

If it makes sense for everyone then maybe we can
- Introduce "event operation type" bits (NEW, DROP, FORWARD(may not required as
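For reference, the four operation types discussed in this message could be modelled as an enum; the names, values, and the helper below are illustrative (mirroring Bruce's descriptions), not the eventual API:

```c
/* Illustrative encoding of the four enqueue operation types described
 * above; numeric values are arbitrary. */
enum event_op {
    EVENT_OP_NEW = 0,  /* fresh event, no previous context */
    EVENT_OP_FORWARD,  /* enqueue and complete the dequeued event's context */
    EVENT_OP_DROP,     /* release the context without enqueuing a new event */
    EVENT_OP_PARTIAL,  /* enqueue under the current context, keep it open */
};

/* Hypothetical helper: does this operation complete/release the
 * scheduling context (atomic lock or reorder slot) of the corresponding
 * dequeued event? */
static int op_releases_context(enum event_op op)
{
    return op == EVENT_OP_FORWARD || op == EVENT_OP_DROP;
}
```

Carrying the operation type inside each enqueued event is what lets a single burst enqueue mix new, forwarded, and released events without separate per-packet release calls.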
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote: > On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote: > > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote: > > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote: > > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > > > How about making default as "mixed" and let application configures > > > > > > what > > > > > > is not required?. That way application responsibility is clear. > > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, > > > > > > ETH_TXQ_FLAGS_NOREFCOUNT > > > > > > with default. > > > > > > > > > > > I suppose it could work, but why bother doing that? If an app knows > > > > > it's > > > > > only going to use one traffic type, why not let it just state what it > > > > > will do rather than try to specify what it won't do. If mixed is > > > > > needed, > > > > > > > > My thought was more inline with ethdev spec, like, ref-count is default, > > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it > > > > is OK, if > > > > you need other way. > > > > > > > > > then it's easy enough to specify - and we can make it the zero/default > > > > > value too. > > > > > > > > OK. Then we will make MIX as zero/default and add "allowed_event_types" > > > > in > > > > event queue config. > > > > > > > > > > Bruce, > > > > > > I have tried to make it as "allowed_event_types" in event queue config. > > > However, rte_event_queue_default_conf_get() can also take NULL for default > > > configuration. So I think, It makes sense to go with negation approach > > > like ethdev to define the default to avoid confusion on the default. So > > > I am thinking like below now, > > > > > > ? 
[master][libeventdev] $ git diff > > > diff --git a/rte_eventdev.h b/rte_eventdev.h > > > index cf22b0e..cac4642 100644 > > > --- a/rte_eventdev.h > > > +++ b/rte_eventdev.h > > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct > > > rte_event_dev_config *config); > > > * > > > * \see rte_event_port_setup(), rte_event_port_link() > > > */ > > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE (1ULL << 1) > > > +/**< Skip configuring atomic schedule type resources */ > > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2) > > > +/**< Skip configuring ordered schedule type resources */ > > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3) > > > +/**< Skip configuring parallel schedule type resources */ > > > > > > /** Event queue configuration structure */ > > > struct rte_event_queue_conf { > > > > > > Thoughts? > > > > > > > I'm ok with the default as being all types, in the case where NULL is > > specified for the parameter. It does make the most sense. > > Yes. That case I need to explicitly mention in the documentation about what > is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite > understood what is default. Not adding up? :-) > Would below not work? DEFAULT explicitly stated, and can be commented to say all types allowed. #define RTE_EVENT_QUEUE_CFG_DEFAULT 0 #define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT #define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0) #define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1) /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Tuesday, October 25, 2016 6:49 PM
> >
> > Hi Community,
> >
> > So far, I have received constructive feedback from Intel, NXP and Linaro folks.
> > Let me know, if anyone else interested in contributing to the definition of eventdev?
> >
> > If there are no major issues in proposed spec, then Cavium would like work on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
>
> Hi All,
>
> I've been looking at the eventdev API from a use-case point of view, and I'm
> unclear on how the API caters for two uses. I have simplified these as much
> as possible, think of them as a theoretical unit-test for the API :)
>
> Fragmentation:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs fragmentation into two packets
> 4. Process remaining 5 packets as normal
>
> What function calls does the application make to achieve this?
> In particular, I'm referring to how can the scheduler know that the 3rd
> packet is the one being fragmented, and how to keep packet order valid.

OK. I will try to share my views on IP fragmentation on event _HW_
models (at least on Cavium HW), then we can see how we can converge.

First, the fragmentation-specific logic should be decoupled from the event
model, as it is specific to packets and the L3 layer (not to generic
events).

Now, let us consider fragmentation handling with the non-burst case and a
single flow.
The following text outlines the event flow:

a) Set up an event device with a single event queue
b) Link multiple ports to the single event queue
c) The event producer enqueues packets p0..p7 to the event queue with
   ORDERED type (let's assume packet p2 needs to be fragmented, i.e. the
   application needs to create p2.0 and p2.1 from p2)
d) Since it is an ORDERED type, packets p0 to p7 are distributed to
   multiple ports in parallel (assigned to each lcore or lightweight
   thread)
e) Each lcore/lightweight thread gets a packet from its designated event
   port, processes it in parallel, and enqueues it back with ATOMIC type
   to maintain ordering
f) The lcore that dequeues the p2 packet understands it needs to be
   fragmented due to MTU size etc., so it calls
   rte_ipv4_fragment_packet() and stores the fragmented packets p2.0 and
   p2.1 in the private area of the p2 mbuf; as usual, like the other
   workers, it enqueues p2 to the atomic queue to maintain the order
g) On the atomic flow, when an lcore dequeues packets, they come in order
   p0..p7. The application sends p0 to p7 on the wire. When the
   application checks the p2 mbuf private area, it understands it is
   fragmented and then sends p2.0 and p2.1 on the wire.

OR skip the fragmentation step in (f) and, in step (g), while processing
p2, run rte_ipv4_fragment_packet() to split the packet and transmit the
pieces (in case the application doesn't want to deal with the mbuf
private area).

Now, when it comes to the BURST scheme: we are planning to create a SW
structure as a virtual event port and associate N
(N = rte_event_port_dequeue_depth()) physical HW event ports with the
virtual port. That way, it just comes as an extension to the non-burst
API, and on the release call we have an explicit "index" to identify the
physical event port associated with the virtual port.

/Jerin

> Dropping packets:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs to be dropped
> 4. Process remaining 5 packets as normal
>
> What function calls does the application make to achieve this?
> Again, in particular how does the scheduler know that the 3rd packet is being > dropped. rte_event_release(..,..,3)?? > > > Regards, -Harry
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote: > On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote: > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > Sent: Tuesday, October 25, 2016 6:49 PM > > > > > > > > Hi Community, > > > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > > folks. > > > Let me know, if anyone else interested in contributing to the definition > > > of eventdev? > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > work on > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > in next version). > > > > > > Hi All, > > > > I've been looking at the eventdev API from a use-case point of view, and > > I'm unclear on a how the API caters for two uses. I have simplified these > > as much as possible, think of them as a theoretical unit-test for the API :) > > > > > > Fragmentation: > > 1. Dequeue 8 packets > > 2. Process 2 packets > > 3. Processing 3rd, this packet needs fragmentation into two packets > > 4. Process remaining 5 packets as normal > > > > What function calls does the application make to achieve this? > > In particular, I'm referring to how can the scheduler know that the 3rd > > packet is the one being fragmented, and how to keep packet order valid. > > > > OK. I will try to share my views on IP fragmentation on event _HW_ > models(at least on Cavium HW) then we can see, how we can converge. > > First, The fragmentation specific logic should be decoupled from the event > model as it specific to packet and L3 layer(Not specific to generic event) > I would view fragmentation as just one example of a workload like this, multicast and broadcast may be two other cases. 
Yes, they all apply to packets, but the general requirement is how to support one event generating multiple further events which should be linked together for reordering. [I think this only really applies in the reordered case - which leads to another question: in your experience, do you see event types other than packets being handled in a "reordered" manner?] /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote: > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > > folks. > > > Let me know, if anyone else interested in contributing to the definition > > > of eventdev? > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > work on > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > in next version). > > > > Hi All, > > Two queries, > > 1) In SW implementation, Is their any connection between "struct > rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ? > i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ? > Thought of adding the common checks in common layer. I think this is probably best left to the driver layers to enforce. For us, such a restriction doesn't really make sense, though in many cases that would be the usual setup. For accurate load balancing, the dequeue queue depth would be small, and the burst size would probably equal the queue depth, meaning the enqueue depth needs to be at least as big. However, for better throughput, or in cases where all traffic is being coalesced to a single core e.g. for transmit out a network port, there is no need to keep the dequeue queue shallow and so it can be many times the burst size, while the enqueue queue can be kept to 1-2 times the burst size. > > 2)Any comments on follow item(section under ) that needs improvement. > --- > Abstract the differences in event QoS management with different > priority schemes available in different HW or SW implementations with portable > application workflow. 
> > Based on the feedback, there are three different kinds of QoS support > available in > three different HW or SW implementations. > 1) Priority associated with the event queue > 2) Priority associated with each event enqueue > (The same flow can have two different priorities on two separate enqueues) > 3) Priority associated with the flow (each flow has a unique priority) > > In v2, the differences are abstracted based on device capability > (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme, > RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes). > This scheme would call for different application workflows for > nontrivial QoS-enabled applications. > --- > After thinking a while, I think RTE_EVENT_DEV_CAP_EVENT_QOS is a > super-set. If so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be > implemented with RTE_EVENT_DEV_CAP_EVENT_QOS, i.e. we may not need two > flags; just one flag, RTE_EVENT_DEV_CAP_EVENT_QOS, is enough to fix the > portability issue for basic QoS-enabled applications. > > i.e. Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as a config option at the device > configure stage if the application needs fine granularity of QoS per event > enqueue. For trivial applications, the configured > rte_event_queue_conf->priority can be used as rte_event_enqueue(struct > rte_event.priority) >

So all implementations should support the concept of priority among queues, and then there is optional support for event- or flow-based prioritization. Is that a correct interpretation of what you propose?

/Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote: > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote: > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > > > > > -Original Message- > > > > > rte_event_queue_conf, with possible values: > > > > > * atomic > > > > > * ordered > > > > > * parallel > > > > > * mixed - allowing all 3 types. I think allowing 2 of three types > > > > > might > > > > > make things too complicated. > > > > > > > > > > An open question would then be how to behave when the queue type and > > > > > requested event type conflict. We can either throw an error, or just > > > > > ignore the event type and always treat enqueued events as being of the > > > > > queue type. I prefer the latter, because it's faster not having to > > > > > error-check, and it pushes the responsibility on the app to know what > > > > > it's doing. > > > > > > > > How about making default as "mixed" and let application configures what > > > > is not required?. That way application responsibility is clear. > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT > > > > with default. > > > > > > > I suppose it could work, but why bother doing that? If an app knows it's > > > only going to use one traffic type, why not let it just state what it > > > will do rather than try to specify what it won't do. If mixed is needed, > > > > My thought was more inline with ethdev spec, like, ref-count is default, > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is > > OK, if > > you need other way. > > > > > then it's easy enough to specify - and we can make it the zero/default > > > value too. > > > > OK. 
Then we will make MIX as zero/default and add "allowed_event_types" in > > event queue config. > > > > Bruce, > > I have tried to make it as "allowed_event_types" in event queue config. > However, rte_event_queue_default_conf_get() can also take NULL for default > configuration. So I think, It makes sense to go with negation approach > like ethdev to define the default to avoid confusion on the default. So > I am thinking like below now, > > ? [master][libeventdev] $ git diff > diff --git a/rte_eventdev.h b/rte_eventdev.h > index cf22b0e..cac4642 100644 > --- a/rte_eventdev.h > +++ b/rte_eventdev.h > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct > rte_event_dev_config *config); > * > * \see rte_event_port_setup(), rte_event_port_link() > */ > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE (1ULL << 1) > +/**< Skip configuring atomic schedule type resources */ > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2) > +/**< Skip configuring ordered schedule type resources */ > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3) > +/**< Skip configuring parallel schedule type resources */ > > /** Event queue configuration structure */ > struct rte_event_queue_conf { > > Thoughts? > I'm ok with the default as being all types, in the case where NULL is specified for the parameter. It does make the most sense. However, for the cases where the user does specify what they want, I think it does make more sense, and is easier on the user for things to be specified in a positive, rather than negative sense. For a user who wants to just use atomic events, having to specify that as "not-reordered and not-unordered" just isn't as clear! :-) /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 02:48:57PM +0100, Van Haaren, Harry wrote: > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > Sent: Tuesday, October 25, 2016 6:49 PM > > > > > Hi Community, > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > folks. > > Let me know, if anyone else interested in contributing to the definition of > > eventdev? > > > > If there are no major issues in proposed spec, then Cavium would like work > > on > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > an associated HW driver.(Requested minor changes of v2 will be addressed > > in next version). > > > Hi All, > > I've been looking at the eventdev API from a use-case point of view, and I'm > unclear on a how the API caters for two uses. I have simplified these as much > as possible, think of them as a theoretical unit-test for the API :) > > > Fragmentation: > 1. Dequeue 8 packets > 2. Process 2 packets > 3. Processing 3rd, this packet needs fragmentation into two packets > 4. Process remaining 5 packets as normal > > What function calls does the application make to achieve this? > In particular, I'm referring to how can the scheduler know that the 3rd > packet is the one being fragmented, and how to keep packet order valid. > > > Dropping packets: > 1. Dequeue 8 packets > 2. Process 2 packets > 3. Processing 3rd, this packet needs to be dropped > 4. Process remaining 5 packets as normal > > What function calls does the application make to achieve this? > Again, in particular how does the scheduler know that the 3rd packet is being > dropped. > > > Regards, -Harry Hi, these questions apply particularly to reordered which has a lot more complications than the other types in terms of sending packets back into the scheduler. 
However, atomic types will still suffer from problems with things the way they are - again, if we assume a burst of 8 packets, then to forward those packets we need to re-enqueue them to the scheduler, and then also send 8 releases to the scheduler, to release the atomic locks for those packets. This means that for each packet we have to send two messages to a scheduler core, which is really inefficient. This number of messages is critical for any software implementation, as the cost of moving items core-to-core is going to be a big bottleneck (perhaps the biggest bottleneck) in the system. It's for this reason we need to use burst APIs - as with rte_rings.

How we have solved this in our implementation is to allow there to be an event operation type. The four operations we implemented are as below (using packet as a synonym for event here, since these would mostly apply to packets flowing through a system):

* NEW - just a regular enqueue of a packet, without any previous context
* FORWARD - enqueue a packet, and mark the flow processing for the equivalent packet that was dequeued as completed, i.e. release any atomic locks, or reorder this packet with respect to any other outstanding packets from the event queue.
* DROP - this is roughly equivalent to the existing "release" API call, except that having it as an enqueue type allows us to release multiple items in a single call, and also to mix releases with new packets and forwarded packets
* PARTIAL - this indicates that the packet being enqueued should be treated according to the context of the current packet, but that that context should not be released/completed by the enqueue of this packet. This only really applies to reordered events, and is needed to do fragmentation and/or multicast of packets with reordering.

Therefore, I think we need to use some of the bits just freed up in the event structure to include an enqueue operation type.
Without it, I just can't see how the API can ever support burst operation on packets. Regards, /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote: > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > > > -Original Message- > > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > Thanks. One other suggestion is that it might be useful to provide > > > support for having typed queues explicitly in the API. Right now, when > > > you create an queue, the queue_conf structure takes as parameters how > > > many atomic flows that are needed for the queue, or how many reorder > > > slots need to be reserved for it. This implicitly hints at the type of > > > traffic which will be sent to the queue, but I'm wondering if it's > > > better to make it explicit. There are certain optimisations that can be > > > looked at if we know that a queue only handles packets of a particular > > > type. [Not having to handle reordering when pulling events from a core > > > can be a big win for software!]. > > > > If it helps in SW implementation, then I think we can add this in queue > > configuration. > > > > > > > > How about adding: "allowed_event_types" as a field to > > > rte_event_queue_conf, with possible values: > > > * atomic > > > * ordered > > > * parallel > > > * mixed - allowing all 3 types. I think allowing 2 of three types might > > > make things too complicated. > > > > > > An open question would then be how to behave when the queue type and > > > requested event type conflict. We can either throw an error, or just > > > ignore the event type and always treat enqueued events as being of the > > > queue type. I prefer the latter, because it's faster not having to > > > error-check, and it pushes the responsibility on the app to know what > > > it's doing. 
> > > > How about making default as "mixed" and let application configures what > > is not required?. That way application responsibility is clear. > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT > > with default. > > > I suppose it could work, but why bother doing that? If an app knows it's > only going to use one traffic type, why not let it just state what it > will do rather than try to specify what it won't do. If mixed is needed,

My thought was more in line with the ethdev spec: ref-counting is the default, and if an application needs an exception it sets ETH_TXQ_FLAGS_NOREFCOUNT. But it is OK if you prefer the other way.

> then it's easy enough to specify - and we can make it the zero/default value too.

OK. Then we will make MIXED the zero/default and add "allowed_event_types" to the event queue config.

/Jerin

> > Our software implementation for now, only supports one type per queue - > which we suspect should meet a lot of use-cases. We'll have to see about > adding in mixed types in future. > > /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > Sent: Tuesday, October 25, 2016 6:49 PM > > Hi Community, > > So far, I have received constructive feedback from Intel, NXP and Linaro > folks. > Let me know, if anyone else interested in contributing to the definition of > eventdev? > > If there are no major issues in proposed spec, then Cavium would like work on > implementing and up-streaming the common code(lib/librte_eventdev/) and > an associated HW driver.(Requested minor changes of v2 will be addressed > in next version). Hi All, I've been looking at the eventdev API from a use-case point of view, and I'm unclear on a how the API caters for two uses. I have simplified these as much as possible, think of them as a theoretical unit-test for the API :) Fragmentation: 1. Dequeue 8 packets 2. Process 2 packets 3. Processing 3rd, this packet needs fragmentation into two packets 4. Process remaining 5 packets as normal What function calls does the application make to achieve this? In particular, I'm referring to how can the scheduler know that the 3rd packet is the one being fragmented, and how to keep packet order valid. Dropping packets: 1. Dequeue 8 packets 2. Process 2 packets 3. Processing 3rd, this packet needs to be dropped 4. Process remaining 5 packets as normal What function calls does the application make to achieve this? Again, in particular how does the scheduler know that the 3rd packet is being dropped. Regards, -Harry
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
> From: Vincent Jardin [mailto:vincent.jardin at 6wind.com] > Sent: Wednesday, October 26, 2016 7:37 PM > Le 26 octobre 2016 2:11:26 PM "Van Haaren, Harry" > a ?crit : > > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > >> > >> So far, I have received constructive feedback from Intel, NXP and Linaro > >> folks. > >> Let me know, if anyone else interested in contributing to the definition of > >> eventdev? > >> > >> If there are no major issues in proposed spec, then Cavium would like work > >> on > >> implementing and up-streaming the common code(lib/librte_eventdev/) and > >> an associated HW driver.(Requested minor changes of v2 will be addressed > >> in next version). > > > > Hi All, > > > > I will propose a minor change to the rte_event struct, allowing some bits > > to be implementation specific. Currently the rte_event struct has no space > > to allow an implementation store any metadata about the event. For software > > performance it would be really helpful if there are some bits available for > > the implementation to keep some flags about each event. > > > > I suggest to rework the struct as below which opens 6 bits that were > > otherwise wasted, and define them as implementation specific. By > > implementation specific it is understood that the implementation can > > overwrite any information stored in those bits, and the application must > > not expect the data to remain after the event is scheduled. > > > > OLD: > > struct rte_event { > > uint32_t flow_id:24; > > uint32_t queue_id:8; > > uint8_t sched_type; /* Note only 2 bits of 8 are required */ > > > > NEW: > > struct rte_event { > > uint32_t flow_id:24; > > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the > > enqueue types Ordered,Atomic,Parallel.*/ > > uint32_t implementation:6; /* available for implementation specific > > metadata */ > > uint8_t queue_id; /* still 8 bits as before */ > > Bitfileds are efficients on Octeon. 
What about the other CPUs you have in > mind? x86 is not as efficient.

Given the rte_event struct is 16 bytes and there's no free space to use, I see no alternative to using bitfields in this case. Welcoming suggestions for a better way to lay out the structure to avoid the bitfields.

Regards, -Harry
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote: > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > > -Original Message- > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > Thanks. One other suggestion is that it might be useful to provide > > support for having typed queues explicitly in the API. Right now, when > > you create an queue, the queue_conf structure takes as parameters how > > many atomic flows that are needed for the queue, or how many reorder > > slots need to be reserved for it. This implicitly hints at the type of > > traffic which will be sent to the queue, but I'm wondering if it's > > better to make it explicit. There are certain optimisations that can be > > looked at if we know that a queue only handles packets of a particular > > type. [Not having to handle reordering when pulling events from a core > > can be a big win for software!]. > > If it helps in SW implementation, then I think we can add this in queue > configuration. > > > > > How about adding: "allowed_event_types" as a field to > > rte_event_queue_conf, with possible values: > > * atomic > > * ordered > > * parallel > > * mixed - allowing all 3 types. I think allowing 2 of three types might > > make things too complicated. > > > > An open question would then be how to behave when the queue type and > > requested event type conflict. We can either throw an error, or just > > ignore the event type and always treat enqueued events as being of the > > queue type. I prefer the latter, because it's faster not having to > > error-check, and it pushes the responsibility on the app to know what > > it's doing. > > How about making default as "mixed" and let application configures what > is not required?. That way application responsibility is clear. 
> something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT > with default. > I suppose it could work, but why bother doing that? If an app knows it's only going to use one traffic type, why not let it just state what it will do rather than try to specify what it won't do. If mixed is needed, then it's easy enough to specify - and we can make it the zero/default value too. Our software implementation for now, only supports one type per queue - which we suspect should meet a lot of use-cases. We'll have to see about adding in mixed types in future. /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote: > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > > -Original Message- > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > Thanks. One other suggestion is that it might be useful to provide > support for having typed queues explicitly in the API. Right now, when > you create an queue, the queue_conf structure takes as parameters how > many atomic flows that are needed for the queue, or how many reorder > slots need to be reserved for it. This implicitly hints at the type of > traffic which will be sent to the queue, but I'm wondering if it's > better to make it explicit. There are certain optimisations that can be > looked at if we know that a queue only handles packets of a particular > type. [Not having to handle reordering when pulling events from a core > can be a big win for software!]. If it helps in SW implementation, then I think we can add this in queue configuration. > > How about adding: "allowed_event_types" as a field to > rte_event_queue_conf, with possible values: > * atomic > * ordered > * parallel > * mixed - allowing all 3 types. I think allowing 2 of three types might > make things too complicated. > > An open question would then be how to behave when the queue type and > requested event type conflict. We can either throw an error, or just > ignore the event type and always treat enqueued events as being of the > queue type. I prefer the latter, because it's faster not having to > error-check, and it pushes the responsibility on the app to know what > it's doing. How about making default as "mixed" and let application configures what is not required?. That way application responsibility is clear. something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT with default. /Jerin > > /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 26, 2016 at 01:43:25PM +0100, Bruce Richardson wrote: > On Tue, Oct 25, 2016 at 11:19:05PM +0530, Jerin Jacob wrote: > > On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote: > > > Thanks to Intel and NXP folks for the positive and constructive feedback > > > I've received so far. Here is the updated RFC(v2). > > > > > > I've attempted to address as many comments as possible. > > > > > > This series adds rte_eventdev.h to the DPDK tree with > > > adequate documentation in doxygen format. > > > > > > Updates are also available online: > > > > > > Related draft header file (this patch): > > > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > > > > > PDF version(doxgen output): > > > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > > > > > Repo: > > > https://github.com/jerinjacobk/libeventdev > > > > > > > Hi Community, > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > folks. > > Let me know, if anyone else interested in contributing to the definition of > > eventdev? > > > > If there are no major issues in proposed spec, then Cavium would like work > > on > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > an associated HW driver.(Requested minor changes of v2 will be addressed > > in next version). > > > > We are planning to submit the work for 17.02 or 17.05 release(based on > > how implementation goes). > > > > Hi Jerin, Hi Bruce, > > thanks for driving this. In terms of the common code framework, when > would you see that you might have something to upstream for that? As you > know, we've been working on a software implementation which we are now > looking to move to the eventdev APIs, and which also needs this common > code to support it. 
> > If it can accelerate this effort, we can perhaps provide as an RFC > the common code part that we have implemented for our work, or else we > are happy to migrate to use common code you provide if it can be > upstreamed fairly soon.

I have already started on the common code framework. I will send the common code as an RFC in a couple of days, with the vdev and PCI bus interfaces.

> > Regards, > /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Le 26 octobre 2016 2:11:26 PM "Van Haaren, Harry" a ?crit : >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob >> >> So far, I have received constructive feedback from Intel, NXP and Linaro >> folks. >> Let me know, if anyone else interested in contributing to the definition of >> eventdev? >> >> If there are no major issues in proposed spec, then Cavium would like work on >> implementing and up-streaming the common code(lib/librte_eventdev/) and >> an associated HW driver.(Requested minor changes of v2 will be addressed >> in next version). > > Hi All, > > I will propose a minor change to the rte_event struct, allowing some bits > to be implementation specific. Currently the rte_event struct has no space > to allow an implementation store any metadata about the event. For software > performance it would be really helpful if there are some bits available for > the implementation to keep some flags about each event. > > I suggest to rework the struct as below which opens 6 bits that were > otherwise wasted, and define them as implementation specific. By > implementation specific it is understood that the implementation can > overwrite any information stored in those bits, and the application must > not expect the data to remain after the event is scheduled. > > OLD: > struct rte_event { > uint32_t flow_id:24; > uint32_t queue_id:8; > uint8_t sched_type; /* Note only 2 bits of 8 are required */ > > NEW: > struct rte_event { > uint32_t flow_id:24; > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the > enqueue types Ordered,Atomic,Parallel.*/ > uint32_t implementation:6; /* available for implementation specific > metadata */ > uint8_t queue_id; /* still 8 bits as before */ Bitfileds are efficients on Octeon. What's about other CPUs you have in mind? x86 is not as efficient. > > > Thoughts? -Harry
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > folks. > > Let me know, if anyone else interested in contributing to the definition of > > eventdev? > > > > If there are no major issues in proposed spec, then Cavium would like work > > on > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > an associated HW driver.(Requested minor changes of v2 will be addressed > > in next version). > > Hi All, > > I will propose a minor change to the rte_event struct, allowing some bits to > be implementation specific. Currently the rte_event struct has no space to > allow an implementation store any metadata about the event. For software > performance it would be really helpful if there are some bits available for > the implementation to keep some flags about each event. OK. > > I suggest to rework the struct as below which opens 6 bits that were > otherwise wasted, and define them as implementation specific. By > implementation specific it is understood that the implementation can > overwrite any information stored in those bits, and the application must not > expect the data to remain after the event is scheduled. > > OLD: > struct rte_event { > uint32_t flow_id:24; > uint32_t queue_id:8; > uint8_t sched_type; /* Note only 2 bits of 8 are required */ > > NEW: > struct rte_event { > uint32_t flow_id:24; > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the > enqueue types Ordered,Atomic,Parallel.*/ > uint32_t implementation:6; /* available for implementation specific > metadata */ > uint8_t queue_id; /* still 8 bits as before */ > > > Thoughts? -Harry Looks good to me. I will add it in v3.
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote: > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote: > > > -Original Message- > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > > > > > So far, I have received constructive feedback from Intel, NXP and Linaro > > > folks. > > > Let me know, if anyone else interested in contributing to the definition > > > of eventdev? > > > > > > If there are no major issues in proposed spec, then Cavium would like > > > work on > > > implementing and up-streaming the common code(lib/librte_eventdev/) and > > > an associated HW driver.(Requested minor changes of v2 will be addressed > > > in next version). > > > > Hi All, > > > > I will propose a minor change to the rte_event struct, allowing some bits > > to be implementation specific. Currently the rte_event struct has no space > > to allow an implementation store any metadata about the event. For software > > performance it would be really helpful if there are some bits available for > > the implementation to keep some flags about each event. > > OK. > > > > > I suggest to rework the struct as below which opens 6 bits that were > > otherwise wasted, and define them as implementation specific. By > > implementation specific it is understood that the implementation can > > overwrite any information stored in those bits, and the application must > > not expect the data to remain after the event is scheduled. > > > > OLD: > > struct rte_event { > > uint32_t flow_id:24; > > uint32_t queue_id:8; > > uint8_t sched_type; /* Note only 2 bits of 8 are required */ > > > > NEW: > > struct rte_event { > > uint32_t flow_id:24; > > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the > > enqueue types Ordered,Atomic,Parallel.*/ > > uint32_t implementation:6; /* available for implementation specific > > metadata */ > > uint8_t queue_id; /* still 8 bits as before */ > > > > > > Thoughts? -Harry > > Looks good to me. 
I will add it in v3. > Thanks. One other suggestion is that it might be useful to provide support for having typed queues explicitly in the API. Right now, when you create a queue, the queue_conf structure takes as parameters how many atomic flows are needed for the queue, or how many reorder slots need to be reserved for it. This implicitly hints at the type of traffic which will be sent to the queue, but I'm wondering if it's better to make it explicit. There are certain optimisations that can be looked at if we know that a queue only handles packets of a particular type. [Not having to handle reordering when pulling events from a core can be a big win for software!]. How about adding "allowed_event_types" as a field to rte_event_queue_conf, with possible values: * atomic * ordered * parallel * mixed - allowing all 3 types. I think allowing two of the three types might make things too complicated. An open question would then be how to behave when the queue type and requested event type conflict. We can either throw an error, or just ignore the event type and always treat enqueued events as being of the queue type. I prefer the latter, because it's faster not having to error-check, and it pushes the responsibility onto the app to know what it's doing. /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Tue, Oct 25, 2016 at 11:19:05PM +0530, Jerin Jacob wrote: > On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote: > > Thanks to Intel and NXP folks for the positive and constructive feedback > > I've received so far. Here is the updated RFC(v2). > > > > I've attempted to address as many comments as possible. > > > > This series adds rte_eventdev.h to the DPDK tree with > > adequate documentation in doxygen format. > > > > Updates are also available online: > > > > Related draft header file (this patch): > > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > > > PDF version(doxgen output): > > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > > > Repo: > > https://github.com/jerinjacobk/libeventdev > > > > Hi Community, > > So far, I have received constructive feedback from Intel, NXP and Linaro > folks. > Let me know, if anyone else interested in contributing to the definition of > eventdev? > > If there are no major issues in proposed spec, then Cavium would like work on > implementing and up-streaming the common code(lib/librte_eventdev/) and > an associated HW driver.(Requested minor changes of v2 will be addressed > in next version). > > We are planning to submit the work for 17.02 or 17.05 release(based on > how implementation goes). > Hi Jerin, thanks for driving this. In terms of the common code framework, when would you see that you might have something to upstream for that? As you know, we've been working on a software implementation which we are now looking to move to the eventdev APIs, and which also needs this common code to support it. If it can accelerate this effort, we can perhaps provide as an RFC the common code part that we have implemented for our work, or else we are happy to migrate to use common code you provide if it can be upstreamed fairly soon. Regards, /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob > > So far, I have received constructive feedback from Intel, NXP and Linaro > folks. > Let me know, if anyone else interested in contributing to the definition of > eventdev? > > If there are no major issues in proposed spec, then Cavium would like work on > implementing and up-streaming the common code(lib/librte_eventdev/) and > an associated HW driver.(Requested minor changes of v2 will be addressed > in next version). Hi All, I will propose a minor change to the rte_event struct, allowing some bits to be implementation specific. Currently the rte_event struct has no space to allow an implementation store any metadata about the event. For software performance it would be really helpful if there are some bits available for the implementation to keep some flags about each event. I suggest to rework the struct as below which opens 6 bits that were otherwise wasted, and define them as implementation specific. By implementation specific it is understood that the implementation can overwrite any information stored in those bits, and the application must not expect the data to remain after the event is scheduled. OLD: struct rte_event { uint32_t flow_id:24; uint32_t queue_id:8; uint8_t sched_type; /* Note only 2 bits of 8 are required */ NEW: struct rte_event { uint32_t flow_id:24; uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the enqueue types Ordered,Atomic,Parallel.*/ uint32_t implementation:6; /* available for implementation specific metadata */ uint8_t queue_id; /* still 8 bits as before */ Thoughts? -Harry
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote: > Thanks to Intel and NXP folks for the positive and constructive feedback > I've received so far. Here is the updated RFC(v2). > > I've attempted to address as many comments as possible. > > This series adds rte_eventdev.h to the DPDK tree with > adequate documentation in doxygen format. > > Updates are also available online: > > Related draft header file (this patch): > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > PDF version(doxygen output): > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > Repo: > https://github.com/jerinjacobk/libeventdev > Hi Community, So far, I have received constructive feedback from Intel, NXP and Linaro folks. Let me know if anyone else is interested in contributing to the definition of eventdev. If there are no major issues in the proposed spec, then Cavium would like to work on implementing and up-streaming the common code (lib/librte_eventdev/) and an associated HW driver. (Requested minor changes of v2 will be addressed in the next version). We are planning to submit the work for the 17.02 or 17.05 release (based on how the implementation goes). /Jerin Cavium
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Mon, Oct 17, 2016 at 08:26:33PM +, Eads, Gage wrote: > > > > -Original Message- > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com] > > Sent: Sunday, October 16, 2016 11:18 PM > > To: Eads, Gage > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com; Richardson, Bruce > > ; Vangati, Narender > > ; hemant.agrawal at nxp.com > > Subject: Re: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven > > programming model framework for DPDK > > > > On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote: > > > Thanks Jerin, this looks good. I've put a few notes/questions inline. > > > > Thanks Gage. > > > > > > > > > + > > > > +/** > > > > + * Get the device identifier for the named event device. > > > > + * > > > > + * @param name > > > > + * Event device name to select the event device identifier. > > > > + * > > > > + * @return > > > > + * Returns event device identifier on success. > > > > + * - <0: Failure to find named event device. > > > > + */ > > > > +extern uint8_t > > > > +rte_event_dev_get_dev_id(const char *name); > > > > > > This return type should be int8_t, or some signed type, to support the > > failure > > case. > > > > Makes sense. I will change to int to make consistent with > > rte_cryptodev_get_dev_id() > > > > > > > > > +}; > > > > + > > > > +/** > > > > + * Schedule one or more events in the event dev. > > > > + * > > > > + * An event dev implementation may define this is a NOOP, for > > > > instance if + * the event dev performs its scheduling in hardware. > > > > + * > > > > + * @param dev_id > > > > + * The identifier of the device. > > > > + */ > > > > +extern void > > > > +rte_event_schedule(uint8_t dev_id); > > > > > > One idea: Have the function return the number of scheduled packets (or 0 > > for > > implementations that do scheduling in hardware). This could be a helpful > > diagnostic for the software scheduler. > > > > How about returning an implementation specific value ? 
> > Rather than defining certain function associated with returned value. > > Just to make sure it works with all HW/SW implementations. Something like > > below, > > > > /** > > * Schedule one or more events in the event dev. > > * > > * An event dev implementation may define this is a NOOP, for instance if > > * the event dev performs its scheduling in hardware. > > * > > * @param dev_id > > * The identifier of the device. > > * @return > > * Implementation specific value from the event driver for diagnostic > > purpose > > */ > > extern int > > rte_event_schedule(uint8_t dev_id); > > > > > > That's fine by me. OK. I will change it in v3 > > I also had a comment on the return value of rte_event_dev_info_get() in my > previous email: "I'm wondering if this return type should be int, so we can > return an error if the dev_id is invalid." > > What do you think? The void return was based on cryptodev_info_get().I think, it makes sense to return "int". I will change it in v3. > > Thanks, > Gage > > > > >
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
> -Original Message- > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com] > Sent: Sunday, October 16, 2016 11:18 PM > To: Eads, Gage > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com; Richardson, Bruce > ; Vangati, Narender > ; hemant.agrawal at nxp.com > Subject: Re: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven > programming model framework for DPDK > > On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote: > > Thanks Jerin, this looks good. I've put a few notes/questions inline. > > Thanks Gage. > > > > > > + > > > +/** > > > + * Get the device identifier for the named event device. > > > + * > > > + * @param name > > > + * Event device name to select the event device identifier. > > > + * > > > + * @return > > > + * Returns event device identifier on success. > > > + * - <0: Failure to find named event device. > > > + */ > > > +extern uint8_t > > > +rte_event_dev_get_dev_id(const char *name); > > > > This return type should be int8_t, or some signed type, to support the > failure > case. > > Makes sense. I will change to int to make consistent with > rte_cryptodev_get_dev_id() > > > > > > +}; > > > + > > > +/** > > > + * Schedule one or more events in the event dev. > > > + * > > > + * An event dev implementation may define this is a NOOP, for > > > instance if + * the event dev performs its scheduling in hardware. > > > + * > > > + * @param dev_id > > > + * The identifier of the device. > > > + */ > > > +extern void > > > +rte_event_schedule(uint8_t dev_id); > > > > One idea: Have the function return the number of scheduled packets (or 0 > for > implementations that do scheduling in hardware). This could be a helpful > diagnostic for the software scheduler. > > How about returning an implementation specific value ? > Rather than defining certain function associated with returned value. > Just to make sure it works with all HW/SW implementations. Something like > below, > > /** > * Schedule one or more events in the event dev. 
> * > * An event dev implementation may define this is a NOOP, for instance if > * the event dev performs its scheduling in hardware. > * > * @param dev_id > * The identifier of the device. > * @return > * Implementation specific value from the event driver for diagnostic > purpose > */ > extern int > rte_event_schedule(uint8_t dev_id); > > That's fine by me. I also had a comment on the return value of rte_event_dev_info_get() in my previous email: "I'm wondering if this return type should be int, so we can return an error if the dev_id is invalid." What do you think? Thanks, Gage > >
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 14, 2016 at 05:02:21PM +0100, Bruce Richardson wrote: > On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote: > > Thanks to Intel and NXP folks for the positive and constructive feedback > > I've received so far. Here is the updated RFC(v2). > > > > I've attempted to address as many comments as possible. > > > > This series adds rte_eventdev.h to the DPDK tree with > > adequate documentation in doxygen format. > > > > Updates are also available online: > > > > Related draft header file (this patch): > > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > > > PDF version(doxygen output): > > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > > > Repo: > > https://github.com/jerinjacobk/libeventdev > > > > Thanks for all the work on this. Thanks > > > > +/* Event device configuration bitmap flags */ > > +#define RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT (1 << 0) > > +/**< Override the global *dequeue_wait_ns* and use per dequeue wait in ns. > > + * \see rte_event_dequeue_wait_time(), rte_event_dequeue() > > + */ > > Can you clarify why this is needed? If an app wants to use the same > dequeue wait times for all dequeues can it not specify that itself via > the wait time parameter, rather than having a global dequeue wait value? The rationale for choosing this scheme is to allow an optimized rte_event_dequeue() for some implementations without losing application portability.
We mostly have two different types of HW schemes to define the wait time: HW1) Only a global wait value for the eventdev across all dequeues. HW2) A per-queue wait value. In terms of applications: APP1) A trivial application does not need a different dequeue wait value for each dequeue. APP2) Non-trivial applications do need different dequeue wait values. This config option takes advantage of the case where the application demands only APP1 on HW1, without losing application portability (i.e. if the application demands the APP2 scheme, then an HW1-based implementation can have a different function pointer to implement the dequeue function). The overall theme of the proposal is to have more configuration options (like RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER) to enable high-performance SW/HW implementations.
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote: > Thanks Jerin, this looks good. I've put a few notes/questions inline. Thanks Gage. > > > + > > +/** > > + * Get the device identifier for the named event device. > > + * > > + * @param name > > + * Event device name to select the event device identifier. > > + * > > + * @return > > + * Returns event device identifier on success. > > + * - <0: Failure to find named event device. > > + */ > > +extern uint8_t > > +rte_event_dev_get_dev_id(const char *name); > > This return type should be int8_t, or some signed type, to support the > failure case. Makes sense. I will change it to int to make it consistent with rte_cryptodev_get_dev_id() > > > +}; > > + > > +/** > > + * Schedule one or more events in the event dev. > > + * > > + * An event dev implementation may define this as a NOOP, for instance if > > + * the event dev performs its scheduling in hardware. > > + * > > + * @param dev_id > > + * The identifier of the device. > > + */ > > +extern void > > +rte_event_schedule(uint8_t dev_id); > > One idea: Have the function return the number of scheduled packets (or 0 for > implementations that do scheduling in hardware). This could be a helpful > diagnostic for the software scheduler. How about returning an implementation-specific value, rather than tying a specific meaning to the returned value? Just to make sure it works with all HW/SW implementations. Something like below, /** * Schedule one or more events in the event dev. * * An event dev implementation may define this as a NOOP, for instance if * the event dev performs its scheduling in hardware. * * @param dev_id * The identifier of the device. * @return * Implementation specific value from the event driver for diagnostic purposes */ extern int rte_event_schedule(uint8_t dev_id);
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Fri, Oct 14, 2016 at 10:30:33AM +, Hemant Agrawal wrote: > > > Am I reading this correctly that there is no way to support an > > > indefinite waiting capability? Or is this just saying that if a timed > > > wait is performed there are min/max limits for the wait duration? > > > > The application can wait indefinitely if required; see the > > RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option. > > > > A trivial application may not need different wait values on each dequeue. This > > is a > > performance optimization opportunity for the implementation. > > Jerin, it is irrespective of wait configuration, whether you are using per > device wait or per dequeue wait. > Can the value of MAX_U32 or MAX_U64 be treated as an infinite wait? That would be yet another check in the fast path in the implementation, I think, for a more fine-grained wait scheme. Let the application configure the device with RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT so that the implementation can have two different function pointer-based implementations of the dequeue function if required. With the RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration, MAX_U64 implicitly becomes an infinite wait, as the wait is a uint64_t. I can add this info in v3 if required. Jerin > > >
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote: > Thanks to Intel and NXP folks for the positive and constructive feedback > I've received so far. Here is the updated RFC(v2). > > I've attempted to address as many comments as possible. > > This series adds rte_eventdev.h to the DPDK tree with > adequate documentation in doxygen format. > > Updates are also available online: > > Related draft header file (this patch): > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > PDF version(doxgen output): > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > Repo: > https://github.com/jerinjacobk/libeventdev > Thanks for all the work on this. > +/* Event device configuration bitmap flags */ > +#define RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT (1 << 0) > +/**< Override the global *dequeue_wait_ns* and use per dequeue wait in ns. > + * \see rte_event_dequeue_wait_time(), rte_event_dequeue() > + */ Can you clarify why this is needed? If an app wants to use the same dequeue wait times for all dequeues can it not specify that itself via the wait time parameter, rather than having a global dequeue wait value? /Bruce
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Dear Jerin, Very nice work! This new RFC version opens the way to a unified conceptual model of Software Defined Data Planes supported by diverse implementations such as OpenDataPlane and DPDK. I think this is an important signal to the industry. François-Frédéric From: dev <dev-boun...@dpdk.org> on behalf of Jerin Jacob Sent: Tuesday, October 11, 2016 9:30 PM To: dev at dpdk.org Cc: thomas.monjalon at 6wind.com; bruce.richardson at intel.com; narender.vangati at intel.com; hemant.agrawal at nxp.com; gage.eads at intel.com; Jerin Jacob Subject: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK Thanks to Intel and NXP folks for the positive and constructive feedback I've received so far. Here is the updated RFC(v2). I've attempted to address as many comments as possible. This series adds rte_eventdev.h to the DPDK tree with adequate documentation in doxygen format. Updates are also available online: Related draft header file (this patch): https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h PDF version(doxygen output): https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf Repo: https://github.com/jerinjacobk/libeventdev v1..v2 - Added Cavium, Intel, NXP copyrights in header file - Changed the concept of flow queues to flow ids. This is to avoid dictating a specific structure to hold the flows. A s/w implementation can do atomic load balancing on multiple flow ids more efficiently than maintaining each event in a specific flow queue. - Changed the scheduling group to event queue. A scheduling group is more a stream of events, so an event queue is a better abstraction. - Introduced the event port concept. Instead of tying eventdev access to the lcore, a higher-level abstraction called an event port is needed, which is the application i/f to the eventdev to dequeue and enqueue events. One or more event queues can be linked to a single event port.
There can be more than one event port per lcore, allowing multiple lightweight threads to have their own i/f into eventdev, if the implementation supports it. An event port will be bound to an lcore or a lightweight thread to keep the application workflow portable. An event port abstraction also encapsulates dequeue depth and enqueue depth for scheduler implementations which can schedule multiple events at a time and output events that can be buffered. - Added configuration options with event queue (nb_atomic_flows, nb_atomic_order_sequences, single consumer etc) and event port (dequeue_queue_depth, enqueue_queue_depth etc) to define the limits on resource usage. (Useful for an optimized software implementation) - Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS schemes of priority handling - Added event port to event queue servicing priority. This allows two event ports to connect to the same event queue with different priorities. - Changed the workflow to schedule/dequeue/enqueue. An implementation is free to define schedule as a NOOP. A distributed s/w scheduler can use this to schedule events; also a centralized s/w scheduler can make this a NOOP on non-scheduler cores. - Removed Cavium HW specific schedule_from_group API - Removed Cavium HW specific ctxt_update/ctxt_wait APIs. Introduced a more generic "event pinning" concept, i.e. if the normal workflow is dequeue -> do work based on event type -> enqueue, a pin_event argument to enqueue (where the pinned event is returned through the normal dequeue) allows the application workflow to remain the same whether or not an implementation supports it. - Added dequeue() burst variant - Added the definition of a closed/open system - where an open system is memory backed and a closed system eventdev has limited capacity. In such systems, it is also useful to denote per event port how many packets can be active in the system.
This can serve as a threshold for ethdev-like devices so they don't overwhelm core-to-core events. - Added the option to specify the maximum amount of time (in ns) an application needs to wait on dequeue() - Removed the scheme of expressing the number of flows in log2 format Open items, or items that need improvement: - Abstract the differences in event QoS management with the different priority schemes available in different HW or SW implementations with a portable application workflow. Based on the feedback, there are three different kinds of QoS support available in three different HW or SW implementations. 1) Priority associated with the event queue 2) Priority associated with each event enqueue (the same flow can have two different priorities on two separate enqueues) 3) Priority associated with the flow (each flow has a unique priority) In v2, the differences are abstracted based on device capability (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme, RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme). This scheme wo
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Thanks Jerin, this looks good. I've put a few notes/questions inline. Thanks, Gage > -Original Message- > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com] > Sent: Tuesday, October 11, 2016 2:30 PM > To: dev at dpdk.org > Cc: thomas.monjalon at 6wind.com; Richardson, Bruce > ; Vangati, Narender > ; hemant.agrawal at nxp.com; Eads, Gage > ; Jerin Jacob > Subject: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming > model framework for DPDK > > Thanks to Intel and NXP folks for the positive and constructive feedback > I've received so far. Here is the updated RFC(v2). > > I've attempted to address as many comments as possible. > > This series adds rte_eventdev.h to the DPDK tree with > adequate documentation in doxygen format. > > Updates are also available online: > > Related draft header file (this patch): > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h > > PDF version(doxgen output): > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf > > Repo: > https://github.com/jerinjacobk/libeventdev > > v1..v2 > > - Added Cavium, Intel, NXP copyrights in header file > > - Changed the concept of flow queues to flow ids. > This is avoid dictating a specific structure to hold the flows. > A s/w implementation can do atomic load balancing on multiple > flow ids more efficiently than maintaining each event in a specific flow > queue. > > - Change the scheduling group to event queue. > A scheduling group is more a stream of events, so an event queue is a better > abstraction. > > - Introduced event port concept, Instead of trying eventdev access to the > lcore, > a higher level of abstraction called event port is needed which is the > application i/f to the eventdev to dequeue and enqueue the events. > One or more event queues can be linked to single event port. 
> There can be more than one event port per lcore allowing multiple lightweight > threads to have their own i/f into eventdev, if the implementation supports > it. > An event port will be bound to a lcore or a lightweight thread to keep > portable application workflow. > An event port abstraction also encapsulates dequeue depth and enqueue depth > for > a scheduler implementations which can schedule multiple events at a time and > output events that can be buffered. > > - Added configuration options with event queue(nb_atomic_flows, > nb_atomic_order_sequences, single consumer etc) > and event port(dequeue_queue_depth, enqueue_queue_depth etc) to define > the > limits on the resource usage.(Useful for optimized software implementation) > > - Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and > RTE_EVENT_DEV_CAP_EVENT_QOS > schemes of priority handling > > - Added event port to event queue servicing priority. > This allows two event ports to connect to the same event queue with > different priorities. > > - Changed the workflow as schedule/dequeue/enqueue. > An implementation is free to define schedule as NOOP. > A distributed s/w scheduler can use this to schedule events; > also a centralized s/w scheduler can make this a NOOP on non-scheduler cores. > > - Removed Cavium HW specific schedule_from_group API > > - Removed Cavium HW specific ctxt_update/ctxt_wait APIs. > Introduced a more generic "event pinning" concept. i.e > If the normal workflow is a dequeue -> do work based on event type -> > enqueue, > a pin_event argument to enqueue > where the pinned event is returned through the normal dequeue) > allows application workflow to remain the same whether or not an > implementation supports it. > > - Added dequeue() burst variant > > - Added the definition of a closed/open system - where open system is memory > backed and closed system eventdev has limited capacity. 
> In such systems, it is also useful to denote per event port how many packets > can be active in the system. > This can serve as a threshold for ethdev like devices so they don't overwhelm > core to core events. > > - Added the option to specify maximum amount of time(in ns) application needs > wait on dequeue() > > - Removed the scheme of expressing the number of flows in log2 format > > Open item or the item needs improvement. > > - Abstract the differences in event QoS management with different priority > schemes > available in different HW or SW implementations with portable application > workflow. > > Based on the feedback, there three different kinds of QoS support available > in
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
On Thu, Oct 13, 2016 at 11:14:38PM -0500, Bill Fischofer wrote: > Hi Jerin, Hi Bill, Thanks for the review. [snip] > > + * If the device init operation is successful, the correspondence between > > + * the device identifier assigned to the new device and its associated > > + * *rte_event_dev* structure is effectively registered. > > + * Otherwise, both the *rte_event_dev* structure and the device > > identifier are > > + * freed. > > + * > > + * The functions exported by the application Event API to setup a device > > + * designated by its device identifier must be invoked in the following > > order: > > + * - rte_event_dev_configure() > > + * - rte_event_queue_setup() > > + * - rte_event_port_setup() > > + * - rte_event_port_link() > > + * - rte_event_dev_start() > > + * > > + * Then, the application can invoke, in any order, the functions > > + * exported by the Event API to schedule events, dequeue events, enqueue > > events, > > + * change event queue(s) to event port [un]link establishment and so on. > > + * > > + * Application may use rte_event_[queue/port]_default_conf_get() to get > > the > > + * default configuration to set up an event queue or event port by > > + * overriding few default values. > > + * > > + * If the application wants to change the configuration (i.e. call > > + * rte_event_dev_configure(), rte_event_queue_setup(), or > > + * rte_event_port_setup()), it must call rte_event_dev_stop() first to > > stop the > > + * device and then do the reconfiguration before calling > > rte_event_dev_start() > > + * again. The schedule, enqueue and dequeue functions should not be > > invoked > > + * when the device is stopped. > > > > Given this requirement, the question is what happens to events that are "in > flight" at the time rte_event_dev_stop() is called? 
Is stop an asynchronous > operation that quiesces the event _dev and allows in-flight events to drain > from queues/ports prior to fully stopping, or is some sort of separate > explicit quiesce mechanism required? If stop is synchronous and simply > halts the event_dev, then how is an application to know if subsequent > configure/setup calls would leave these pending events with no place to > stand? From an application API perspective, rte_event_dev_stop() is a synchronous function. If the stop has been called for re-configuring the number of queues, ports, etc. of the device, then "in flight" entry preservation will be implementation defined. Otherwise, "in flight" entries will be preserved. [snip] > > +extern int > > +rte_event_dev_socket_id(uint8_t dev_id); > > + > > +/* Event device capability bitmap flags */ > > +#define RTE_EVENT_DEV_CAP_QUEUE_QOS(1 << 0) > > +/**< Event scheduling prioritization is based on the priority associated > > with > > + * each event queue. > > + * > > + * \see rte_event_queue_setup(), RTE_EVENT_QUEUE_PRIORITY_NORMAL > > + */ > > +#define RTE_EVENT_DEV_CAP_EVENT_QOS(1 << 1) > > +/**< Event scheduling prioritization is based on the priority associated > > with > > + * each event. Priority of each event is supplied in *rte_event* > > structure > > + * on each enqueue operation. > > + * > > + * \see rte_event_enqueue() > > + */ > > + > > +/** > > + * Event device information > > + */ > > +struct rte_event_dev_info { > > + const char *driver_name;/**< Event driver name */ > > + struct rte_pci_device *pci_dev; /**< PCI information */ > > + uint32_t min_dequeue_wait_ns; > > + /**< Minimum supported global dequeue wait delay(ns) by this > > device */ > > + uint32_t max_dequeue_wait_ns; > > + /**< Maximum supported global dequeue wait delay(ns) by this > > device */ > > + uint32_t dequeue_wait_ns; > > > > Am I reading this correctly that there is no way to support an indefinite > waiting capability?
Or is this just saying that if a timed wait is > performed there are min/max limits for the wait duration? Application can wait indefinite if required. see RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option. Trivial application may not need different wait values on each dequeue.This is a performance optimization opportunity for implementation. > > > > + /**< Configured global dequeue wait delay(ns) for this device */ > > + uint8_t max_event_queues; > > + /**< Maximum event_queues supported by this device */ > > + uint32_t max_event_queue_flows; > > + /**< Maximum supported flows in an event queue by this device*/ > > + uint8_t max_event_queue_priority_levels; > > + /**< Maximum number of event queue priority levels by this device. > > +* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability > > +*/ > > + uint8_t nb_event_queues; > > + /**< Configured number of event queues for this device */ > > > > Is 256 a sufficient number of queues? While various SoCs may have limits, > why impose such a small limit architecturally? Each event
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Hi Bill/Jerin,

> Thanks for the review.
>
> [snip]
>
> > > + * If the device init operation is successful, the correspondence
> > > + * between the device identifier assigned to the new device and its
> > > + * associated *rte_event_dev* structure is effectively registered.
> > > + * Otherwise, both the *rte_event_dev* structure and the device
> > > + * identifier are freed.
> > > + *
> > > + * The functions exported by the application Event API to setup a
> > > + * device designated by its device identifier must be invoked in the
> > > + * following order:
> > > + *     - rte_event_dev_configure()
> > > + *     - rte_event_queue_setup()
> > > + *     - rte_event_port_setup()
> > > + *     - rte_event_port_link()
> > > + *     - rte_event_dev_start()
> > > + *
> > > + * Then, the application can invoke, in any order, the functions
> > > + * exported by the Event API to schedule events, dequeue events,
> > > + * enqueue events, change event queue(s) to event port [un]link
> > > + * establishment and so on.
> > > + *
> > > + * Application may use rte_event_[queue/port]_default_conf_get() to
> > > + * get the default configuration to set up an event queue or event
> > > + * port by overriding few default values.
> > > + *
> > > + * If the application wants to change the configuration (i.e. call
> > > + * rte_event_dev_configure(), rte_event_queue_setup(), or
> > > + * rte_event_port_setup()), it must call rte_event_dev_stop() first
> > > + * to stop the device and then do the reconfiguration before calling
> > > + * rte_event_dev_start() again. The schedule, enqueue and dequeue
> > > + * functions should not be invoked when the device is stopped.
> >
> > Given this requirement, the question is what happens to events that
> > are "in flight" at the time rte_event_dev_stop() is called? Is stop an
> > asynchronous operation that quiesces the event_dev and allows
> > in-flight events to drain from queues/ports prior to fully stopping,
> > or is some sort of separate explicit quiesce mechanism required? If
> > stop is synchronous and simply halts the event_dev, then how is an
> > application to know if subsequent configure/setup calls would leave
> > these pending events with no place to stand?
>
> From an application API perspective, rte_event_dev_stop() is a
> synchronous function. If stop has been called in order to reconfigure
> the number of queues, ports etc. of the device, then "in flight" entry
> preservation is implementation defined; otherwise, "in flight" entries
> are preserved.
>
> [snip]
>
> > > +extern int
> > > +rte_event_dev_socket_id(uint8_t dev_id);
> > > +
> > > +/* Event device capability bitmap flags */
> > > +#define RTE_EVENT_DEV_CAP_QUEUE_QOS (1 << 0)
> > > +/**< Event scheduling prioritization is based on the priority
> > > + * associated with each event queue.
> > > + *
> > > + * \see rte_event_queue_setup(), RTE_EVENT_QUEUE_PRIORITY_NORMAL
> > > + */
> > > +#define RTE_EVENT_DEV_CAP_EVENT_QOS (1 << 1)
> > > +/**< Event scheduling prioritization is based on the priority
> > > + * associated with each event. Priority of each event is supplied in
> > > + * *rte_event* structure on each enqueue operation.
> > > + *
> > > + * \see rte_event_enqueue()
> > > + */
> > > +
> > > +/**
> > > + * Event device information
> > > + */
> > > +struct rte_event_dev_info {
> > > +	const char *driver_name;	/**< Event driver name */
> > > +	struct rte_pci_device *pci_dev;	/**< PCI information */
> > > +	uint32_t min_dequeue_wait_ns;
> > > +	/**< Minimum supported global dequeue wait delay(ns) by this device */
> > > +	uint32_t max_dequeue_wait_ns;
> > > +	/**< Maximum supported global dequeue wait delay(ns) by this device */
> > > +	uint32_t dequeue_wait_ns;
> >
> > Am I reading this correctly that there is no way to support an
> > indefinite waiting capability? Or is this just saying that if a timed
> > wait is performed there are min/max limits for the wait duration?
>
> The application can wait indefinitely if required; see the
> RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option.
>
> A trivial application may not need different wait values on each
> dequeue. This is a performance optimization opportunity for the
> implementation.

Jerin,

Irrespective of the wait configuration, i.e. whether you are using a
per-device wait or a per-dequeue wait, can the value MAX_U32 or MAX_U64
be treated as an infinite wait?

> > > +	/**< Configured global dequeue wait delay(ns) for this device */
> > > +	uint8_t max_event_queues;
> > > +	/**< Maximum event_queues supported by this device */
> > > +	uint32_t max_event_queue_flows;
> > > +	/**< Maximum supported flows in an event queue by this device */
> > > +	uint8_t max_event_queue_priority_levels;
> > > +
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Hi Jerin,

This looks reasonable and seems a welcome addition to DPDK. A few
questions noted inline:

On Tue, Oct 11, 2016 at 2:30 PM, Jerin Jacob wrote:
> Thanks to Intel and NXP folks for the positive and constructive feedback
> I've received so far. Here is the updated RFC (v2).
>
> I've attempted to address as many comments as possible.
>
> This series adds rte_eventdev.h to the DPDK tree with adequate
> documentation in doxygen format.
>
> Updates are also available online:
>
> Related draft header file (this patch):
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
>
> PDF version (doxygen output):
> https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
>
> Repo:
> https://github.com/jerinjacobk/libeventdev
>
> v1..v2
>
> - Added Cavium, Intel, NXP copyrights in header file
>
> - Changed the concept of flow queues to flow ids.
>   This is to avoid dictating a specific structure to hold the flows.
>   A s/w implementation can do atomic load balancing on multiple flow ids
>   more efficiently than maintaining each event in a specific flow queue.
>
> - Changed the scheduling group to event queue.
>   A scheduling group is more a stream of events, so an event queue is a
>   better abstraction.
>
> - Introduced the event port concept. Instead of tying eventdev access to
>   the lcore, a higher level of abstraction called event port is needed,
>   which is the application i/f to the eventdev to dequeue and enqueue
>   events. One or more event queues can be linked to a single event port.
>   There can be more than one event port per lcore, allowing multiple
>   lightweight threads to have their own i/f into the eventdev, if the
>   implementation supports it. An event port will be bound to an lcore or
>   a lightweight thread to keep a portable application workflow.
>   An event port abstraction also encapsulates dequeue depth and enqueue
>   depth for scheduler implementations which can schedule multiple events
>   at a time and output events that can be buffered.
>
> - Added configuration options to the event queue (nb_atomic_flows,
>   nb_atomic_order_sequences, single consumer etc.) and event port
>   (dequeue_queue_depth, enqueue_queue_depth etc.) to define the limits
>   on resource usage. (Useful for an optimized software implementation.)
>
> - Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS
>   schemes of priority handling
>
> - Added event port to event queue servicing priority.
>   This allows two event ports to connect to the same event queue with
>   different priorities.
>
> - Changed the workflow to schedule/dequeue/enqueue.
>   An implementation is free to define schedule as a NOOP.
>   A distributed s/w scheduler can use this to schedule events; a
>   centralized s/w scheduler can make this a NOOP on non-scheduler cores.
>
> - Removed the Cavium HW specific schedule_from_group API
>
> - Removed the Cavium HW specific ctxt_update/ctxt_wait APIs and
>   introduced a more generic "event pinning" concept. I.e. if the normal
>   workflow is dequeue -> do work based on event type -> enqueue, then a
>   pin_event argument to enqueue (where the pinned event is returned
>   through the normal dequeue) allows the application workflow to remain
>   the same whether or not an implementation supports it.
>
> - Added a dequeue() burst variant
>
> - Added the definition of a closed/open system - where an open system is
>   memory backed and a closed system eventdev has limited capacity.
>   In such systems, it is also useful to denote per event port how many
>   packets can be active in the system. This can serve as a threshold for
>   ethdev-like devices so they don't overwhelm core to core events.
>
> - Added the option to specify the maximum amount of time (in ns) the
>   application needs to wait on dequeue()
>
> - Removed the scheme of expressing the number of flows in log2 format
>
> Open items or items needing improvement:
>
> - Abstract the differences in event QoS management with the different
>   priority schemes available in different HW or SW implementations with
>   a portable application workflow.
>
>   Based on the feedback, there are three different kinds of QoS support
>   available in three different HW or SW implementations:
>   1) Priority associated with the event queue
>   2) Priority associated with each event enqueue
>      (the same flow can have two different priorities on two separate
>      enqueues)
>   3) Priority associated with the flow (each flow has a unique priority)
>
>   In v2, the differences are abstracted based on device capability
>   (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
>   RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
>   This scheme would call for different application workflows for
>   nontrivial QoS-enabled applications.
>
> Looking forward to getting comments from both the application and driver
> implementation perspectives.
>
> /Jerin
>
> ---
>  doc/api/doxy-api-index.md          |    1 +
>  doc/api/doxy-api.conf              |    1 +
>  lib/librte_eventdev/rte_eventdev.h | 1204
[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK
Thanks to Intel and NXP folks for the positive and constructive feedback
I've received so far. Here is the updated RFC (v2).

I've attempted to address as many comments as possible.

This series adds rte_eventdev.h to the DPDK tree with adequate
documentation in doxygen format.

Updates are also available online:

Related draft header file (this patch):
https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h

PDF version (doxygen output):
https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf

Repo:
https://github.com/jerinjacobk/libeventdev

v1..v2

- Added Cavium, Intel, NXP copyrights in header file

- Changed the concept of flow queues to flow ids.
  This is to avoid dictating a specific structure to hold the flows.
  A s/w implementation can do atomic load balancing on multiple flow ids
  more efficiently than maintaining each event in a specific flow queue.

- Changed the scheduling group to event queue.
  A scheduling group is more a stream of events, so an event queue is a
  better abstraction.

- Introduced the event port concept. Instead of tying eventdev access to
  the lcore, a higher level of abstraction called event port is needed,
  which is the application i/f to the eventdev to dequeue and enqueue
  events. One or more event queues can be linked to a single event port.
  There can be more than one event port per lcore, allowing multiple
  lightweight threads to have their own i/f into the eventdev, if the
  implementation supports it. An event port will be bound to an lcore or
  a lightweight thread to keep a portable application workflow.
  An event port abstraction also encapsulates dequeue depth and enqueue
  depth for scheduler implementations which can schedule multiple events
  at a time and output events that can be buffered.

- Added configuration options to the event queue (nb_atomic_flows,
  nb_atomic_order_sequences, single consumer etc.) and event port
  (dequeue_queue_depth, enqueue_queue_depth etc.) to define the limits
  on resource usage. (Useful for an optimized software implementation.)

- Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS
  schemes of priority handling

- Added event port to event queue servicing priority.
  This allows two event ports to connect to the same event queue with
  different priorities.

- Changed the workflow to schedule/dequeue/enqueue.
  An implementation is free to define schedule as a NOOP.
  A distributed s/w scheduler can use this to schedule events; a
  centralized s/w scheduler can make this a NOOP on non-scheduler cores.

- Removed the Cavium HW specific schedule_from_group API

- Removed the Cavium HW specific ctxt_update/ctxt_wait APIs and
  introduced a more generic "event pinning" concept. I.e. if the normal
  workflow is dequeue -> do work based on event type -> enqueue, then a
  pin_event argument to enqueue (where the pinned event is returned
  through the normal dequeue) allows the application workflow to remain
  the same whether or not an implementation supports it.

- Added a dequeue() burst variant

- Added the definition of a closed/open system - where an open system is
  memory backed and a closed system eventdev has limited capacity.
  In such systems, it is also useful to denote per event port how many
  packets can be active in the system. This can serve as a threshold for
  ethdev-like devices so they don't overwhelm core to core events.

- Added the option to specify the maximum amount of time (in ns) the
  application needs to wait on dequeue()

- Removed the scheme of expressing the number of flows in log2 format

Open items or items needing improvement:

- Abstract the differences in event QoS management with the different
  priority schemes available in different HW or SW implementations with
  a portable application workflow.

  Based on the feedback, there are three different kinds of QoS support
  available in three different HW or SW implementations:
  1) Priority associated with the event queue
  2) Priority associated with each event enqueue
     (the same flow can have two different priorities on two separate
     enqueues)
  3) Priority associated with the flow (each flow has a unique priority)

  In v2, the differences are abstracted based on device capability
  (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
  RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
  This scheme would call for different application workflows for
  nontrivial QoS-enabled applications.

Looking forward to getting comments from both the application and driver
implementation perspectives.

/Jerin

---
 doc/api/doxy-api-index.md          |    1 +
 doc/api/doxy-api.conf              |    1 +
 lib/librte_eventdev/rte_eventdev.h | 1204
 3 files changed, 1206 insertions(+)
 create mode 100644 lib/librte_eventdev/rte_eventdev.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 6675f96..28c1329 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -40,6 +40,7 @@ There are many libraries, so their headers may be grouped by topics: