[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 01:56:27PM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote:
> > On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> > > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > > How about making default as "mixed" and let application 
> > > > > > > configures what
> > > > > > > is not required?. That way application responsibility is clear.
> > > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > > > with default.
> > > > > > > 
> > > > > > I suppose it could work, but why bother doing that? If an app knows 
> > > > > > it's
> > > > > > only going to use one traffic type, why not let it just state what 
> > > > > > it
> > > > > > will do rather than try to specify what it won't do. If mixed is 
> > > > > > needed,
> > > > > 
> > > > > My thought was more inline with ethdev spec, like, ref-count is 
> > > > > default,
> > > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But 
> > > > > it is OK, if
> > > > > you need other way.
> > > > > 
> > > > > > then it's easy enough to specify - and we can make it the 
> > > > > > zero/default
> > > > > > value too.
> > > > > 
> > > > > OK. Then we will make MIX as zero/default and add 
> > > > > "allowed_event_types" in
> > > > > event queue config.
> > > > >
> > > > 
> > > > Bruce,
> > > > 
> > > > I have tried to make it as "allowed_event_types" in event queue config.
> > > > However, rte_event_queue_default_conf_get() can also take NULL for 
> > > > default
> > > > configuration. So I think, It makes sense to go with negation approach
> > > > like ethdev to define the default to avoid confusion on the default. So
> > > > I am thinking like below now,
> > > > 
> > > > ? [master][libeventdev] $ git diff
> > > > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > > > index cf22b0e..cac4642 100644
> > > > --- a/rte_eventdev.h
> > > > +++ b/rte_eventdev.h
> > > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > > > rte_event_dev_config *config);
> > > >   *
> > > >   *  \see rte_event_port_setup(), rte_event_port_link()
> > > >   */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > > > +/**< Skip configuring atomic schedule type resources */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > > > +/**< Skip configuring ordered schedule type resources */
> > > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3)
> > > > +/**< Skip configuring parallel schedule type resources */
> > > > 
> > > >  /** Event queue configuration structure */
> > > >  struct rte_event_queue_conf {
> > > > 
> > > > Thoughts?
> > > > 
> > > 
> > > I'm ok with the default as being all types, in the case where NULL is
> > > specified for the parameter. It does make the most sense.
> > 
> > Yes. That case I need to explicitly mention in the documentation about what
> > is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite
> > understood what is default. Not adding up? :-)
> > 
> 
> Would below not work? DEFAULT explicitly stated, and can be commented to
> say all types allowed.

All I was trying to do was avoid explicitly stating the default state. It's not
worth going back and forth on slow-path configuration, so I will keep it as
positive logic as you suggested :-), inspired by PKT_TX_L4_MASK:

#define RTE_EVENT_QUEUE_CFG_TYPE_MASK       (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES       (0ULL << 0) /**< Enable all types */
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY     (1ULL << 0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY    (2ULL << 0)
#define RTE_EVENT_QUEUE_CFG_PARALLEL_ONLY   (3ULL << 0)
#define RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER (1ULL << 2)
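
For illustration, a minimal sketch of requesting an atomic-only queue with this
scheme. The event_queue_cfg field name and the default-conf-get/setup call
signatures are assumptions along the lines of the v2 RFC draft, not final API:

/* Hypothetical usage sketch only; field and function signatures are assumed */
uint8_t dev_id = 0, queue_id = 0;
struct rte_event_queue_conf conf;

rte_event_queue_default_conf_get(dev_id, queue_id, &conf); /* assumed: fill defaults */
conf.event_queue_cfg &= ~RTE_EVENT_QUEUE_CFG_TYPE_MASK;    /* clear the type bits */
conf.event_queue_cfg |= RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY;   /* atomic events only */
rte_event_queue_setup(dev_id, queue_id, &conf);            /* assumed setup call */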

> 
> #define RTE_EVENT_QUEUE_CFG_DEFAULT 0
> #define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT
> #define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0)
> #define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1) 
> 
> 
> /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > How about making default as "mixed" and let application configures 
> > > > > what
> > > > > is not required?. That way application responsibility is clear.
> > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > with default.
> > > > > 
> > > > I suppose it could work, but why bother doing that? If an app knows it's
> > > > only going to use one traffic type, why not let it just state what it
> > > > will do rather than try to specify what it won't do. If mixed is needed,
> > > 
> > > My thought was more inline with ethdev spec, like, ref-count is default,
> > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it 
> > > is OK, if
> > > you need other way.
> > > 
> > > > then it's easy enough to specify - and we can make it the zero/default
> > > > value too.
> > > 
> > > OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> > > event queue config.
> > >
> > 
> > Bruce,
> > 
> > I have tried to make it as "allowed_event_types" in event queue config.
> > However, rte_event_queue_default_conf_get() can also take NULL for default
> > configuration. So I think, It makes sense to go with negation approach
> > like ethdev to define the default to avoid confusion on the default. So
> > I am thinking like below now,
> > 
> > ? [master][libeventdev] $ git diff
> > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > index cf22b0e..cac4642 100644
> > --- a/rte_eventdev.h
> > +++ b/rte_eventdev.h
> > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > rte_event_dev_config *config);
> >   *
> >   *  \see rte_event_port_setup(), rte_event_port_link()
> >   */
> > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > +/**< Skip configuring atomic schedule type resources */
> > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > +/**< Skip configuring ordered schedule type resources */
> > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3)
> > +/**< Skip configuring parallel schedule type resources */
> > 
> >  /** Event queue configuration structure */
> >  struct rte_event_queue_conf {
> > 
> > Thoughts?
> > 
> 
> I'm ok with the default as being all types, in the case where NULL is
> specified for the parameter. It does make the most sense.

Yes. In that case I need to explicitly mention in the documentation what the
default is. With the RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme the default is
readily understood. Does that not add up? :-)

> 
> However, for the cases where the user does specify what they want, I
> think it does make more sense, and is easier on the user for things to
> be specified in a positive, rather than negative sense. For a user who
> wants to just use atomic events, having to specify that as "not-reordered
> and not-unordered" just isn't as clear! :-)
> 
> /Bruce
> 


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:48:37AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > > Sent: Tuesday, October 25, 2016 6:49 PM
> > > 
> > > > 
> > > > Hi Community,
> > > > 
> > > > So far, I have received constructive feedback from Intel, NXP and 
> > > > Linaro folks.
> > > > Let me know, if anyone else interested in contributing to the 
> > > > definition of eventdev?
> > > > 
> > > > If there are no major issues in proposed spec, then Cavium would like 
> > > > work on
> > > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > > in next version).
> > > 
> > > 
> > > Hi All,
> > > 
> > > I've been looking at the eventdev API from a use-case point of view, and 
> > > I'm unclear on a how the API caters for two uses. I have simplified these 
> > > as much as possible, think of them as a theoretical unit-test for the API 
> > > :)
> > > 
> > > 
> > > Fragmentation:
> > > 1. Dequeue 8 packets
> > > 2. Process 2 packets
> > > 3. Processing 3rd, this packet needs fragmentation into two packets
> > > 4. Process remaining 5 packets as normal
> > > 
> > > What function calls does the application make to achieve this?
> > > In particular, I'm referring to how can the scheduler know that the 3rd 
> > > packet is the one being fragmented, and how to keep packet order valid. 
> > > 
> > 
> > OK. I will try to share my views on IP fragmentation on event _HW_
> > models(at least on Cavium HW) then we can see, how we can converge.
> > 
> > First, The fragmentation specific logic should be decoupled from the event
> > model as it specific to packet and L3 layer(Not specific to generic event)
> > 
> I would view fragmentation as just one example of a workload like this,
> multicast and broadcast may be two other cases. Yes, they all apply to
> packet, but the general feature support is just how to provide support
> for one event generating multiple further events which should be linked
> together for reordering. [I think this only really applies in the

AFAIK, there are two different schemes to maintain ordering. The first is based
on "reordering buffers", i.e. a list data structure that holds the events and,
at the point where the order is corrected (ORDERED->ATOMIC), restores the order
based on that reordering buffer. But some HW implementations use a "port"-state
based reordering scheme (i.e. no external reorder buffer to keep track of the
order).

So I think, to have a portable application workflow for the use case where
multiple events are generated from one event, the generated events need to be
stored in the parent event and processed downstream as required, like the
fragmentation example in

http://dpdk.org/ml/archives/dev/2016-November/049707.html

The above scheme should be OK in your implementation, right?


> reordered case - which leads to another question: in your experience
> do you see other event types other than packet being handled in a
> "reordered" manner?]

We use both timer events and crypto completion events etc. in ORDERED type,
but not a scheme where one event creates N events on those.

> 
> /Bruce
> 


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Nov 02, 2016 at 11:45:07AM +, Bruce Richardson wrote:
> On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote:
> > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > > 
> > > > So far, I have received constructive feedback from Intel, NXP and 
> > > > Linaro folks.
> > > > Let me know, if anyone else interested in contributing to the 
> > > > definition of eventdev?
> > > > 
> > > > If there are no major issues in proposed spec, then Cavium would like 
> > > > work on
> > > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > > in next version).
> > >
> > 
> > Hi All,
> > 
> > Two queries,
> > 
> > 1) In SW implementation, Is their any connection between "struct
> > rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ?
> > i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ?
> > Thought of adding the common checks in common layer.
> 
> I think this is probably best left to the driver layers to enforce. For
> us, such a restriction doesn't really make sense, though in many cases
> that would be the usual setup. For accurate load balancing, the dequeue
> queue depth would be small, and the burst size would probably equal the
> queue depth, meaning the enqueue depth needs to be at least as big.
> However, for better throughput, or in cases where all traffic is being
> coalesced to a single core e.g. for transmit out a network port, there
> is no need to keep the dequeue queue shallow and so it can be many times
> the burst size, while the enqueue queue can be kept to 1-2 times the
> burst size.
> 

OK

> > 
> > 2)Any comments on follow item(section under ) that needs improvement.
> > ---
> > Abstract the differences in event QoS management with different
> > priority schemes available in different HW or SW implementations with 
> > portable
> > application workflow.
> > 
> > Based on the feedback, there three different kinds of QoS support
> > available in
> > three different HW or SW implementations.
> > 1) Priority associated with the event queue
> > 2) Priority associated with each event enqueue
> > (Same flow can have two different priority on two separate enqueue)
> > 3) Priority associated with the flow(each flow has unique priority)
> > 
> > In v2, The differences abstracted based on device capability
> > (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
> > RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
> > This scheme would call for different application workflow for
> > nontrivial QoS-enabled applications.
> > ---
> > After thinking a while, I think, RTE_EVENT_DEV_CAP_EVENT_QOS is a
> > super-set.if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be
> > implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e We may not need two
> > flags, Just one flag RTE_EVENT_DEV_CAP_EVENT_QOS is enough to fix
> > portability issue with basic QoS enabled applications.
> > 
> > i.e Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as config option in device
> > configure stage if application needs fine granularity on QoS per event
> > enqueue.For trivial applications, configured
> > rte_event_queue_conf->priority can be used as rte_event_enqueue(struct
> > rte_event.priority)
> > 
> So all implementations should support the concept of priority among
> queues, and then there is optional support for event or flow based
> prioritization. Is that a correct interpretation of what you propose?

Yes, if you _can_ implement it and it is possible in the system.

> 
> /Bruce
> 


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > > > -Original Message-
> > > > rte_event_queue_conf, with possible values:
> > > > * atomic
> > > > * ordered
> > > > * parallel
> > > > * mixed - allowing all 3 types. I think allowing 2 of three types might
> > > > make things too complicated.
> > > > 
> > > > An open question would then be how to behave when the queue type and
> > > > requested event type conflict. We can either throw an error, or just
> > > > ignore the event type and always treat enqueued events as being of the
> > > > queue type. I prefer the latter, because it's faster not having to
> > > > error-check, and it pushes the responsibility on the app to know what
> > > > it's doing.
> > > 
> > > How about making default as "mixed" and let application configures what
> > > is not required?. That way application responsibility is clear.
> > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> > > with default.
> > > 
> > I suppose it could work, but why bother doing that? If an app knows it's
> > only going to use one traffic type, why not let it just state what it
> > will do rather than try to specify what it won't do. If mixed is needed,
> 
> My thought was more inline with ethdev spec, like, ref-count is default,
> if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is 
> OK, if
> you need other way.
> 
> > then it's easy enough to specify - and we can make it the zero/default
> > value too.
> 
> OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> event queue config.
>

Bruce,

I have tried making it an "allowed_event_types" field in the event queue config.
However, rte_event_queue_default_conf_get() can also take NULL for the default
configuration. So I think it makes sense to go with a negation approach, like
ethdev, to define the default and avoid confusion about what the default is.
I am now thinking along the lines below:

? [master][libeventdev] $ git diff
diff --git a/rte_eventdev.h b/rte_eventdev.h
index cf22b0e..cac4642 100644
--- a/rte_eventdev.h
+++ b/rte_eventdev.h
@@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
rte_event_dev_config *config);
  *
  *  \see rte_event_port_setup(), rte_event_port_link()
  */
+#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE   (1ULL << 1)
+/**< Skip configuring atomic schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE  (1ULL << 2)
+/**< Skip configuring ordered schedule type resources */
+#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE (1ULL << 3)
+/**< Skip configuring parallel schedule type resources */

 /** Event queue configuration structure */
 struct rte_event_queue_conf {
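
As an illustration only (not part of the diff), an atomic-only queue under this
negation scheme would be requested roughly as below; the event_queue_cfg field
name and the default-conf-get signature are assumptions:

/* Hypothetical usage sketch of the negation flags above */
uint8_t dev_id = 0, queue_id = 0;
struct rte_event_queue_conf conf;

rte_event_queue_default_conf_get(dev_id, queue_id, &conf);  /* assumed signature */
conf.event_queue_cfg = RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE |
                       RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE; /* leaves only atomic enabled */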

Thoughts?


> /Jerin
> 
> > 
> > Our software implementation for now, only supports one type per queue -
> > which we suspect should meet a lot of use-cases. We'll have to see about
> > adding in mixed types in future.
> > 
> > /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
>

Hi All,

Two queries,

1) In the SW implementation, is there any connection between "struct
rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth?
i.e. should it be enqueue_queue_depth >= dequeue_queue_depth?
I thought of adding the common checks in the common layer.

2) Any comments on the following item that needs improvement?
---
Abstract the differences in event QoS management across the different
priority schemes available in different HW or SW implementations, with a
portable application workflow.

Based on the feedback, there are three different kinds of QoS support
available in three different HW or SW implementations:
1) Priority associated with the event queue
2) Priority associated with each event enqueue
(The same flow can have two different priorities on two separate enqueues)
3) Priority associated with the flow (each flow has a unique priority)

In v2, the differences are abstracted based on device capability
(RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
This scheme would call for different application workflows for
nontrivial QoS-enabled applications.
---
After thinking about it for a while, I believe RTE_EVENT_DEV_CAP_EVENT_QOS is a
super-set; if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be implemented
with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e. we may not need two flags; just one
flag, RTE_EVENT_DEV_CAP_EVENT_QOS, is enough to fix the portability issue with
basic QoS-enabled applications.

i.e. introduce RTE_EVENT_DEV_CAP_EVENT_QOS as a config option at the device
configure stage if the application needs fine granularity of QoS per event
enqueue. For trivial applications, the configured
rte_event_queue_conf->priority can be used as the priority at enqueue time
(struct rte_event.priority in rte_event_enqueue())
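
A rough sketch of what that would look like from the application side; the
info-get call, the event_dev_cap field and the literal priority values are
assumptions based on the RFC direction, not final API:

/* Hypothetical sketch: use per-event QoS only when the capability is present */
uint8_t dev_id = 0;
struct rte_event_dev_info info;
struct rte_event ev = { .flow_id = 10, .queue_id = 0 };   /* illustrative values */

rte_event_dev_info_get(dev_id, &info);                    /* assumed capability query */
if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_QOS)
    ev.priority = 7;  /* fine-grained QoS decided per enqueue */
else
    ev.priority = 0;  /* rely on rte_event_queue_conf->priority set at queue setup */
/* ... enqueue ev as usual ... */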

Thoughts?

/Jerin




[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 03:16:18PM +0100, Bruce Richardson wrote:
> On Fri, Oct 28, 2016 at 02:48:57PM +0100, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Tuesday, October 25, 2016 6:49 PM
> > 
> > > 
> > > Hi Community,
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > 
> > Hi All,
> > 
> > I've been looking at the eventdev API from a use-case point of view, and 
> > I'm unclear on a how the API caters for two uses. I have simplified these 
> > as much as possible, think of them as a theoretical unit-test for the API :)
> > 
> > 
> > Fragmentation:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs fragmentation into two packets
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > In particular, I'm referring to how can the scheduler know that the 3rd 
> > packet is the one being fragmented, and how to keep packet order valid. 
> > 
> > 
> > Dropping packets:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs to be dropped
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > Again, in particular how does the scheduler know that the 3rd packet is 
> > being dropped.
> > 
> > 
> > Regards, -Harry
> 
> Hi,
> 
> these questions apply particularly to reordered which has a lot more
> complications than the other types in terms of sending packets back into
> the scheduler. However, atomic types will still suffer from problems
> with things the way they are - again if we assume a burst of 8 packets,
> then to forward those packets, we need to re-enqueue them again to the
> scheduler, and also then send 8 releases to the scheduler as well, to
> release the atomic locks for those packets.
> This means that for each packet we have to send two messages to a
> scheduler core, something that is really inefficient.
> 
> This number of messages is critical for any software implementation, as
> the cost of moving items core-to-core is going to be a big bottleneck
> (perhaps the biggest bottleneck) in the system. It's for this reason we
> need to use burst APIs - as with rte_rings.

I agree. That is the reason why we have rte_event_*_burst().

> 
> How we have solved this in our implementation, is to allow there to be
> an event operation type. The four operations we implemented are as below
> (using packet as a synonym for event here, since these would mostly
> apply to packets flowing through a system):
> 
> * NEW - just a regular enqueue of a packet, without any previous context

Makes sense. I was trying to derive it; it makes sense for the application to
request it.

> * FORWARD - enqueue a packet, and mark the flow processing for the
> equivalent packet that was dequeued as completed, i.e.
>   release any atomic locks, or reorder this packet with
>   respect to any other outstanding packets from the event queue.

Default case

> * DROP- this is roughtly equivalent to the existing "release" API call,
> except that having it as an enqueue type allows us to
>   release multiple items in a single call, and also to mix
>   releases with new packets and forwarded packets

Yes. This maps to rte_event_release(); with the index parameter it is kind of
doing the job. But it makes sense as a flag to enable burst, and that calls for
removing the index parameter. It looks like the index parameter has an issue in
the Intel implementation. If so, maybe we (Cavium) can fill in the index at
dequeue as implementation-specific bits, as Harry suggested, and use it at
enqueue.
http://dpdk.org/ml/archives/dev/2016-October/049459.html

Any thoughts from NXP?

> * PARTIAL - this indicates that the packet being enqueued should be
>   treated according to the context of the current packet, but
>   that that context should not be released/completed by the
>   enqueue of this packet. This only really applies for
>   reordered events, and is needed to do fragmentation and or
>   multicast of packets with reordering.

I believe PARTIAL is something HW implementations will have trouble with.
I have outlined another way to fix this without coupling the fragmentation
logic into the scheduler:
http://dpdk.org/ml/archives/dev/2016-November/049707.html

If it makes sense for everyone, then maybe we can
- Introduce "event operation type" bits (NEW, DROP, FORWARD(may not required as 

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 06:39:27PM +0530, Jerin Jacob wrote:
> On Wed, Nov 02, 2016 at 11:35:51AM +, Bruce Richardson wrote:
> > On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> > > On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > > > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > How about making default as "mixed" and let application configures 
> > > > > > what
> > > > > > is not required?. That way application responsibility is clear.
> > > > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, 
> > > > > > ETH_TXQ_FLAGS_NOREFCOUNT
> > > > > > with default.
> > > > > > 
> > > > > I suppose it could work, but why bother doing that? If an app knows 
> > > > > it's
> > > > > only going to use one traffic type, why not let it just state what it
> > > > > will do rather than try to specify what it won't do. If mixed is 
> > > > > needed,
> > > > 
> > > > My thought was more inline with ethdev spec, like, ref-count is default,
> > > > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it 
> > > > is OK, if
> > > > you need other way.
> > > > 
> > > > > then it's easy enough to specify - and we can make it the zero/default
> > > > > value too.
> > > > 
> > > > OK. Then we will make MIX as zero/default and add "allowed_event_types" 
> > > > in
> > > > event queue config.
> > > >
> > > 
> > > Bruce,
> > > 
> > > I have tried to make it as "allowed_event_types" in event queue config.
> > > However, rte_event_queue_default_conf_get() can also take NULL for default
> > > configuration. So I think, It makes sense to go with negation approach
> > > like ethdev to define the default to avoid confusion on the default. So
> > > I am thinking like below now,
> > > 
> > > ? [master][libeventdev] $ git diff
> > > diff --git a/rte_eventdev.h b/rte_eventdev.h
> > > index cf22b0e..cac4642 100644
> > > --- a/rte_eventdev.h
> > > +++ b/rte_eventdev.h
> > > @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> > > rte_event_dev_config *config);
> > >   *
> > >   *  \see rte_event_port_setup(), rte_event_port_link()
> > >   */
> > > +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> > > +/**< Skip configuring atomic schedule type resources */
> > > +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> > > +/**< Skip configuring ordered schedule type resources */
> > > +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3)
> > > +/**< Skip configuring parallel schedule type resources */
> > > 
> > >  /** Event queue configuration structure */
> > >  struct rte_event_queue_conf {
> > > 
> > > Thoughts?
> > > 
> > 
> > I'm ok with the default as being all types, in the case where NULL is
> > specified for the parameter. It does make the most sense.
> 
> Yes. That case I need to explicitly mention in the documentation about what
> is default case. With RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE scheme it quite
> understood what is default. Not adding up? :-)
> 

Would the below not work? DEFAULT is explicitly stated, and can be commented
to say all types are allowed.

#define RTE_EVENT_QUEUE_CFG_DEFAULT 0
#define RTE_EVENT_QUEUE_CFG_ALL_TYPES RTE_EVENT_QUEUE_CFG_DEFAULT
#define RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY (1<<0)
#define RTE_EVENT_QUEUE_CFG_ORDERED_ONLY (1<<1) 


/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Tuesday, October 25, 2016 6:49 PM
> 
> > 
> > Hi Community,
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
> 
> 
> Hi All,
> 
> I've been looking at the eventdev API from a use-case point of view, and I'm 
> unclear on a how the API caters for two uses. I have simplified these as much 
> as possible, think of them as a theoretical unit-test for the API :)
> 
> 
> Fragmentation:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs fragmentation into two packets
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> In particular, I'm referring to how can the scheduler know that the 3rd 
> packet is the one being fragmented, and how to keep packet order valid. 
> 

OK. I will try to share my views on IP fragmentation on event _HW_
models (at least on Cavium HW), then we can see how we can converge.

First, the fragmentation-specific logic should be decoupled from the event
model, as it is specific to packets and the L3 layer (not to generic events).

Now, let us consider fragmentation handling in the non-burst case with a
single flow. The following text outlines the event flow:

a) Set up an event device with a single event queue
b) Link multiple ports to the single event queue
c) The event producer enqueues packets p0..p7 to the event queue with ORDERED
type (let's assume packet p2 needs to be fragmented, i.e. the application
needs to create p2.0 and p2.1 from p2)
d) Since it is ORDERED type, packets p0 to p7 are distributed to multiple
ports in parallel (assigned to each lcore or lightweight thread)
e) Each lcore/lightweight thread gets a packet from its designated event port,
processes it in parallel, and enqueues it back with ATOMIC type to maintain
ordering
f) The lcore that dequeues packet p2 understands that it needs to be
fragmented due to MTU size etc., so it calls rte_ipv4_fragment_packet()
and stores the fragments p2.0 and p2.1 in the private area of the p2 mbuf.
As usual, like the other workers, it enqueues p2 to the atomic queue to
maintain the order.
g) On the atomic flow, when an lcore dequeues packets, they come in order
p0..p7. The application sends p0 to p7 on the wire. When the application
checks the p2 mbuf private area, it understands p2 has been fragmented and
then sends p2.0 and p2.1 on the wire.

OR

skip the fragmentation step in (f) and, in step (g) while processing p2, run
rte_ipv4_fragment_packet() to split the packet and transmit the fragments
(in case the application doesn't want to deal with the mbuf private area).
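
To make step (f) of the first option concrete, here is a minimal worker-side
sketch. The event->mbuf field, the frag_priv layout, MAX_FRAGS, FRAG_MTU and
the mempool parameters are illustrative assumptions, not part of the spec:

#include <rte_mbuf.h>
#include <rte_ip_frag.h>

#define MAX_FRAGS 4        /* illustrative upper bound */
#define FRAG_MTU  1500     /* illustrative MTU */

struct frag_priv {                     /* lives in the mbuf private area */
    int32_t nb_frags;
    struct rte_mbuf *frags[MAX_FRAGS];
};

/* Hypothetical worker step (f): fragment p2, park the fragments in its
 * private area, and let step (g) transmit them in order. */
static void
worker_fragment(struct rte_event *ev, struct rte_mempool *direct_pool,
                struct rte_mempool *indirect_pool)
{
    struct rte_mbuf *m = ev->mbuf;                        /* assumed event field */
    struct frag_priv *priv = (struct frag_priv *)(m + 1); /* private area follows the
                                                             mbuf, assuming priv_size
                                                             >= sizeof(*priv) */

    priv->nb_frags = rte_ipv4_fragment_packet(m, priv->frags, MAX_FRAGS,
                                              FRAG_MTU, direct_pool,
                                              indirect_pool);
    /* now enqueue m (p2) back to the ATOMIC queue as usual (not shown) */
}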

Now, when it comes to the BURST scheme, we are planning to create a SW
structure as a virtual event port and associate N
(N = rte_event_port_dequeue_depth()) physical HW event ports with the virtual
port. That way, it just comes as an extension to the non-burst API, and on the
release call the explicit "index" identifies the physical event port
associated with the virtual port.
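
Purely as an illustration of that idea, the SW structure could look roughly
like the below; the names and the MAX_DEQ_DEPTH bound are made up for the
sketch:

#define MAX_DEQ_DEPTH 16                 /* illustrative upper bound */

/* Hypothetical virtual event port wrapping N physical HW event ports */
struct virt_event_port {
    uint16_t nb_hw_ports;                /* N = rte_event_port_dequeue_depth() */
    uint16_t hw_port[MAX_DEQ_DEPTH];     /* physical HW event port per burst index */
};

/* On rte_event_release(dev_id, port_id, index), the driver would look up
 * hw_port[index] to find which physical port's context to release. */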

/Jerin

> 
> Dropping packets:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs to be dropped
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> Again, in particular how does the scheduler know that the 3rd packet is being 
> dropped.

rte_event_release(..,..,3)??
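
i.e. something along the lines of the rough sketch below; the dequeue-burst
signature, the mbuf field and the process_event() helper are assumptions for
illustration only:

/* Hypothetical sketch of the drop case: free the 3rd packet and release its
 * scheduler context; all signatures/fields here are assumptions. */
extern void process_event(struct rte_event *ev);  /* hypothetical handler */
uint8_t dev_id = 0, port_id = 0;
struct rte_event ev[8];
uint16_t i, nb;

nb = rte_event_dequeue_burst(dev_id, port_id, ev, 8, 0);  /* assumed signature */
for (i = 0; i < nb; i++) {
    if (i == 2) {
        rte_pktmbuf_free(ev[i].mbuf);            /* drop the packet itself */
        rte_event_release(dev_id, port_id, i);   /* scheduler: slot i is done */
    } else {
        process_event(&ev[i]);
    }
}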

> 
> 
> Regards, -Harry


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 01:36:34PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 01:48:57PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Tuesday, October 25, 2016 6:49 PM
> > 
> > > 
> > > Hi Community,
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > 
> > Hi All,
> > 
> > I've been looking at the eventdev API from a use-case point of view, and 
> > I'm unclear on a how the API caters for two uses. I have simplified these 
> > as much as possible, think of them as a theoretical unit-test for the API :)
> > 
> > 
> > Fragmentation:
> > 1. Dequeue 8 packets
> > 2. Process 2 packets
> > 3. Processing 3rd, this packet needs fragmentation into two packets
> > 4. Process remaining 5 packets as normal
> > 
> > What function calls does the application make to achieve this?
> > In particular, I'm referring to how can the scheduler know that the 3rd 
> > packet is the one being fragmented, and how to keep packet order valid. 
> > 
> 
> OK. I will try to share my views on IP fragmentation on event _HW_
> models(at least on Cavium HW) then we can see, how we can converge.
> 
> First, The fragmentation specific logic should be decoupled from the event
> model as it specific to packet and L3 layer(Not specific to generic event)
> 
I would view fragmentation as just one example of a workload like this;
multicast and broadcast may be two other cases. Yes, they all apply to
packets, but the general feature is how to provide support for one event
generating multiple further events which should be linked together for
reordering. [I think this only really applies in the reordered case - which
leads to another question: in your experience, do you see event types other
than packets being handled in a "reordered" manner?]

/Bruce



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 04:17:04PM +0530, Jerin Jacob wrote:
> On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> >
> 
> Hi All,
> 
> Two queries,
> 
> 1) In SW implementation, Is their any connection between "struct
> rte_event_port_conf"'s dequeue_queue_depth and enqueue_queue_depth ?
> i.e it should be enqueue_queue_depth >= dequeue_queue_depth. Right ?
> Thought of adding the common checks in common layer.

I think this is probably best left to the driver layers to enforce. For
us, such a restriction doesn't really make sense, though in many cases
that would be the usual setup. For accurate load balancing, the dequeue
queue depth would be small, and the burst size would probably equal the
queue depth, meaning the enqueue depth needs to be at least as big.
However, for better throughput, or in cases where all traffic is being
coalesced to a single core e.g. for transmit out a network port, there
is no need to keep the dequeue queue shallow and so it can be many times
the burst size, while the enqueue queue can be kept to 1-2 times the
burst size.
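
As a rough illustration of those two setups (a burst size of 32 assumed; this
is just a sketch, with other rte_event_port_conf fields left at defaults):

/* Hypothetical sketch of the two port setups described above */
struct rte_event_port_conf lb_port = {   /* accurate load balancing */
    .dequeue_queue_depth = 32,           /* shallow: roughly the burst size */
    .enqueue_queue_depth = 32,           /* at least as big as the burst */
};

struct rte_event_port_conf tx_port = {   /* traffic coalesced to one TX core */
    .dequeue_queue_depth = 128,          /* can be many times the burst size */
    .enqueue_queue_depth = 64,           /* 1-2 times the burst size */
};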

> 
> 2)Any comments on follow item(section under ) that needs improvement.
> ---
> Abstract the differences in event QoS management with different
> priority schemes available in different HW or SW implementations with portable
> application workflow.
> 
> Based on the feedback, there three different kinds of QoS support
> available in
> three different HW or SW implementations.
> 1) Priority associated with the event queue
> 2) Priority associated with each event enqueue
> (Same flow can have two different priority on two separate enqueue)
> 3) Priority associated with the flow(each flow has unique priority)
> 
> In v2, The differences abstracted based on device capability
> (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
> RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
> This scheme would call for different application workflow for
> nontrivial QoS-enabled applications.
> ---
> After thinking a while, I think, RTE_EVENT_DEV_CAP_EVENT_QOS is a
> super-set.if so, the subset RTE_EVENT_DEV_CAP_QUEUE_QOS can be
> implemented with RTE_EVENT_DEV_CAP_EVENT_QOS. i.e We may not need two
> flags, Just one flag RTE_EVENT_DEV_CAP_EVENT_QOS is enough to fix
> portability issue with basic QoS enabled applications.
> 
> i.e Introduce RTE_EVENT_DEV_CAP_EVENT_QOS as config option in device
> configure stage if application needs fine granularity on QoS per event
> enqueue.For trivial applications, configured
> rte_event_queue_conf->priority can be used as rte_event_enqueue(struct
> rte_event.priority)
> 
So all implementations should support the concept of priority among
queues, and then there is optional support for event or flow based
prioritization. Is that a correct interpretation of what you propose?

/Bruce



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-11-02 Thread Bruce Richardson
On Wed, Nov 02, 2016 at 04:55:22PM +0530, Jerin Jacob wrote:
> On Fri, Oct 28, 2016 at 02:36:48PM +0530, Jerin Jacob wrote:
> > On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> > > On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > > > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > > > > -Original Message-
> > > > > rte_event_queue_conf, with possible values:
> > > > > * atomic
> > > > > * ordered
> > > > > * parallel
> > > > > * mixed - allowing all 3 types. I think allowing 2 of three types 
> > > > > might
> > > > > make things too complicated.
> > > > > 
> > > > > An open question would then be how to behave when the queue type and
> > > > > requested event type conflict. We can either throw an error, or just
> > > > > ignore the event type and always treat enqueued events as being of the
> > > > > queue type. I prefer the latter, because it's faster not having to
> > > > > error-check, and it pushes the responsibility on the app to know what
> > > > > it's doing.
> > > > 
> > > > How about making default as "mixed" and let application configures what
> > > > is not required?. That way application responsibility is clear.
> > > > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> > > > with default.
> > > > 
> > > I suppose it could work, but why bother doing that? If an app knows it's
> > > only going to use one traffic type, why not let it just state what it
> > > will do rather than try to specify what it won't do. If mixed is needed,
> > 
> > My thought was more inline with ethdev spec, like, ref-count is default,
> > if application need exception then set ETH_TXQ_FLAGS_NOREFCOUNT. But it is 
> > OK, if
> > you need other way.
> > 
> > > then it's easy enough to specify - and we can make it the zero/default
> > > value too.
> > 
> > OK. Then we will make MIX as zero/default and add "allowed_event_types" in
> > event queue config.
> >
> 
> Bruce,
> 
> I have tried to make it as "allowed_event_types" in event queue config.
> However, rte_event_queue_default_conf_get() can also take NULL for default
> configuration. So I think, It makes sense to go with negation approach
> like ethdev to define the default to avoid confusion on the default. So
> I am thinking like below now,
> 
> ? [master][libeventdev] $ git diff
> diff --git a/rte_eventdev.h b/rte_eventdev.h
> index cf22b0e..cac4642 100644
> --- a/rte_eventdev.h
> +++ b/rte_eventdev.h
> @@ -429,6 +429,12 @@ rte_event_dev_configure(uint8_t dev_id, struct
> rte_event_dev_config *config);
>   *
>   *  \see rte_event_port_setup(), rte_event_port_link()
>   */
> +#define RTE_EVENT_QUEUE_CFG_NOATOMIC_TYPE  (1ULL << 1)
> +/**< Skip configuring atomic schedule type resources */
> +#define RTE_EVENT_QUEUE_CFG_NOORDERED_TYPE (1ULL << 2)
> +/**< Skip configuring ordered schedule type resources */
> +#define RTE_EVENT_QUEUE_CFG_NOPARALLEL_TYPE(1ULL << 3)
> +/**< Skip configuring parallel schedule type resources */
> 
>  /** Event queue configuration structure */
>  struct rte_event_queue_conf {
> 
> Thoughts?
> 

I'm ok with the default as being all types, in the case where NULL is
specified for the parameter. It does make the most sense.

However, for the cases where the user does specify what they want, I
think it does make more sense, and is easier on the user for things to
be specified in a positive, rather than negative sense. For a user who
wants to just use atomic events, having to specify that as "not-reordered
and not-unordered" just isn't as clear! :-)

/Bruce



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Bruce Richardson
On Fri, Oct 28, 2016 at 02:48:57PM +0100, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Tuesday, October 25, 2016 6:49 PM
> 
> > 
> > Hi Community,
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
> 
> 
> Hi All,
> 
> I've been looking at the eventdev API from a use-case point of view, and I'm 
> unclear on a how the API caters for two uses. I have simplified these as much 
> as possible, think of them as a theoretical unit-test for the API :)
> 
> 
> Fragmentation:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs fragmentation into two packets
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> In particular, I'm referring to how can the scheduler know that the 3rd 
> packet is the one being fragmented, and how to keep packet order valid. 
> 
> 
> Dropping packets:
> 1. Dequeue 8 packets
> 2. Process 2 packets
> 3. Processing 3rd, this packet needs to be dropped
> 4. Process remaining 5 packets as normal
> 
> What function calls does the application make to achieve this?
> Again, in particular how does the scheduler know that the 3rd packet is being 
> dropped.
> 
> 
> Regards, -Harry

Hi,

These questions apply particularly to reordered, which has a lot more
complications than the other types in terms of sending packets back into
the scheduler. However, atomic types will still suffer from problems
with things the way they are - again if we assume a burst of 8 packets,
then to forward those packets, we need to re-enqueue them again to the
scheduler, and also then send 8 releases to the scheduler as well, to
release the atomic locks for those packets.
This means that for each packet we have to send two messages to a
scheduler core, something that is really inefficient.

This number of messages is critical for any software implementation, as
the cost of moving items core-to-core is going to be a big bottleneck
(perhaps the biggest bottleneck) in the system. It's for this reason we
need to use burst APIs - as with rte_rings.

How we have solved this in our implementation, is to allow there to be
an event operation type. The four operations we implemented are as below
(using packet as a synonym for event here, since these would mostly
apply to packets flowing through a system):

* NEW - just a regular enqueue of a packet, without any previous context
* FORWARD - enqueue a packet, and mark the flow processing for the
            equivalent packet that was dequeued as completed, i.e.
            release any atomic locks, or reorder this packet with
            respect to any other outstanding packets from the event queue.
* DROP - this is roughly equivalent to the existing "release" API call,
         except that having it as an enqueue type allows us to
         release multiple items in a single call, and also to mix
         releases with new packets and forwarded packets
* PARTIAL - this indicates that the packet being enqueued should be
            treated according to the context of the current packet, but
            that that context should not be released/completed by the
            enqueue of this packet. This only really applies for
            reordered events, and is needed to do fragmentation and or
            multicast of packets with reordering.


Therefore, I think we need to use some of the bits just freed up in the
event structure to include an enqueue operation type. Without it, I just
can't see how the API can ever support burst operation on packets.
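
To show what that could look like at the API level, here is a rough sketch;
the ev.op field, the RTE_EVENT_OP_* names, the must_drop() helper and the
burst call signatures are all illustrative, not an agreed API:

/* Hypothetical sketch: one enqueue burst carrying both forwards and drops */
extern int must_drop(const struct rte_event *ev);  /* hypothetical app policy */
uint8_t dev_id = 0, port_id = 0;
struct rte_event ev[8];
uint16_t i, nb;

nb = rte_event_dequeue_burst(dev_id, port_id, ev, 8, 0);   /* assumed signature */
for (i = 0; i < nb; i++) {
    if (must_drop(&ev[i]))
        ev[i].op = RTE_EVENT_OP_DROP;      /* release the context, no new event */
    else
        ev[i].op = RTE_EVENT_OP_FORWARD;   /* re-enqueue and complete the context */
}
rte_event_enqueue_burst(dev_id, port_id, ev, nb);  /* one message batch to the scheduler */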

Regards,
/Bruce



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Jerin Jacob
On Fri, Oct 28, 2016 at 09:36:46AM +0100, Bruce Richardson wrote:
> On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> > On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > > -Original Message-
> > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Thanks. One other suggestion is that it might be useful to provide
> > > support for having typed queues explicitly in the API. Right now, when
> > > you create an queue, the queue_conf structure takes as parameters how
> > > many atomic flows that are needed for the queue, or how many reorder
> > > slots need to be reserved for it. This implicitly hints at the type of
> > > traffic which will be sent to the queue, but I'm wondering if it's
> > > better to make it explicit. There are certain optimisations that can be
> > > looked at if we know that a queue only handles packets of a particular
> > > type. [Not having to handle reordering when pulling events from a core
> > > can be a big win for software!].
> > 
> > If it helps in SW implementation, then I think we can add this in queue
> > configuration. 
> > 
> > > 
> > > How about adding: "allowed_event_types" as a field to
> > > rte_event_queue_conf, with possible values:
> > > * atomic
> > > * ordered
> > > * parallel
> > > * mixed - allowing all 3 types. I think allowing 2 of three types might
> > > make things too complicated.
> > > 
> > > An open question would then be how to behave when the queue type and
> > > requested event type conflict. We can either throw an error, or just
> > > ignore the event type and always treat enqueued events as being of the
> > > queue type. I prefer the latter, because it's faster not having to
> > > error-check, and it pushes the responsibility on the app to know what
> > > it's doing.
> > 
> > How about making default as "mixed" and let application configures what
> > is not required?. That way application responsibility is clear.
> > something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> > with default.
> > 
> I suppose it could work, but why bother doing that? If an app knows it's
> only going to use one traffic type, why not let it just state what it
> will do rather than try to specify what it won't do. If mixed is needed,

My thought was more in line with the ethdev spec: ref-count is the default,
and if the application needs an exception then it sets
ETH_TXQ_FLAGS_NOREFCOUNT. But it is OK if you want it the other way.

> then it's easy enough to specify - and we can make it the zero/default
> value too.

OK. Then we will make MIXED the zero/default value and add
"allowed_event_types" to the event queue config.

/Jerin

> 
> Our software implementation for now, only supports one type per queue -
> which we suspect should meet a lot of use-cases. We'll have to see about
> adding in mixed types in future.
> 
> /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Van Haaren, Harry
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Sent: Tuesday, October 25, 2016 6:49 PM

> 
> Hi Community,
> 
> So far, I have received constructive feedback from Intel, NXP and Linaro 
> folks.
> Let me know, if anyone else interested in contributing to the definition of 
> eventdev?
> 
> If there are no major issues in proposed spec, then Cavium would like work on
> implementing and up-streaming the common code(lib/librte_eventdev/) and
> an associated HW driver.(Requested minor changes of v2 will be addressed
> in next version).


Hi All,

I've been looking at the eventdev API from a use-case point of view, and I'm
unclear on how the API caters for two uses. I have simplified these as much
as possible; think of them as a theoretical unit-test for the API :)


Fragmentation:
1. Dequeue 8 packets
2. Process 2 packets
3. Processing 3rd, this packet needs fragmentation into two packets
4. Process remaining 5 packets as normal

What function calls does the application make to achieve this?
In particular, I'm referring to how can the scheduler know that the 3rd packet 
is the one being fragmented, and how to keep packet order valid. 


Dropping packets:
1. Dequeue 8 packets
2. Process 2 packets
3. Processing 3rd, this packet needs to be dropped
4. Process remaining 5 packets as normal

What function calls does the application make to achieve this?
Again, in particular how does the scheduler know that the 3rd packet is being 
dropped.


Regards, -Harry


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Van Haaren, Harry
> From: Vincent Jardin [mailto:vincent.jardin at 6wind.com]
> Sent: Wednesday, October 26, 2016 7:37 PM
> On 26 October 2016 at 2:11:26 PM, "Van Haaren, Harry" wrote:
> 
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> >>
> >> So far, I have received constructive feedback from Intel, NXP and Linaro 
> >> folks.
> >> Let me know, if anyone else interested in contributing to the definition of
> >> eventdev?
> >>
> >> If there are no major issues in proposed spec, then Cavium would like work 
> >> on
> >> implementing and up-streaming the common code(lib/librte_eventdev/) and
> >> an associated HW driver.(Requested minor changes of v2 will be addressed
> >> in next version).
> >
> > Hi All,
> >
> > I will propose a minor change to the rte_event struct, allowing some bits
> > to be implementation specific. Currently the rte_event struct has no space
> > to allow an implementation store any metadata about the event. For software
> > performance it would be really helpful if there are some bits available for
> > the implementation to keep some flags about each event.
> >
> > I suggest to rework the struct as below which opens 6 bits that were
> > otherwise wasted, and define them as implementation specific. By
> > implementation specific it is understood that the implementation can
> > overwrite any information stored in those bits, and the application must
> > not expect the data to remain after the event is scheduled.
> >
> > OLD:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t queue_id:8;
> > uint8_t  sched_type; /* Note only 2 bits of 8 are required */
> >
> > NEW:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the
> > enqueue types Ordered,Atomic,Parallel.*/
> > uint32_t implementation:6; /* available for implementation specific
> > metadata */
> > uint8_t queue_id; /* still 8 bits as before */
> 
> Bitfields are efficient on Octeon. What about the other CPUs you have in
> mind? x86 is not as efficient.

Given that the rte_event struct is 16 bytes and there's no free space to use,
I see no alternative to using bitfields in this case. Welcoming suggestions
for a better way to lay out the structure to avoid the bitfields.
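
For what it's worth, the proposed repacking still fits the original 4-byte
word; a quick compile-time sketch (the struct name here is illustrative, not
the RFC header):

#include <stdint.h>

/* Hypothetical check of the repacked first word of struct rte_event */
struct event_word0 {
    uint32_t flow_id:24;
    uint32_t sched_type:2;       /* 2 bits cover Ordered/Atomic/Parallel */
    uint32_t implementation:6;   /* implementation-specific metadata */
};

_Static_assert(sizeof(struct event_word0) == sizeof(uint32_t),
               "repacked word must stay 4 bytes so queue_id keeps its own byte");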

Regards, -Harry


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Bruce Richardson
On Fri, Oct 28, 2016 at 08:31:41AM +0530, Jerin Jacob wrote:
> On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> > On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Thanks. One other suggestion is that it might be useful to provide
> > support for having typed queues explicitly in the API. Right now, when
> > you create an queue, the queue_conf structure takes as parameters how
> > many atomic flows that are needed for the queue, or how many reorder
> > slots need to be reserved for it. This implicitly hints at the type of
> > traffic which will be sent to the queue, but I'm wondering if it's
> > better to make it explicit. There are certain optimisations that can be
> > looked at if we know that a queue only handles packets of a particular
> > type. [Not having to handle reordering when pulling events from a core
> > can be a big win for software!].
> 
> If it helps in SW implementation, then I think we can add this in queue
> configuration. 
> 
> > 
> > How about adding: "allowed_event_types" as a field to
> > rte_event_queue_conf, with possible values:
> > * atomic
> > * ordered
> > * parallel
> > * mixed - allowing all 3 types. I think allowing 2 of three types might
> > make things too complicated.
> > 
> > An open question would then be how to behave when the queue type and
> > requested event type conflict. We can either throw an error, or just
> > ignore the event type and always treat enqueued events as being of the
> > queue type. I prefer the latter, because it's faster not having to
> > error-check, and it pushes the responsibility on the app to know what
> > it's doing.
> 
> How about making default as "mixed" and let application configures what
> is not required?. That way application responsibility is clear.
> something similar to ETH_TXQ_FLAGS_NOMULTSEGS, ETH_TXQ_FLAGS_NOREFCOUNT
> with default.
> 
I suppose it could work, but why bother doing that? If an app knows it's
only going to use one traffic type, why not let it just state what it
will do rather than try to specify what it won't do. If mixed is needed,
then it's easy enough to specify - and we can make it the zero/default
value too.

Our software implementation, for now, only supports one type per queue,
which we suspect should meet a lot of use-cases. We'll have to see about
adding in mixed types in the future.

/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-28 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 01:54:14PM +0100, Bruce Richardson wrote:
> On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> > On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Thanks. One other suggestion is that it might be useful to provide
> support for having typed queues explicitly in the API. Right now, when
> you create an queue, the queue_conf structure takes as parameters how
> many atomic flows that are needed for the queue, or how many reorder
> slots need to be reserved for it. This implicitly hints at the type of
> traffic which will be sent to the queue, but I'm wondering if it's
> better to make it explicit. There are certain optimisations that can be
> looked at if we know that a queue only handles packets of a particular
> type. [Not having to handle reordering when pulling events from a core
> can be a big win for software!].

If it helps the SW implementation, then I think we can add this to the queue
configuration.

> 
> How about adding: "allowed_event_types" as a field to
> rte_event_queue_conf, with possible values:
> * atomic
> * ordered
> * parallel
> * mixed - allowing all 3 types. I think allowing 2 of three types might
> make things too complicated.
> 
> An open question would then be how to behave when the queue type and
> requested event type conflict. We can either throw an error, or just
> ignore the event type and always treat enqueued events as being of the
> queue type. I prefer the latter, because it's faster not having to
> error-check, and it pushes the responsibility on the app to know what
> it's doing.

How about making the default "mixed" and letting the application configure what
is not required? That way the application's responsibility is clear;
something similar to ETH_TXQ_FLAGS_NOMULTSEGS and ETH_TXQ_FLAGS_NOREFCOUNT
with their defaults.

/Jerin


> 
> /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-27 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 01:43:25PM +0100, Bruce Richardson wrote:
> On Tue, Oct 25, 2016 at 11:19:05PM +0530, Jerin Jacob wrote:
> > On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> > > Thanks to Intel and NXP folks for the positive and constructive feedback
> > > I've received so far. Here is the updated RFC(v2).
> > > 
> > > I've attempted to address as many comments as possible.
> > > 
> > > This series adds rte_eventdev.h to the DPDK tree with
> > > adequate documentation in doxygen format.
> > > 
> > > Updates are also available online:
> > > 
> > > Related draft header file (this patch):
> > > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> > > 
> > > PDF version(doxgen output):
> > > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> > > 
> > > Repo:
> > > https://github.com/jerinjacobk/libeventdev
> > >
> > 
> > Hi Community,
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
> > 
> > We are planning to submit the work for 17.02 or 17.05 release(based on
> > how implementation goes).
> > 
> 
> Hi Jerin,

Hi Bruce,

> 
> thanks for driving this. In terms of the common code framework, when
> would you see that you might have something to upstream for that? As you
> know, we've been working on a software implementation which we are now
> looking to move to the eventdev APIs, and which also needs this common
> code to support it. 
> 
> If it can accelerate this effort, we can perhaps provide as an RFC
> the common code part that we have implemented for our work, or else we
> are happy to migrate to use common code you provide if it can be
> upstreamed fairly soon.

I have already started on the common code framework. I will send the common code
as an RFC in a couple of days, with the vdev and PCI bus interfaces.

> 
> Regards,
> /Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Vincent Jardin


On 26 October 2016 at 2:11:26 PM, "Van Haaren, Harry" wrote:

>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
>>
>> So far, I have received constructive feedback from Intel, NXP and Linaro 
>> folks.
>> Let me know, if anyone else interested in contributing to the definition of 
>> eventdev?
>>
>> If there are no major issues in proposed spec, then Cavium would like work on
>> implementing and up-streaming the common code(lib/librte_eventdev/) and
>> an associated HW driver.(Requested minor changes of v2 will be addressed
>> in next version).
>
> Hi All,
>
> I will propose a minor change to the rte_event struct, allowing some bits 
> to be implementation specific. Currently the rte_event struct has no space 
> to allow an implementation store any metadata about the event. For software 
> performance it would be really helpful if there are some bits available for 
> the implementation to keep some flags about each event.
>
> I suggest to rework the struct as below which opens 6 bits that were 
> otherwise wasted, and define them as implementation specific. By 
> implementation specific it is understood that the implementation can 
> overwrite any information stored in those bits, and the application must 
> not expect the data to remain after the event is scheduled.
>
> OLD:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t queue_id:8;
>   uint8_t  sched_type; /* Note only 2 bits of 8 are required */
>
> NEW:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the 
> enqueue types Ordered,Atomic,Parallel.*/
>   uint32_t implementation:6; /* available for implementation specific 
> metadata */
>   uint8_t queue_id; /* still 8 bits as before */

Bitfields are efficient on Octeon. What about the other CPUs you have in
mind? x86 is not as efficient.


>
>
> Thoughts? -Harry




[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Jerin Jacob
On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > 
> > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > folks.
> > Let me know, if anyone else interested in contributing to the definition of 
> > eventdev?
> > 
> > If there are no major issues in proposed spec, then Cavium would like work 
> > on
> > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > an associated HW driver.(Requested minor changes of v2 will be addressed
> > in next version).
> 
> Hi All,
> 
> I will propose a minor change to the rte_event struct, allowing some bits to 
> be implementation specific. Currently the rte_event struct has no space to 
> allow an implementation store any metadata about the event. For software 
> performance it would be really helpful if there are some bits available for 
> the implementation to keep some flags about each event.

OK.

> 
> I suggest to rework the struct as below which opens 6 bits that were 
> otherwise wasted, and define them as implementation specific. By 
> implementation specific it is understood that the implementation can 
> overwrite any information stored in those bits, and the application must not 
> expect the data to remain after the event is scheduled.
> 
> OLD:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t queue_id:8;
>   uint8_t  sched_type; /* Note only 2 bits of 8 are required */
> 
> NEW:
> struct rte_event {
>   uint32_t flow_id:24;
>   uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the 
> enqueue types Ordered,Atomic,Parallel.*/
>   uint32_t implementation:6; /* available for implementation specific 
> metadata */
>   uint8_t queue_id; /* still 8 bits as before */
> 
> 
> Thoughts? -Harry

Looks good to me. I will add it in v3.




[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Bruce Richardson
On Wed, Oct 26, 2016 at 05:54:17PM +0530, Jerin Jacob wrote:
> On Wed, Oct 26, 2016 at 12:11:03PM +, Van Haaren, Harry wrote:
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > 
> > > So far, I have received constructive feedback from Intel, NXP and Linaro 
> > > folks.
> > > Let me know, if anyone else interested in contributing to the definition 
> > > of eventdev?
> > > 
> > > If there are no major issues in proposed spec, then Cavium would like 
> > > work on
> > > implementing and up-streaming the common code(lib/librte_eventdev/) and
> > > an associated HW driver.(Requested minor changes of v2 will be addressed
> > > in next version).
> > 
> > Hi All,
> > 
> > I will propose a minor change to the rte_event struct, allowing some bits 
> > to be implementation specific. Currently the rte_event struct has no space 
> > to allow an implementation store any metadata about the event. For software 
> > performance it would be really helpful if there are some bits available for 
> > the implementation to keep some flags about each event.
> 
> OK.
> 
> > 
> > I suggest to rework the struct as below which opens 6 bits that were 
> > otherwise wasted, and define them as implementation specific. By 
> > implementation specific it is understood that the implementation can 
> > overwrite any information stored in those bits, and the application must 
> > not expect the data to remain after the event is scheduled.
> > 
> > OLD:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t queue_id:8;
> > uint8_t  sched_type; /* Note only 2 bits of 8 are required */
> > 
> > NEW:
> > struct rte_event {
> > uint32_t flow_id:24;
> > uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the 
> > enqueue types Ordered,Atomic,Parallel.*/
> > uint32_t implementation:6; /* available for implementation specific 
> > metadata */
> > uint8_t queue_id; /* still 8 bits as before */
> > 
> > 
> > Thoughts? -Harry
> 
> Looks good to me. I will add it in v3.
> 
Thanks. One other suggestion is that it might be useful to provide
support for having typed queues explicitly in the API. Right now, when
you create an queue, the queue_conf structure takes as parameters how
many atomic flows that are needed for the queue, or how many reorder
slots need to be reserved for it. This implicitly hints at the type of
traffic which will be sent to the queue, but I'm wondering if it's
better to make it explicit. There are certain optimisations that can be
looked at if we know that a queue only handles packets of a particular
type. [Not having to handle reordering when pulling events from a core
can be a big win for software!].

How about adding: "allowed_event_types" as a field to
rte_event_queue_conf, with possible values:
* atomic
* ordered
* parallel
* mixed - allowing all 3 types. I think allowing 2 of three types might
make things too complicated.

An open question would then be how to behave when the queue type and
requested event type conflict. We can either throw an error, or just
ignore the event type and always treat enqueued events as being of the
queue type. I prefer the latter, because it's faster not having to
error-check, and it pushes the responsibility on the app to know what
it's doing.
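
A rough sketch of how that could look in code; the enum, its values and the
field placement are illustrative assumptions only, and the RFC may end up
expressing the same idea as configuration flags instead.

#include <stdint.h>

enum rte_event_queue_type {             /* hypothetical names, sketch only */
	RTE_EVENT_QUEUE_TYPE_MIXED = 0, /* default: all three schedule types */
	RTE_EVENT_QUEUE_TYPE_ATOMIC,
	RTE_EVENT_QUEUE_TYPE_ORDERED,
	RTE_EVENT_QUEUE_TYPE_PARALLEL,
};

struct rte_event_queue_conf {
	uint32_t nb_atomic_flows;            /* existing RFC field */
	uint32_t nb_atomic_order_sequences;  /* existing RFC field */
	enum rte_event_queue_type allowed_event_types; /* proposed addition */
	/* ... remaining existing fields ... */
};

Per the preference above, an enqueue whose sched_type conflicts with the
queue's declared type would simply be scheduled as the queue type rather than
error-checked.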

/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Bruce Richardson
On Tue, Oct 25, 2016 at 11:19:05PM +0530, Jerin Jacob wrote:
> On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> > Thanks to Intel and NXP folks for the positive and constructive feedback
> > I've received so far. Here is the updated RFC(v2).
> > 
> > I've attempted to address as many comments as possible.
> > 
> > This series adds rte_eventdev.h to the DPDK tree with
> > adequate documentation in doxygen format.
> > 
> > Updates are also available online:
> > 
> > Related draft header file (this patch):
> > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> > 
> > PDF version(doxgen output):
> > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> > 
> > Repo:
> > https://github.com/jerinjacobk/libeventdev
> >
> 
> Hi Community,
> 
> So far, I have received constructive feedback from Intel, NXP and Linaro 
> folks.
> Let me know, if anyone else interested in contributing to the definition of 
> eventdev?
> 
> If there are no major issues in proposed spec, then Cavium would like work on
> implementing and up-streaming the common code(lib/librte_eventdev/) and
> an associated HW driver.(Requested minor changes of v2 will be addressed
> in next version).
> 
> We are planning to submit the work for 17.02 or 17.05 release(based on
> how implementation goes).
> 

Hi Jerin,

thanks for driving this. In terms of the common code framework, when
would you see that you might have something to upstream for that? As you
know, we've been working on a software implementation which we are now
looking to move to the eventdev APIs, and which also needs this common
code to support it. 

If it can accelerate this effort, we can perhaps provide as an RFC
the common code part that we have implemented for our work, or else we
are happy to migrate to use common code you provide if it can be
upstreamed fairly soon.

Regards,
/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Van Haaren, Harry
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> 
> So far, I have received constructive feedback from Intel, NXP and Linaro 
> folks.
> Let me know, if anyone else interested in contributing to the definition of 
> eventdev?
> 
> If there are no major issues in proposed spec, then Cavium would like work on
> implementing and up-streaming the common code(lib/librte_eventdev/) and
> an associated HW driver.(Requested minor changes of v2 will be addressed
> in next version).

Hi All,

I will propose a minor change to the rte_event struct, allowing some bits to be 
implementation specific. Currently the rte_event struct has no space to allow 
an implementation store any metadata about the event. For software performance 
it would be really helpful if there are some bits available for the 
implementation to keep some flags about each event.

I suggest to rework the struct as below which opens 6 bits that were otherwise 
wasted, and define them as implementation specific. By implementation specific 
it is understood that the implementation can overwrite any information stored 
in those bits, and the application must not expect the data to remain after the 
event is scheduled.

OLD:
struct rte_event {
uint32_t flow_id:24;
uint32_t queue_id:8;
uint8_t  sched_type; /* Note only 2 bits of 8 are required */

NEW:
struct rte_event {
uint32_t flow_id:24;
uint32_t sched_type:2; /* reduced size : but 2 bits is enough for the 
enqueue types Ordered,Atomic,Parallel.*/
uint32_t implementation:6; /* available for implementation specific 
metadata */
uint8_t queue_id; /* still 8 bits as before */


Thoughts? -Harry


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-26 Thread Jerin Jacob
On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> Thanks to Intel and NXP folks for the positive and constructive feedback
> I've received so far. Here is the updated RFC(v2).
> 
> I've attempted to address as many comments as possible.
> 
> This series adds rte_eventdev.h to the DPDK tree with
> adequate documentation in doxygen format.
> 
> Updates are also available online:
> 
> Related draft header file (this patch):
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> 
> PDF version(doxgen output):
> https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> 
> Repo:
> https://github.com/jerinjacobk/libeventdev
>

Hi Community,

So far, I have received constructive feedback from Intel, NXP and Linaro folks.
Let me know if anyone else is interested in contributing to the definition of
eventdev.

If there are no major issues in the proposed spec, then Cavium would like to
work on implementing and upstreaming the common code (lib/librte_eventdev/) and
an associated HW driver. (The minor changes requested for v2 will be addressed
in the next version.)

We are planning to submit the work for the 17.02 or 17.05 release (based on how
the implementation goes).

/Jerin
Cavium


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-18 Thread Jerin Jacob
On Mon, Oct 17, 2016 at 08:26:33PM +, Eads, Gage wrote:
> 
> 
> >  -Original Message-
> >  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> >  Sent: Sunday, October 16, 2016 11:18 PM
> >  To: Eads, Gage 
> >  Cc: dev at dpdk.org; thomas.monjalon at 6wind.com; Richardson, Bruce
> >  ; Vangati, Narender
> >  ; hemant.agrawal at nxp.com
> >  Subject: Re: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven
> >  programming model framework for DPDK
> >  
> >  On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote:
> >  > Thanks Jerin, this looks good. I've put a few notes/questions inline.
> >  
> >  Thanks Gage.
> >  
> >  >
> >  > >  +
> >  > >  +/**
> >  > >  + * Get the device identifier for the named event device.
> >  > >  + *
> >  > >  + * @param name
> >  > >  + *   Event device name to select the event device identifier.
> >  > >  + *
> >  > >  + * @return
> >  > >  + *   Returns event device identifier on success.
> >  > >  + *   - <0: Failure to find named event device.
> >  > >  + */
> >  > >  +extern uint8_t
> >  > >  +rte_event_dev_get_dev_id(const char *name);
> >  >
> >  > This return type should be int8_t, or some signed type, to support the 
> > failure
> >  case.
> >  
> >  Makes sense. I will change to int to make consistent with
> >  rte_cryptodev_get_dev_id()
> >  
> >  >
> >  > >  +};
> >  > >  +
> >  > >  +/**
> >  > >  + * Schedule one or more events in the event dev.
> >  > >  + *
> >  > >  + * An event dev implementation may define this is a NOOP, for
> >  > > instance if  + * the event dev performs its scheduling in hardware.
> >  > >  + *
> >  > >  + * @param dev_id
> >  > >  + *   The identifier of the device.
> >  > >  + */
> >  > >  +extern void
> >  > >  +rte_event_schedule(uint8_t dev_id);
> >  >
> >  > One idea: Have the function return the number of scheduled packets (or 0 
> > for
> >  implementations that do scheduling in hardware). This could be a helpful
> >  diagnostic for the software scheduler.
> >  
> >  How about returning an implementation specific value ?
> >  Rather than defining certain function associated with returned value.
> >  Just to  make sure it works with all HW/SW implementations. Something like
> >  below,
> >  
> >  /**
> >   * Schedule one or more events in the event dev.
> >   *
> >   * An event dev implementation may define this is a NOOP, for instance if
> >   * the event dev performs its scheduling in hardware.
> >   *
> >   * @param dev_id
> >   *   The identifier of the device.
> >   * @return
> >   *   Implementation specific value from the event driver for diagnostic 
> > purpose
> >   */
> >  extern int
> >  rte_event_schedule(uint8_t dev_id);
> >  
> >  
> 
> That's fine by me.

OK. I will change it in v3

> 
> I also had a comment on the return value of rte_event_dev_info_get() in my 
> previous email: "I'm wondering if this return type should be int, so we can 
> return an error if the dev_id is invalid."
> 
> What do you think?

The void return was based on cryptodev_info_get(). I think it makes sense to
return "int". I will change it in v3.


> 
> Thanks,
> Gage
> 
> >  
> >  


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-17 Thread Eads, Gage


>  -Original Message-
>  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  Sent: Sunday, October 16, 2016 11:18 PM
>  To: Eads, Gage 
>  Cc: dev at dpdk.org; thomas.monjalon at 6wind.com; Richardson, Bruce
>  ; Vangati, Narender
>  ; hemant.agrawal at nxp.com
>  Subject: Re: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven
>  programming model framework for DPDK
>  
>  On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote:
>  > Thanks Jerin, this looks good. I've put a few notes/questions inline.
>  
>  Thanks Gage.
>  
>  >
>  > >  +
>  > >  +/**
>  > >  + * Get the device identifier for the named event device.
>  > >  + *
>  > >  + * @param name
>  > >  + *   Event device name to select the event device identifier.
>  > >  + *
>  > >  + * @return
>  > >  + *   Returns event device identifier on success.
>  > >  + *   - <0: Failure to find named event device.
>  > >  + */
>  > >  +extern uint8_t
>  > >  +rte_event_dev_get_dev_id(const char *name);
>  >
>  > This return type should be int8_t, or some signed type, to support the 
> failure
>  case.
>  
>  Makes sense. I will change to int to make consistent with
>  rte_cryptodev_get_dev_id()
>  
>  >
>  > >  +};
>  > >  +
>  > >  +/**
>  > >  + * Schedule one or more events in the event dev.
>  > >  + *
>  > >  + * An event dev implementation may define this is a NOOP, for
>  > > instance if  + * the event dev performs its scheduling in hardware.
>  > >  + *
>  > >  + * @param dev_id
>  > >  + *   The identifier of the device.
>  > >  + */
>  > >  +extern void
>  > >  +rte_event_schedule(uint8_t dev_id);
>  >
>  > One idea: Have the function return the number of scheduled packets (or 0 
> for
>  implementations that do scheduling in hardware). This could be a helpful
>  diagnostic for the software scheduler.
>  
>  How about returning an implementation specific value ?
>  Rather than defining certain function associated with returned value.
>  Just to  make sure it works with all HW/SW implementations. Something like
>  below,
>  
>  /**
>   * Schedule one or more events in the event dev.
>   *
>   * An event dev implementation may define this is a NOOP, for instance if
>   * the event dev performs its scheduling in hardware.
>   *
>   * @param dev_id
>   *   The identifier of the device.
>   * @return
>   *   Implementation specific value from the event driver for diagnostic 
> purpose
>   */
>  extern int
>  rte_event_schedule(uint8_t dev_id);
>  
>  

That's fine by me.

I also had a comment on the return value of rte_event_dev_info_get() in my 
previous email: "I'm wondering if this return type should be int, so we can 
return an error if the dev_id is invalid."

What do you think?

Thanks,
Gage

>  
>  


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-17 Thread Jerin Jacob
On Fri, Oct 14, 2016 at 05:02:21PM +0100, Bruce Richardson wrote:
> On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> > Thanks to Intel and NXP folks for the positive and constructive feedback
> > I've received so far. Here is the updated RFC(v2).
> > 
> > I've attempted to address as many comments as possible.
> > 
> > This series adds rte_eventdev.h to the DPDK tree with
> > adequate documentation in doxygen format.
> > 
> > Updates are also available online:
> > 
> > Related draft header file (this patch):
> > https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> > 
> > PDF version(doxgen output):
> > https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> > 
> > Repo:
> > https://github.com/jerinjacobk/libeventdev
> > 
> 
> Thanks for all the work on this.

Thanks

> 
> 
> > +/* Event device configuration bitmap flags */
> > +#define RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT (1 << 0)
> > +/**< Override the global *dequeue_wait_ns* and use per dequeue wait in ns.
> > + *  \see rte_event_dequeue_wait_time(), rte_event_dequeue()
> > + */
> 
> Can you clarify why this is needed? If an app wants to use the same
> dequeue wait times for all dequeues can it not specify that itself via
> the wait time parameter, rather than having a global dequeue wait value?

The rationale for choosing this scheme is to allow an optimized
rte_event_dequeue() for some implementations without losing application
portability.

We mostly have two different types of HW schemes to define the wait time:

HW1) Only a global wait value for the eventdev, across all dequeues
HW2) A per-queue wait value

In terms of applications:

APP1) A trivial application does not need a different wait value for each
dequeue
APP2) Non-trivial applications do need different wait values per dequeue

This config option lets an implementation take advantage of the case where the
application only needs APP1 on HW1, without losing application portability
(i.e. if the application demands the APP2 scheme, then an HW1-based
implementation can use a different function pointer to implement the dequeue
function).

The overall theme of the proposal is to have more configuration options (like
RTE_EVENT_QUEUE_CFG_SINGLE_CONSUMER) to enable high-performance SW/HW
implementations.
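
To make that last point concrete, here is a rough sketch of how an HW1-style
driver might pick its dequeue fast path at configure time. All driver-internal
names here (my_pmd_*, the dev->dequeue hook, the config_flags parameter) are
made up for the sketch; only the RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT flag comes
from the RFC.

#include <stdint.h>

struct my_pmd_dev {                    /* hypothetical driver-private struct */
	int (*dequeue)(void *port, struct rte_event *ev, uint64_t wait);
};

static int
my_pmd_dequeue_global_wait(void *port, struct rte_event *ev, uint64_t wait)
{
	(void)wait; /* per-call wait ignored; the global dequeue_wait_ns applies */
	/* ... poll the device using the wait programmed at configure time ... */
	return 0;
}

static int
my_pmd_dequeue_per_call_wait(void *port, struct rte_event *ev, uint64_t wait)
{
	/* ... honour the per-call wait value; a little more work per dequeue ... */
	return 0;
}

static void
my_pmd_configure(struct my_pmd_dev *dev, uint32_t config_flags)
{
	if (config_flags & RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT)
		dev->dequeue = my_pmd_dequeue_per_call_wait;
	else
		dev->dequeue = my_pmd_dequeue_global_wait; /* cheaper fast path */
}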



[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-17 Thread Jerin Jacob
On Fri, Oct 14, 2016 at 03:00:57PM +, Eads, Gage wrote:
> Thanks Jerin, this looks good. I've put a few notes/questions inline.

Thanks Gage.

> 
> >  +
> >  +/**
> >  + * Get the device identifier for the named event device.
> >  + *
> >  + * @param name
> >  + *   Event device name to select the event device identifier.
> >  + *
> >  + * @return
> >  + *   Returns event device identifier on success.
> >  + *   - <0: Failure to find named event device.
> >  + */
> >  +extern uint8_t
> >  +rte_event_dev_get_dev_id(const char *name);
> 
> This return type should be int8_t, or some signed type, to support the 
> failure case.

Makes sense. I will change it to int to make it consistent with
rte_cryptodev_get_dev_id().
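
For illustration, with a signed return the lookup failure becomes directly
testable; the device name below is a placeholder, not a real driver string,
and rte_eventdev.h is assumed to be included.

static int
find_my_eventdev(void)
{
	int dev_id = rte_event_dev_get_dev_id("my_eventdev0"); /* placeholder */

	if (dev_id < 0) {
		/* lookup failed; with a uint8_t return this could not be
		 * distinguished from a valid high device id */
		return -1;
	}
	return dev_id;
}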

> 
> >  +};
> >  +
> >  +/**
> >  + * Schedule one or more events in the event dev.
> >  + *
> >  + * An event dev implementation may define this is a NOOP, for instance if
> >  + * the event dev performs its scheduling in hardware.
> >  + *
> >  + * @param dev_id
> >  + *   The identifier of the device.
> >  + */
> >  +extern void
> >  +rte_event_schedule(uint8_t dev_id);
> 
> One idea: Have the function return the number of scheduled packets (or 0 for 
> implementations that do scheduling in hardware). This could be a helpful 
> diagnostic for the software scheduler.

How about returning an implementation-specific value, rather than defining a
particular meaning for the returned value? Just to make sure it works with all
HW/SW implementations. Something like below:

/**
 * Schedule one or more events in the event dev.
 *
 * An event dev implementation may define this is a NOOP, for instance if
 * the event dev performs its scheduling in hardware.
 *
 * @param dev_id
 *   The identifier of the device.
 * @return
 *   Implementation specific value from the event driver for diagnostic purpose
 */
extern int
rte_event_schedule(uint8_t dev_id);
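
As a usage sketch (not part of the proposal), a dedicated scheduler lcore
could simply accumulate whatever diagnostic value the driver chooses to
return; for hardware schedulers the call may be a NOOP returning 0. The loop
structure and names below are illustrative only.

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

static volatile int keep_running = 1;

static void
scheduler_loop(uint8_t dev_id)
{
	uint64_t progress = 0;

	while (keep_running) {
		int ret = rte_event_schedule(dev_id);

		if (ret > 0)
			progress += ret; /* driver-defined unit of work */
	}
	printf("scheduler reported %" PRIu64 " units of progress\n", progress);
}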






[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Jerin Jacob
On Fri, Oct 14, 2016 at 10:30:33AM +, Hemant Agrawal wrote:

> > > Am I reading this correctly that there is no way to support an
> > > indefinite waiting capability? Or is this just saying that if a timed
> > > wait is performed there are min/max limits for the wait duration?
> > 
> > Application can wait indefinite if required. see
> > RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option.
> > 
> > Trivial application may not need different wait values on each dequeue.This 
> > is a
> > performance optimization opportunity for implementation.
> 
>  Jerin, it is irrespective of the wait configuration, whether you are using
> per-device wait or per-dequeue wait.
>  Can the value of MAX_U32 or MAX_U64 be treated as an infinite wait?

That would be yet another check in the implementation's fast path, I think, for
a more fine-grained wait scheme. Let the application configure the device with
RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT so that the implementation can have two
different function-pointer-based dequeue functions if required.

With the RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration, MAX_U64 implicitly
becomes an infinite wait, since the wait is a uint64_t.
I can add this info in v3 if required.
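
A small sketch of that convention from the application side. The exact
rte_event_dequeue() argument list and return value are assumptions here; only
the configuration flag and the MAX_U64-means-forever convention come from this
discussion.

#include <stdint.h>

static void
wait_forever_for_event(uint8_t dev_id, uint8_t port_id)
{
	struct rte_event ev;

	/* Device configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT;
	 * UINT64_MAX as the per-call wait then means "block until an event
	 * arrives". The dequeue signature is assumed for illustration. */
	if (rte_event_dequeue(dev_id, port_id, &ev, UINT64_MAX) > 0) {
		/* process ev */
	}
}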

Jerin

> 
> > 


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Bruce Richardson
On Wed, Oct 12, 2016 at 01:00:16AM +0530, Jerin Jacob wrote:
> Thanks to Intel and NXP folks for the positive and constructive feedback
> I've received so far. Here is the updated RFC(v2).
> 
> I've attempted to address as many comments as possible.
> 
> This series adds rte_eventdev.h to the DPDK tree with
> adequate documentation in doxygen format.
> 
> Updates are also available online:
> 
> Related draft header file (this patch):
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
> 
> PDF version(doxgen output):
> https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
> 
> Repo:
> https://github.com/jerinjacobk/libeventdev
> 

Thanks for all the work on this.


> +/* Event device configuration bitmap flags */
> +#define RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT (1 << 0)
> +/**< Override the global *dequeue_wait_ns* and use per dequeue wait in ns.
> + *  \see rte_event_dequeue_wait_time(), rte_event_dequeue()
> + */

Can you clarify why this is needed? If an app wants to use the same
dequeue wait times for all dequeues can it not specify that itself via
the wait time parameter, rather than having a global dequeue wait value?

/Bruce


[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Francois Ozog
Dear Jerin,

Very nice work!

This new RFC version opens the way to a unified conceptual model of
Software Defined Data Planes supported by diverse implementations such
as OpenDataPlane and DPDK.

I think this is an important signal to the industry.

François-Frédéric


From: dev <dev-boun...@dpdk.org> on behalf of Jerin Jacob

Sent: Tuesday, October 11, 2016 9:30 PM
To: dev at dpdk.org
Cc: thomas.monjalon at 6wind.com; bruce.richardson at intel.com;
narender.vangati at intel.com; hemant.agrawal at nxp.com;
gage.eads at intel.com; Jerin Jacob
Subject: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven
programming model framework for DPDK

Thanks to Intel and NXP folks for the positive and constructive feedback
I've received so far. Here is the updated RFC(v2).

I've attempted to address as many comments as possible.

This series adds rte_eventdev.h to the DPDK tree with
adequate documentation in doxygen format.

Updates are also available online:

Related draft header file (this patch):
https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h

PDF version(doxgen output):
https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf

Repo:
https://github.com/jerinjacobk/libeventdev

v1..v2

- Added Cavium, Intel, NXP copyrights in header file

- Changed the concept of flow queues to flow ids.
This is avoid dictating a specific structure to hold the flows.
A s/w implementation can do atomic load balancing on multiple
flow ids more efficiently than maintaining each event in a specific flow queue.

- Change the scheduling group to event queue.
A scheduling group is more a stream of events, so an event queue is a better
 abstraction.

- Introduced event port concept, Instead of trying eventdev access to the lcore,
a higher level of abstraction called event port is needed which is the
application i/f to the eventdev to dequeue and enqueue the events.
One or more event queues can be linked to single event port.
There can be more than one event port per lcore allowing multiple lightweight
threads to have their own i/f into eventdev, if the implementation supports it.
An event port will be bound to a lcore or a lightweight thread to keep
portable application workflow.
An event port abstraction also encapsulates dequeue depth and enqueue depth for
a scheduler implementations which can schedule multiple events at a time and
output events that can be buffered.

- Added configuration options with event queue(nb_atomic_flows,
nb_atomic_order_sequences, single consumer etc)
and event port(dequeue_queue_depth, enqueue_queue_depth etc) to define the
limits on the resource usage.(Useful for optimized software implementation)

- Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS
schemes of priority handling

- Added event port to event queue servicing priority.
This allows two event ports to connect to the same event queue with
different priorities.

- Changed the workflow as schedule/dequeue/enqueue.
An implementation is free to define schedule as NOOP.
A distributed s/w scheduler can use this to schedule events;
also a centralized s/w scheduler can make this a NOOP on non-scheduler cores.

- Removed Cavium HW specific schedule_from_group API

- Removed Cavium HW specific ctxt_update/ctxt_wait APIs.
 Introduced a more generic "event pinning" concept. i.e
If the normal workflow is a dequeue -> do work based on event type -> enqueue,
a pin_event argument to enqueue
where the pinned event is returned through the normal dequeue)
allows application workflow to remain the same whether or not an
implementation supports it.

- Added dequeue() burst variant

- Added the definition of a closed/open system - where open system is memory
backed and closed system eventdev has limited capacity.
In such systems, it is also useful to denote per event port how many packets
can be active in the system.
This can serve as a threshold for ethdev like devices so they don't overwhelm
core to core events.

- Added the option to specify maximum amount of time(in ns) application needs
wait on dequeue()

- Removed the scheme of expressing the number of flows in log2 format

Open item or the item needs improvement.

- Abstract the differences in event QoS management with different
priority schemes
available in different HW or SW implementations with portable
application workflow.

Based on the feedback, there three different kinds of QoS support available in
three different HW or SW implementations.
1) Priority associated with the event queue
2) Priority associated with each event enqueue
(Same flow can have two different priority on two separate enqueue)
3) Priority associated with the flow(each flow has unique priority)

In v2, The differences abstracted based on device capability
(RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
This scheme wo

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Eads, Gage
Thanks Jerin, this looks good. I've put a few notes/questions inline.

Thanks,
Gage

>  -Original Message-
>  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  Sent: Tuesday, October 11, 2016 2:30 PM
>  To: dev at dpdk.org
>  Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
>  ; Vangati, Narender
>  ; hemant.agrawal at nxp.com; Eads, Gage
>  ; Jerin Jacob 
>  Subject: [dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming
>  model framework for DPDK
>  
>  Thanks to Intel and NXP folks for the positive and constructive feedback
>  I've received so far. Here is the updated RFC(v2).
>  
>  I've attempted to address as many comments as possible.
>  
>  This series adds rte_eventdev.h to the DPDK tree with
>  adequate documentation in doxygen format.
>  
>  Updates are also available online:
>  
>  Related draft header file (this patch):
>  https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
>  
>  PDF version(doxgen output):
>  https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
>  
>  Repo:
>  https://github.com/jerinjacobk/libeventdev
>  
>  v1..v2
>  
>  - Added Cavium, Intel, NXP copyrights in header file
>  
>  - Changed the concept of flow queues to flow ids.
>  This is avoid dictating a specific structure to hold the flows.
>  A s/w implementation can do atomic load balancing on multiple
>  flow ids more efficiently than maintaining each event in a specific flow 
> queue.
>  
>  - Change the scheduling group to event queue.
>  A scheduling group is more a stream of events, so an event queue is a better
>   abstraction.
>  
>  - Introduced event port concept, Instead of trying eventdev access to the 
> lcore,
>  a higher level of abstraction called event port is needed which is the
>  application i/f to the eventdev to dequeue and enqueue the events.
>  One or more event queues can be linked to single event port.
>  There can be more than one event port per lcore allowing multiple lightweight
>  threads to have their own i/f into eventdev, if the implementation supports 
> it.
>  An event port will be bound to a lcore or a lightweight thread to keep
>  portable application workflow.
>  An event port abstraction also encapsulates dequeue depth and enqueue depth
>  for
>  a scheduler implementations which can schedule multiple events at a time and
>  output events that can be buffered.
>  
>  - Added configuration options with event queue(nb_atomic_flows,
>  nb_atomic_order_sequences, single consumer etc)
>  and event port(dequeue_queue_depth, enqueue_queue_depth etc) to define
>  the
>  limits on the resource usage.(Useful for optimized software implementation)
>  
>  - Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and
>  RTE_EVENT_DEV_CAP_EVENT_QOS
>  schemes of priority handling
>  
>  - Added event port to event queue servicing priority.
>  This allows two event ports to connect to the same event queue with
>  different priorities.
>  
>  - Changed the workflow as schedule/dequeue/enqueue.
>  An implementation is free to define schedule as NOOP.
>  A distributed s/w scheduler can use this to schedule events;
>  also a centralized s/w scheduler can make this a NOOP on non-scheduler cores.
>  
>  - Removed Cavium HW specific schedule_from_group API
>  
>  - Removed Cavium HW specific ctxt_update/ctxt_wait APIs.
>   Introduced a more generic "event pinning" concept. i.e
>  If the normal workflow is a dequeue -> do work based on event type ->
>  enqueue,
>  a pin_event argument to enqueue
>  where the pinned event is returned through the normal dequeue)
>  allows application workflow to remain the same whether or not an
>  implementation supports it.
>  
>  - Added dequeue() burst variant
>  
>  - Added the definition of a closed/open system - where open system is memory
>  backed and closed system eventdev has limited capacity.
>  In such systems, it is also useful to denote per event port how many packets
>  can be active in the system.
>  This can serve as a threshold for ethdev like devices so they don't overwhelm
>  core to core events.
>  
>  - Added the option to specify maximum amount of time(in ns) application needs
>  wait on dequeue()
>  
>  - Removed the scheme of expressing the number of flows in log2 format
>  
>  Open item or the item needs improvement.
>  
>  - Abstract the differences in event QoS management with different priority
>  schemes
>  available in different HW or SW implementations with portable application
>  workflow.
>  
>  Based on the feedback, there three different kinds of QoS support available 
> in

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Jerin Jacob
On Thu, Oct 13, 2016 at 11:14:38PM -0500, Bill Fischofer wrote:
> Hi Jerin,

Hi Bill,

Thanks for the review.

[snip]
> > + * If the device init operation is successful, the correspondence between
> > + * the device identifier assigned to the new device and its associated
> > + * *rte_event_dev* structure is effectively registered.
> > + * Otherwise, both the *rte_event_dev* structure and the device
> > identifier are
> > + * freed.
> > + *
> > + * The functions exported by the application Event API to setup a device
> > + * designated by its device identifier must be invoked in the following
> > order:
> > + * - rte_event_dev_configure()
> > + * - rte_event_queue_setup()
> > + * - rte_event_port_setup()
> > + * - rte_event_port_link()
> > + * - rte_event_dev_start()
> > + *
> > + * Then, the application can invoke, in any order, the functions
> > + * exported by the Event API to schedule events, dequeue events, enqueue
> > events,
> > + * change event queue(s) to event port [un]link establishment and so on.
> > + *
> > + * Application may use rte_event_[queue/port]_default_conf_get() to get
> > the
> > + * default configuration to set up an event queue or event port by
> > + * overriding few default values.
> > + *
> > + * If the application wants to change the configuration (i.e. call
> > + * rte_event_dev_configure(), rte_event_queue_setup(), or
> > + * rte_event_port_setup()), it must call rte_event_dev_stop() first to
> > stop the
> > + * device and then do the reconfiguration before calling
> > rte_event_dev_start()
> > + * again. The schedule, enqueue and dequeue functions should not be
> > invoked
> > + * when the device is stopped.
> >
> 
> Given this requirement, the question is what happens to events that are "in
> flight" at the time rte_event_dev_stop() is called? Is stop an asynchronous
> operation that quiesces the event _dev and allows in-flight events to drain
> from queues/ports prior to fully stopping, or is some sort of separate
> explicit quiesce mechanism required? If stop is synchronous and simply
> halts the event_dev, then how is an application to know if subsequent
> configure/setup calls would leave these pending events with no place to
> stand?
>

From an application API perspective, rte_event_dev_stop() is a synchronous
function.
If the stop has been called to re-configure the number of queues, ports, etc. of
the device, then "in flight" entry preservation will be implementation defined;
otherwise, "in flight" entries will be preserved.

[snip]

> > +extern int
> > +rte_event_dev_socket_id(uint8_t dev_id);
> > +
> > +/* Event device capability bitmap flags */
> > +#define RTE_EVENT_DEV_CAP_QUEUE_QOS(1 << 0)
> > +/**< Event scheduling prioritization is based on the priority associated
> > with
> > + *  each event queue.
> > + *
> > + *  \see rte_event_queue_setup(), RTE_EVENT_QUEUE_PRIORITY_NORMAL
> > + */
> > +#define RTE_EVENT_DEV_CAP_EVENT_QOS(1 << 1)
> > +/**< Event scheduling prioritization is based on the priority associated
> > with
> > + *  each event. Priority of each event is supplied in *rte_event*
> > structure
> > + *  on each enqueue operation.
> > + *
> > + *  \see rte_event_enqueue()
> > + */
> > +
> > +/**
> > + * Event device information
> > + */
> > +struct rte_event_dev_info {
> > +   const char *driver_name;/**< Event driver name */
> > +   struct rte_pci_device *pci_dev; /**< PCI information */
> > +   uint32_t min_dequeue_wait_ns;
> > +   /**< Minimum supported global dequeue wait delay(ns) by this
> > device */
> > +   uint32_t max_dequeue_wait_ns;
> > +   /**< Maximum supported global dequeue wait delay(ns) by this
> > device */
> > +   uint32_t dequeue_wait_ns;
> >
> 
> Am I reading this correctly that there is no way to support an indefinite
> waiting capability? Or is this just saying that if a timed wait is
> performed there are min/max limits for the wait duration?

Application can wait indefinite if required. see
RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option.

A trivial application may not need different wait values on each dequeue. This
is a performance optimization opportunity for the implementation.

> 
> 
> > +   /**< Configured global dequeue wait delay(ns) for this device */
> > +   uint8_t max_event_queues;
> > +   /**< Maximum event_queues supported by this device */
> > +   uint32_t max_event_queue_flows;
> > +   /**< Maximum supported flows in an event queue by this device*/
> > +   uint8_t max_event_queue_priority_levels;
> > +   /**< Maximum number of event queue priority levels by this device.
> > +* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability
> > +*/
> > +   uint8_t nb_event_queues;
> > +   /**< Configured number of event queues for this device */
> >
> 
> Is 256 a sufficient number of queues? While various SoCs may have limits,
> why impose such a small limit architecturally?

Each event 

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Hemant Agrawal
 Hi Bill/Jerin,

> 
> Thanks for the review.
> 
> [snip]
> > > + * If the device init operation is successful, the correspondence
> > > + between
> > > + * the device identifier assigned to the new device and its
> > > + associated
> > > + * *rte_event_dev* structure is effectively registered.
> > > + * Otherwise, both the *rte_event_dev* structure and the device
> > > identifier are
> > > + * freed.
> > > + *
> > > + * The functions exported by the application Event API to setup a
> > > + device
> > > + * designated by its device identifier must be invoked in the
> > > + following
> > > order:
> > > + * - rte_event_dev_configure()
> > > + * - rte_event_queue_setup()
> > > + * - rte_event_port_setup()
> > > + * - rte_event_port_link()
> > > + * - rte_event_dev_start()
> > > + *
> > > + * Then, the application can invoke, in any order, the functions
> > > + * exported by the Event API to schedule events, dequeue events,
> > > + enqueue
> > > events,
> > > + * change event queue(s) to event port [un]link establishment and so on.
> > > + *
> > > + * Application may use rte_event_[queue/port]_default_conf_get() to
> > > + get
> > > the
> > > + * default configuration to set up an event queue or event port by
> > > + * overriding few default values.
> > > + *
> > > + * If the application wants to change the configuration (i.e. call
> > > + * rte_event_dev_configure(), rte_event_queue_setup(), or
> > > + * rte_event_port_setup()), it must call rte_event_dev_stop() first
> > > + to
> > > stop the
> > > + * device and then do the reconfiguration before calling
> > > rte_event_dev_start()
> > > + * again. The schedule, enqueue and dequeue functions should not be
> > > invoked
> > > + * when the device is stopped.
> > >
> >
> > Given this requirement, the question is what happens to events that
> > are "in flight" at the time rte_event_dev_stop() is called? Is stop an
> > asynchronous operation that quiesces the event _dev and allows
> > in-flight events to drain from queues/ports prior to fully stopping,
> > or is some sort of separate explicit quiesce mechanism required? If
> > stop is synchronous and simply halts the event_dev, then how is an
> > application to know if subsequent configure/setup calls would leave
> > these pending events with no place to stand?
> >
> 
> From an application API perspective rte_event_dev_stop() is a synchronous
> function.
> If the stop has been called for re-configuring the number of queues, ports 
> etc of
> the device, then "in flight" entry preservation will be implementation 
> defined.
> else "in flight" entries will be preserved.
> 
> [snip]
> 
> > > +extern int
> > > +rte_event_dev_socket_id(uint8_t dev_id);
> > > +
> > > +/* Event device capability bitmap flags */
> > > +#define RTE_EVENT_DEV_CAP_QUEUE_QOS(1 << 0)
> > > +/**< Event scheduling prioritization is based on the priority
> > > +associated
> > > with
> > > + *  each event queue.
> > > + *
> > > + *  \see rte_event_queue_setup(), RTE_EVENT_QUEUE_PRIORITY_NORMAL
> > > +*/
> > > +#define RTE_EVENT_DEV_CAP_EVENT_QOS(1 << 1)
> > > +/**< Event scheduling prioritization is based on the priority
> > > +associated
> > > with
> > > + *  each event. Priority of each event is supplied in *rte_event*
> > > structure
> > > + *  on each enqueue operation.
> > > + *
> > > + *  \see rte_event_enqueue()
> > > + */
> > > +
> > > +/**
> > > + * Event device information
> > > + */
> > > +struct rte_event_dev_info {
> > > +   const char *driver_name;/**< Event driver name */
> > > +   struct rte_pci_device *pci_dev; /**< PCI information */
> > > +   uint32_t min_dequeue_wait_ns;
> > > +   /**< Minimum supported global dequeue wait delay(ns) by this
> > > device */
> > > +   uint32_t max_dequeue_wait_ns;
> > > +   /**< Maximum supported global dequeue wait delay(ns) by this
> > > device */
> > > +   uint32_t dequeue_wait_ns;
> > >
> >
> > Am I reading this correctly that there is no way to support an
> > indefinite waiting capability? Or is this just saying that if a timed
> > wait is performed there are min/max limits for the wait duration?
> 
> Application can wait indefinite if required. see
> RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT configuration option.
> 
> Trivial application may not need different wait values on each dequeue.This 
> is a
> performance optimization opportunity for implementation.

 Jerin, it is irrespective of the wait configuration, whether you are using
per-device wait or per-dequeue wait.
 Can the value of MAX_U32 or MAX_U64 be treated as an infinite wait?

> 
> >
> >
> > > +   /**< Configured global dequeue wait delay(ns) for this device */
> > > +   uint8_t max_event_queues;
> > > +   /**< Maximum event_queues supported by this device */
> > > +   uint32_t max_event_queue_flows;
> > > +   /**< Maximum supported flows in an event queue by this device*/
> > > +   uint8_t max_event_queue_priority_levels;
> > > +   

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-14 Thread Bill Fischofer
Hi Jerin,

This looks reasonable and seems a welcome addition to DPDK. A few questions
noted inline:

On Tue, Oct 11, 2016 at 2:30 PM, Jerin Jacob  wrote:

> Thanks to Intel and NXP folks for the positive and constructive feedback
> I've received so far. Here is the updated RFC(v2).
>
> I've attempted to address as many comments as possible.
>
> This series adds rte_eventdev.h to the DPDK tree with
> adequate documentation in doxygen format.
>
> Updates are also available online:
>
> Related draft header file (this patch):
> https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h
>
> PDF version(doxgen output):
> https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf
>
> Repo:
> https://github.com/jerinjacobk/libeventdev
>
> v1..v2
>
> - Added Cavium, Intel, NXP copyrights in header file
>
> - Changed the concept of flow queues to flow ids.
> This is avoid dictating a specific structure to hold the flows.
> A s/w implementation can do atomic load balancing on multiple
> flow ids more efficiently than maintaining each event in a specific flow
> queue.
>
> - Change the scheduling group to event queue.
> A scheduling group is more a stream of events, so an event queue is a
> better
>  abstraction.
>
> - Introduced event port concept, Instead of trying eventdev access to the
> lcore,
> a higher level of abstraction called event port is needed which is the
> application i/f to the eventdev to dequeue and enqueue the events.
> One or more event queues can be linked to single event port.
> There can be more than one event port per lcore allowing multiple
> lightweight
> threads to have their own i/f into eventdev, if the implementation
> supports it.
> An event port will be bound to a lcore or a lightweight thread to keep
> portable application workflow.
> An event port abstraction also encapsulates dequeue depth and enqueue
> depth for
> a scheduler implementations which can schedule multiple events at a time
> and
> output events that can be buffered.
>
> - Added configuration options with event queue(nb_atomic_flows,
> nb_atomic_order_sequences, single consumer etc)
> and event port(dequeue_queue_depth, enqueue_queue_depth etc) to define the
> limits on the resource usage.(Useful for optimized software implementation)
>
> - Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS
> schemes of priority handling
>
> - Added event port to event queue servicing priority.
> This allows two event ports to connect to the same event queue with
> different priorities.
>
> - Changed the workflow as schedule/dequeue/enqueue.
> An implementation is free to define schedule as NOOP.
> A distributed s/w scheduler can use this to schedule events;
> also a centralized s/w scheduler can make this a NOOP on non-scheduler
> cores.
>
> - Removed Cavium HW specific schedule_from_group API
>
> - Removed Cavium HW specific ctxt_update/ctxt_wait APIs.
>  Introduced a more generic "event pinning" concept. i.e
> If the normal workflow is a dequeue -> do work based on event type ->
> enqueue,
> a pin_event argument to enqueue
> where the pinned event is returned through the normal dequeue)
> allows application workflow to remain the same whether or not an
> implementation supports it.
>
> - Added dequeue() burst variant
>
> - Added the definition of a closed/open system - where open system is
> memory
> backed and closed system eventdev has limited capacity.
> In such systems, it is also useful to denote per event port how many
> packets
> can be active in the system.
> This can serve as a threshold for ethdev like devices so they don't
> overwhelm
> core to core events.
>
> - Added the option to specify maximum amount of time(in ns) application
> needs
> wait on dequeue()
>
> - Removed the scheme of expressing the number of flows in log2 format
>
> Open item or the item needs improvement.
> 
> - Abstract the differences in event QoS management with different priority
> schemes
> available in different HW or SW implementations with portable application
> workflow.
>
> Based on the feedback, there three different kinds of QoS support
> available in
> three different HW or SW implementations.
> 1) Priority associated with the event queue
> 2) Priority associated with each event enqueue
> (Same flow can have two different priority on two separate enqueue)
> 3) Priority associated with the flow(each flow has unique priority)
>
> In v2, The differences abstracted based on device capability
> (RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
> RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third scheme).
> This scheme would call for different application workflow for
> nontrivial QoS-enabled applications.
>
> Looking forward to getting comments from both application and driver
> implementation perspective.
>
> /Jerin
>
> ---
>  doc/api/doxy-api-index.md  |1 +
>  doc/api/doxy-api.conf  |1 +
>  lib/librte_eventdev/rte_eventdev.h | 1204 

[dpdk-dev] [RFC] [PATCH v2] libeventdev: event driven programming model framework for DPDK

2016-10-12 Thread Jerin Jacob
Thanks to Intel and NXP folks for the positive and constructive feedback
I've received so far. Here is the updated RFC(v2).

I've attempted to address as many comments as possible.

This series adds rte_eventdev.h to the DPDK tree with
adequate documentation in doxygen format.

Updates are also available online:

Related draft header file (this patch):
https://rawgit.com/jerinjacobk/libeventdev/master/rte_eventdev.h

PDF version (doxygen output):
https://rawgit.com/jerinjacobk/libeventdev/master/librte_eventdev_v2.pdf

Repo:
https://github.com/jerinjacobk/libeventdev

v1..v2

- Added Cavium, Intel, NXP copyrights in header file

- Changed the concept of flow queues to flow ids.
This is to avoid dictating a specific structure to hold the flows.
A s/w implementation can do atomic load balancing on multiple
flow ids more efficiently than maintaining each event in a specific flow queue.

- Changed the scheduling group to an event queue.
A scheduling group is more a stream of events, so an event queue is a better
abstraction.

- Introduced the event port concept. Instead of tying eventdev access to the
lcore, a higher level of abstraction called an event port is needed, which is
the application i/f to the eventdev to dequeue and enqueue events.
One or more event queues can be linked to a single event port.
There can be more than one event port per lcore, allowing multiple lightweight
threads to have their own i/f into the eventdev, if the implementation supports
it.
An event port will be bound to an lcore or a lightweight thread to keep the
application workflow portable.
An event port abstraction also encapsulates dequeue depth and enqueue depth for
scheduler implementations which can schedule multiple events at a time and
output events that can be buffered.

- Added configuration options to the event queue (nb_atomic_flows,
nb_atomic_order_sequences, single consumer etc.)
and event port (dequeue_queue_depth, enqueue_queue_depth etc.) to define the
limits on resource usage. (Useful for an optimized software implementation.)
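
To make the above configuration concrete, here is a rough pseudo-C sketch of
setting up one queue and one port and linking them. The struct fields are taken
from this changelog; the function names (rte_event_queue_setup(),
rte_event_port_setup(), rte_event_port_link()) and their exact signatures are
assumptions for illustration only -- the draft rte_eventdev.h is authoritative.

/* Hypothetical setup sketch -- names/signatures are assumptions and may
 * differ from the draft header. */
#include <rte_eventdev.h>

static int
setup_queue_and_port(uint8_t dev_id, uint8_t queue_id, uint8_t port_id)
{
        struct rte_event_queue_conf qconf = {
                .nb_atomic_flows = 1024,           /* bound on atomic flow state */
                .nb_atomic_order_sequences = 1024, /* bound on ordered sequences */
        };
        struct rte_event_port_conf pconf = {
                .dequeue_queue_depth = 16, /* max events returned per dequeue */
                .enqueue_queue_depth = 16, /* max events buffered for enqueue */
        };
        uint8_t queues[] = { queue_id };
        int ret;

        ret = rte_event_queue_setup(dev_id, queue_id, &qconf);
        if (ret < 0)
                return ret;

        ret = rte_event_port_setup(dev_id, port_id, &pconf);
        if (ret < 0)
                return ret;

        /* One or more event queues can be linked to a single event port. */
        return rte_event_port_link(dev_id, port_id, queues, 1);
}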

- Introduced RTE_EVENT_DEV_CAP_QUEUE_QOS and RTE_EVENT_DEV_CAP_EVENT_QOS
schemes of priority handling

- Added event port to event queue servicing priority.
This allows two event ports to connect to the same event queue with
different priorities.

- Changed the workflow to schedule/dequeue/enqueue.
An implementation is free to define schedule as a NOOP.
A distributed s/w scheduler can use this to schedule events;
a centralized s/w scheduler can likewise make this a NOOP on non-scheduler cores.
(A worker-loop sketch illustrating this workflow follows at the end of this changelog.)

- Removed Cavium HW specific schedule_from_group API

- Removed Cavium HW specific ctxt_update/ctxt_wait APIs.
Introduced a more generic "event pinning" concept instead, i.e.
if the normal workflow is dequeue -> do work based on event type -> enqueue,
a pin_event argument to enqueue
(where the pinned event is returned through the normal dequeue)
allows the application workflow to remain the same whether or not an
implementation supports it.

- Added dequeue() burst variant

- Added the definition of a closed/open system - where an open system is memory
backed and a closed-system eventdev has limited capacity.
In such systems, it is also useful to denote, per event port, how many packets
can be active in the system.
This can serve as a threshold for ethdev-like devices so they don't overwhelm
core-to-core events.

- Added the option to specify the maximum amount of time (in ns) the application
needs to wait on dequeue()

- Removed the scheme of expressing the number of flows in log2 format
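
To illustrate the resulting workflow (referenced from the schedule/dequeue/enqueue
item above), a worker loop could look roughly like the pseudo-C below. All names
and signatures -- rte_event_schedule(), rte_event_dequeue_burst() with an ns wait,
the pin_event argument to enqueue, and the application helpers -- are assumptions
for illustration, not the exact draft API.

/* Hypothetical worker loop -- a sketch only; see the draft rte_eventdev.h
 * for the real API. */
#include <stdbool.h>
#include <stdint.h>
#include <rte_eventdev.h>

extern volatile bool quit;                              /* application-provided */
void do_work_based_on_event_type(struct rte_event *ev); /* application-provided */

static void
worker_loop(uint8_t dev_id, uint8_t port_id)
{
        struct rte_event ev[16];
        const uint64_t max_wait_ns = 100 * 1000; /* upper bound on dequeue wait */

        while (!quit) {
                uint16_t i, nb;

                /* NOOP for HW schedulers and for non-scheduler cores of a
                 * centralized s/w scheduler; does real work only in a
                 * distributed s/w scheduler. */
                rte_event_schedule(dev_id);

                /* Burst dequeue, waiting at most max_wait_ns for events. */
                nb = rte_event_dequeue_burst(dev_id, port_id, ev, 16, max_wait_ns);

                for (i = 0; i < nb; i++) {
                        do_work_based_on_event_type(&ev[i]);

                        /* pin_event (assumed last argument) asks the
                         * implementation to return this event to the same port
                         * through a later normal dequeue; where unsupported it
                         * is ignored, so the workflow stays the same. */
                        rte_event_enqueue(dev_id, port_id, &ev[i], true);
                }
        }
}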

Open items, or items that need improvement:

- Abstract the differences in event QoS management across the different priority
schemes available in different HW or SW implementations, while keeping the
application workflow portable.

Based on the feedback, there are three different kinds of QoS support available
in three different HW or SW implementations:
1) Priority associated with the event queue
2) Priority associated with each event enqueue
(the same flow can have two different priorities on two separate enqueues)
3) Priority associated with the flow (each flow has a unique priority)

In v2, the differences are abstracted based on device capability
(RTE_EVENT_DEV_CAP_QUEUE_QOS for the first scheme,
RTE_EVENT_DEV_CAP_EVENT_QOS for the second and third schemes).
This scheme would call for different application workflows for
nontrivial QoS-enabled applications.
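
In practice this means the application first probes the device capabilities and
then branches, roughly as in the sketch below. rte_event_dev_info_get() and the
event_dev_cap field are assumptions based on the draft header, and the two
helpers are application placeholders.

/* Hypothetical capability probe illustrating the v2 QoS split. */
#include <rte_eventdev.h>

void configure_queue_priorities(uint8_t dev_id); /* application-provided */
void use_per_event_priority(uint8_t dev_id);     /* application-provided */

static void
select_qos_scheme(uint8_t dev_id)
{
        struct rte_event_dev_info info;

        rte_event_dev_info_get(dev_id, &info);

        if (info.event_dev_cap & RTE_EVENT_DEV_CAP_QUEUE_QOS) {
                /* Scheme 1: QoS expressed through per-event-queue priority. */
                configure_queue_priorities(dev_id);
        } else if (info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_QOS) {
                /* Schemes 2 and 3: priority carried in each event, set by the
                 * application at enqueue time (or derived from the flow). */
                use_per_event_priority(dev_id);
        }
        /* Otherwise no QoS capability is advertised; a single priority level
         * is assumed. */
}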

Looking forward to getting comments from both the application and driver
implementation perspectives.

/Jerin

---
 doc/api/doxy-api-index.md  |1 +
 doc/api/doxy-api.conf  |1 +
 lib/librte_eventdev/rte_eventdev.h | 1204 
 3 files changed, 1206 insertions(+)
 create mode 100644 lib/librte_eventdev/rte_eventdev.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 6675f96..28c1329 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -40,6 +40,7 @@ There are many libraries, so their headers may be grouped by 
topics: