[dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs

2016-11-22 Thread Eads, Gage


>  -Original Message-
>  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  Sent: Tuesday, November 22, 2016 2:00 PM
>  To: Eads, Gage 
>  Cc: dev at dpdk.org; Richardson, Bruce ; Van
>  Haaren, Harry ; hemant.agrawal at nxp.com
>  Subject: Re: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
>  
>  On Tue, Nov 22, 2016 at 07:43:03PM +, Eads, Gage wrote:
>  > >  > >  > > One open issue I noticed is the "typical workflow" description
>  > >  > >  > > starting in rte_eventdev.h:204 conflicts with the centralized
>  > >  > >  > > software PMD that Harry posted last week. Specifically, that PMD
>  > >  > >  > > expects a single core to call the schedule function. We could
>  > >  > >  > > extend the documentation to account for this alternative style
>  > >  > >  > > of scheduler invocation, or discuss ways to make the software
>  > >  > >  > > PMD work with the documented workflow. I prefer the former, but
>  > >  > >  > > either way I think we ought to expose the scheduler's expected
>  > >  > >  > > usage to the user -- perhaps through an RTE_EVENT_DEV_CAP flag?
>  > >  > >  >
>  > >  > >  > I prefer former too, you can propose the documentation change
>  > >  > >  > required for software PMD.
>  > >  >
>  > >  > Sure, proposal follows. The "typical workflow" isn't the most
>  > >  > optimal by having a conditional in the fast-path, of course, but it
>  > >  > demonstrates the idea simply.
>  > >  >
>  > >  > (line 204)
>  > >  >  * An event driven based application has following typical workflow
>  > >  >  * on fastpath:
>  > >  >  * \code{.c}
>  > >  >  *  while (1) {
>  > >  >  *
>  > >  >  *  if (dev_info.event_dev_cap &
>  > >  >  *  RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
>  > >  >  *  rte_event_schedule(dev_id);
>  > >
>  > >  Yes, I like the idea of RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED.
>  > >  It can be input to application/subsystem to launch separate
>  > >  core(s) for schedule functions.
>  > >  But, I think, the "dev_info.event_dev_cap &
>  > >  RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED" check can be moved inside the
>  > >  implementation (to make better decisions and avoid consuming cycles
>  > >  on HW based schedulers).
>  >
>  > How would this check work? Wouldn't it prevent any core from running the
>  > software scheduler in the centralized case?
>  
>  I guess you may not need RTE_EVENT_DEV_CAP here, instead need flag for
>  device configure here
>  
>  #define RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED (1ULL << 1)
>  
>  struct rte_event_dev_config config;
>  config.event_dev_cfg = RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED;
>  rte_event_dev_configure(.., &config);
>  
>  on the driver side on configure,
>  if (config.event_dev_cfg & RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED)
>   eventdev->schedule = NULL;
>  else // centralized case
>   eventdev->schedule = your_centralized_schedule_function;
>  
>  Does that work?

Hm, I fear the API would give users the impression that they can select the 
scheduling behavior of a given eventdev, when a software scheduler is more 
likely to be either distributed or centralized -- not both.

What if we use the capability flag, and define rte_event_schedule() as the 
scheduling function for centralized schedulers and rte_event_dequeue() as the 
scheduling function for distributed schedulers? That way, the datapath could be 
the simple dequeue -> process -> enqueue. Applications would check the 
capability flag at configuration time to decide whether or not to launch an 
lcore that calls rte_event_schedule().
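
A minimal sketch of that configuration-time check, assuming the capability flag proposed in this thread. The MOCK_ names, the struct, and the bit position are stand-ins invented for illustration so the snippet compiles on its own; they are not the eventdev API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed capability flag; the bit
 * position is an assumption for illustration only. */
#define MOCK_EVENT_DEV_CAP_DISTRIBUTED_SCHED (1ULL << 2)

/* Hypothetical, cut-down device info holding only the capability bits. */
struct mock_event_dev_info {
	uint64_t event_dev_cap;
};

/*
 * Decide once, at configuration time, whether the application has to
 * launch an lcore that calls the schedule function.
 *
 * Returns 1 for a centralized scheduler (flag unset) and 0 for a
 * distributed one (flag set, scheduling happens as part of dequeue).
 */
static int
needs_scheduler_lcore(const struct mock_event_dev_info *info)
{
	return !(info->event_dev_cap & MOCK_EVENT_DEV_CAP_DISTRIBUTED_SCHED);
}
```

With the decision taken up front, the worker loop stays dequeue -> process -> enqueue and carries no conditional in the fast path.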

>  
>  >
>  > >
>  > >  >  *
>  > >  >  *  rte_event_dequeue(...);
>  > >  >  *
>  > >  >  *  (event processing)
>  > >  >  *
>  > >  >  *  rte_event_enqueue(...);
>  > >  >  *  }
>  > >  >  * \endcode
>  > >  >  *
>  > >  >  * The *schedule* operation is intended to do event scheduling, and the
>  > >  >  * *dequeue* operation returns the scheduled events. An implementation
>  > >  >  * is free to define the semantics between *schedule* and *dequeue*. For
>  > >  >  * example, a system based on a hardware scheduler can define its
>  > >  >  * rte_event_schedule() to be an NOOP, whereas a software scheduler can
>  > >  >  * use the *schedule* operation to schedule events. The
>  > >  >  * RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability flag indicates whether
>  > >  >  * rte_event_schedule() should be called by all cores or by a single
>  > >  >  * (typically dedicated) core.
>  > >  >
>  > >  > (line 308)
>  > >  > #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED (1ULL << 2)
>  > >  > /**< Event scheduling implementation is distributed and all cores must
>  > >  >  *  execute rte_event_schedule(). If unset, the implementation is
>  > >  >  *  centralized and a single core must execute the schedule operation.
>  > >  >  *
>  > >  >  *  \see rte_event_schedule()
>  > >  >  */
>  > >  >
>  > >  > >  >
>  > >  > >  > On same note, If software PMD based workflow 

[dpdk-dev] dpdk/vpp and cross-version migration for vhost

2016-11-22 Thread Yuanhan Liu
On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > > 
> > > 
> > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > > > >As usual, sorry for late response :/
> > > >
> > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > > >>Hi!
> > > >>So it looks like we face a problem with cross-version
> > > >>migration when using vhost. It's not new but became more
> > > >>acute with the advent of vhost user.
> > > >>
> > > >>For users to be able to migrate between different versions
> > > >>of the hypervisor the interface exposed to guests
> > > >>by hypervisor must stay unchanged.
> > > >>
> > > >>The problem is that a qemu device is connected
> > > >>to a backend in another process, so the interface
> > > >>exposed to guests depends on the capabilities of that
> > > >>process.
> > > >>
> > > >>Specifically, for vhost user interface based on virtio, this includes
> > > >>the "host features" bitmap that defines the interface, as well as more
> > > >>host values such as the max ring size.  Adding new features/changing
> > > >>values to this interface is required to make progress, but on the other
> > > >>hand we need ability to get the old host features to be compatible.
> > > >
> > > > >It looks like the same issue as vhost-user reconnect to me. For example,
> > > >
> > > >- start dpdk 16.07 & qemu 2.5
> > > >- kill dpdk
> > > >- start dpdk 16.11
> > > >
> > > > >Though DPDK 16.11 has more features compared to dpdk 16.07 (say, indirect),
> > > > >the above should work, because qemu saves the negotiated features before
> > > > >the disconnect and restores them after the reconnection.
> > > >
> > > >commit a463215b087c41d7ca94e51aa347cde523831873
> > > > >Author: Marc-André Lureau
> > > > >Date:   Mon Jun 6 18:45:05 2016 +0200
> > > > >
> > > > >vhost-net: save & restore vhost-user acked features
> > > > >
> > > > >The initial vhost-user connection sets the features to be negotiated
> > > > >with the driver. Renegotiation isn't possible without device reset.
> > > > >
> > > > >To handle reconnection of vhost-user backend, ensure the same set of
> > > > >features are provided, and reuse already acked features.
> > > > >
> > > > >Signed-off-by: Marc-André Lureau
> > > >
> > > >
> > > >So we could do similar to vhost-user? I mean, save the acked features
> > > >before migration and store it back after it. This should be able to
> > > >keep the compatibility. If user downgrades DPDK version, it also could
> > > >be easily detected, and then exit with an error to user: migration
> > > > >failed due to incompatible vhost features.
> > > >
> > > >Just some rough thoughts. Makes tiny sense?
> > > 
> > > My understanding is that the management tool has to know whether
> > > versions are compatible before initiating the migration:
> > 
> > Makes sense. How about getting and restoring the acked features through
> > qemu command lines then, say, through the monitor interface?
> > 
> > With that, it would be something like:
> > 
> > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> > 
> > - read the acked features (through monitor interface)
> > 
> > - start vhost-user backend in the dst host
> > 
> > - start qemu in the dst host with the just queried acked features
> > 
> >   QEMU then is expected to use this feature set for the later vhost-user
> >   feature negotiation. Exit if features compatibility is broken.
> > 
> > Thoughts?
> > 
> > --yliu
> 
> 
> You keep assuming that you have the VM started first and
> figure out things afterwards, but this does not work.
> 
> Think about a cluster of machines. You want to start a VM in
> a way that will ensure compatibility with all hosts
> in a cluster.

I see. I was mostly considering the case where the dst
host (including the qemu and dpdk combo) is given, and
we then determine whether the migration will be
successful or not.

And you are saying that we need to know which host could
be a good candidate before starting the migration. In that
case, we indeed need some input from both qemu and the
vhost-user backend.

For DPDK, I think it could be simple: just as you said, it
could be either a tiny script, or even a macro defined in
the source code file (we extend it every time we add a
new feature) for libvirt to read. Or something
else.
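
As a rough illustration of how such an exported feature macro could be consumed, here is a minimal sketch of a bitmap check. The function name and the example bit values are made up for this note, and it covers only the feature bitmap, not other interface values such as the max ring size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'acked' is the feature bitmap negotiated with the
 * guest on the source host, 'dst_supported' is the bitmap the destination
 * vhost-user backend advertises (for example through an exported macro).
 *
 * Returns 1 when every acked bit is also available on the destination,
 * 0 when at least one acked feature would be lost by migrating.
 */
static int
features_compatible(uint64_t acked, uint64_t dst_supported)
{
	return (acked & ~dst_supported) == 0;
}
```

A management tool could run this check against each candidate host before starting a migration, instead of finding the mismatch after the VM is already running.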

> If you don't, guest visible interface will change
> and you won't be able to migrate.
> 
> It does not make sense to discuss feature bits specifically
> since that is not the only part of interface.
> For example, max ring size supported might change.

I don't quite understand why we have to consider the max ring
size here. Isn't it a virtio device attribute, for which QEMU could
provide such compatibility information?

I mean, DPDK is supposed to support vary vring 

[dpdk-dev] Proposal for a new Committer model

2016-11-22 Thread Ferruh Yigit
On 11/22/2016 7:52 PM, Neil Horman wrote:
> On Mon, Nov 21, 2016 at 09:52:41AM +0100, Thomas Monjalon wrote:
>> 2016-11-18 13:09, Neil Horman:
>>> A) Further promote subtree maintainership.  This was a conversation that I
>>> proposed some time ago, but my proposed granularity was discarded in favor
>>> of something that hasn't worked as well (in my opinion).  That is to say a
>>> few driver pmds (i40e and fm10k come to mind) have their own tree that
>>> send pull requests to Thomas.
>>
>> Yes we tried this fine granularity and stated that it was not working well.
>> We are now using the bigger granularity that you describe below.
>>
> Ok, that's good, but that must be _very_ new.  Looking at your git tree, I see no
> merge commits.  How are you pulling from those subtrees?

The next-net tree has been active for the last three releases.

I guess the following is the first commit to the sub-tree:
http://dpdk.org/ml/archives/dev/2016-February/032580.html

The sub-trees are regularly rebased on top of the main tree, which is why
there are no merge commits.

> 
> 
>>> We should be sharding that at a much higher
>>> granularity and using it much more consistently.  That is to say, that we
>>> should have a maintainer for all the ethernet pmds, and another for the
>>> crypto pmds, another for the core eal layer, another for misc libraries
>>> that have low patch volumes, etc.
>>
>> Yes we could open a tree for EAL and another one for the core libraries.
>>
> That could be worthwhile.  Let's see how the net and crypto subtrees work out
> (assuming again that these trees are newly founded)
> 
> 
>>> Each of those subdivisions should have
>>> their own list to communicate on, and each should have a tree that
>>> integrates patches for their own subsystem, and they should on a regular
>>> cycle send pull requests to Thomas.
>>
>> Yes I think it is now a good idea to split the mailing list traffic,
>> at least for netdev and cryptodev.
>>
> Agreed, that serves two purposes, it lowers the volume for people with a
> specific interest (i.e. it's a rudimentary filter), and it avoids confusion
> between you and the subtree maintainer (that is to say, you don't have to even
> consider pulling patches that go to the crypto and net lists, you just have to
> trust that they pull those patches in and send you appropriate pull requests).

I still find a single mailing list more useful.
Also, with the current process, after the -rc2 release, patches are merged
directly into the main tree instead of the sub-trees...

> 
>>> Thomas in turn should by and large,
>>> only be integrating pull requests.  This should address our high-
>>> throughput issue, in that it will allow multiple maintainers to share the
>>> workload, and integration should be relatively easy.
>>
>> Yes in an ideal organization, the last committer does only a last check
>> that technical plan and fairness are respected.
>> So it gives more time to coordinate the plans :)
>>
> Correct.  That's never 100% accurate of course, some things will still have to
> come to you directly, simply by virtue of the fact that they don't completely
> fit anywhere else, but that's ok, the goal is really just to get your total patch
> volume lower, and replace it with pull requests that you can either trivially
> merge or figure out with the help of the subtree maintainer.
> 
>>> B) Designate alternates to serve as backups for the maintainer when they
>>> are unavailable.  This provides high-availability, and sounds very much
>>> like your proposal, but in the interests of clarity, there is still a
>>> single maintainer at any one time, it just may change to ensure the
>>> continued merging of patches, if the primary maintainer isn't available.
>>> Ideally however, those backup alternates aren't needed, because most of the
>>> primary maintainers work in merging pull requests, which are done based on
>>> the trust of the submaintainer, and done during a very limited window of
>>> time.  This also partially addresses multi-vendor fairness if your subtree
>>> maintainers come from multiple participating companies.
>>
>> About the merge window, I do not have a strong opinion about how it can be
>> improved. However, I know that closing the window too early makes developer
>> unhappy because it makes wait - between development start and its release -
>> longer.
> 
> This is a fair point, but I'm not talking about closing it early here, all
> I'm suggesting is that, if you do proper pull requests from subtrees, your tree
> Thomas will only need a reasonably small window of time to accept new features,
> because you'll just merge the subtrees, rather than integrating individual
> patches.  E.g. you won't be constantly merging patches over the course of a
> development cycle, your tree's HEAD will mostly consist of merge commits as
> subtree maintainers send you pull requests, and ideally they will send those
> near the start of a window.  How long you keep your merge window open after that
> is up to you.
> 
> Neil
> 
> 
>  
>>



[dpdk-dev] Adding API to force freeing consumed buffers in TX ring

2016-11-22 Thread Wiles, Keith

> On Nov 21, 2016, at 9:25 AM, Richardson, Bruce  intel.com> wrote:
> 
> On Mon, Nov 21, 2016 at 04:06:32PM +0100, Olivier Matz wrote:
>> Hi,
>> 
>> On 11/21/2016 03:33 PM, Wiles, Keith wrote:
>>> 
>>>> On Nov 21, 2016, at 4:48 AM, Damjan Marion (damarion) wrote:
>>>> 
>>>> 
>>>> Hi,
>>>> 
>>>> Currently in VPP we do memcpy of whole packet when we need to do
>>>> replication as we cannot know if specific buffer is transmitted
>>>> from tx ring before we update it again (i.e. l2 header rewrite).
>>>> 
>>>> Unless there is already a way to address this issue in DPDK which I'm not
>>>> aware of, my proposal is that we provide a mechanism for polling the TX ring
>>>> for consumed buffers. This can be either a completely new API or an
>>>> extension of rte_eth_tx_burst (i.e. special case when nb_pkts=0).
>>>> 
>>>> This will allow us to start polling the tx ring when we expect some
>>>> mbuf back, instead of waiting for the next tx burst (which we don't know
>>>> when it will happen) and hoping that we will reach free_threshold soon.
>>> 
>>> +1
>>> 
>>> In Pktgen I have the problem of not being able to reclaim all of the TX 
>>> mbufs to update them for the next set of packets to send. I know this is 
>>> not a common case, but I do see the case where the application needs its 
>>> mbufs freed off the TX ring. Currently you need to have at least a TX ring 
>>> size of mbufs on hand to make sure you can send to a TX ring. If you 
>>> allocate too few you run into a deadlock case as the number of mbufs  on a 
>>> TX ring does not hit the flush mark. If you are sending to multiple TX 
>>> rings on the same numa node from a single TX pool you have to 
>>> understand the total number of mbufs you need to have allocated to hit the 
>>> TX flush on each ring. Not a clean way to handle the problems as you may 
>>> have limited memory or require some logic to add more mbufs for dynamic 
>>> ports.
>>> 
>>> Anyway it would be great to have a way to clean up the TX done ring, 
>>> using nb_pkts == 0 is the simplest way, but a new API is fine too.
 
>>>> Any thoughts?
>> 
>> Yes, it looks useful to have a such API.
>> 
>> I would prefer another function instead of diverting the meaning of
>> nb_pkts. Maybe this?
>> 
>>  void rte_eth_tx_free_bufs(uint8_t port_id, uint16_t queue_id);
>> 
> 
> Third parameter for a limit(hint) of the number of bufs to free? If the
> TX ring is big, we might not want to stall other work for a long time
> while we free a huge number of buffers.

In order to move this along some, what if we create the following API:

int rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt);

It returns the number of freed mbufs, or -1 if not supported or on invalid params.
A free_cnt of zero means free all possible mbufs; otherwise free at most the
number suggested.
The free_cnt could be a uint16_t, but I do not think it matters much.

The rte_eth_tx_done_cleanup() call will return -1 if the PMD does not support
it or if port_id or queue_id is invalid.

The default in the eth_dev structure of function pointers would be NULL(not 
supported) to not require all of the drivers to be updated today. We can then 
add the support as we go along.

We could have a features request API for tx_done support and PCTYPE, plus 
others if we want to go down that path too.
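
To make the NULL-default dispatch concrete, here is a minimal sketch using mock types. Everything prefixed mock_ is invented for this note (the real eth_dev layout and the final function name were still under discussion), and the toy driver models a TX queue as nothing more than a count of completed mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-driver callback; NULL means "not supported". */
typedef int (*mock_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);

/* Hypothetical, cut-down device structure for illustration only. */
struct mock_eth_dev {
	mock_tx_done_cleanup_t tx_done_cleanup;
	void **tx_queues;
	uint16_t nb_tx_queues;
};

#define MOCK_MAX_PORTS 4
static struct mock_eth_dev mock_devices[MOCK_MAX_PORTS];

/* Generic wrapper: validate the ids, then dispatch to the driver.
 * Returns the number of freed mbufs, or -1 if the ids are invalid or
 * the driver did not provide a callback. */
static int
mock_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt)
{
	struct mock_eth_dev *dev;

	if (port_id >= MOCK_MAX_PORTS)
		return -1;
	dev = &mock_devices[port_id];
	if (queue_id >= dev->nb_tx_queues)
		return -1;
	if (dev->tx_done_cleanup == NULL)
		return -1;
	return dev->tx_done_cleanup(dev->tx_queues[queue_id], free_cnt);
}

/* Toy driver state: number of transmitted mbufs waiting to be freed. */
struct mock_txq {
	uint32_t nb_done;
};

/* Toy driver callback: free_cnt == 0 frees everything that is done,
 * otherwise at most free_cnt entries are freed. */
static int
mock_pmd_cleanup(void *txq, uint32_t free_cnt)
{
	struct mock_txq *q = txq;
	uint32_t n = q->nb_done;

	if (free_cnt != 0 && free_cnt < n)
		n = free_cnt;
	q->nb_done -= n;
	return (int)n;
}
```

Drivers that leave the pointer at NULL keep working unchanged, and callers see -1 until the PMD gains support.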

> 
>   /Bruce

Regards,
Keith



[dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs

2016-11-22 Thread Eads, Gage


>  -Original Message-
>  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  Sent: Tuesday, November 22, 2016 12:19 PM
>  To: Eads, Gage 
>  Cc: dev at dpdk.org; Richardson, Bruce ; Van
>  Haaren, Harry ; hemant.agrawal at nxp.com
>  Subject: Re: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
>  
>  On Tue, Nov 22, 2016 at 03:15:52PM +, Eads, Gage wrote:
>  >
>  >
>  > >  -Original Message-
>  > >  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  > >  Sent: Monday, November 21, 2016 1:32 PM
>  > >  To: Eads, Gage 
>  > >  Cc: dev at dpdk.org; Richardson, Bruce ; Van
>  > >  Haaren, Harry ; hemant.agrawal at nxp.com
>  > >  Subject: Re: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
>  > >
>  > >  On Tue, Nov 22, 2016 at 12:43:58AM +0530, Jerin Jacob wrote:
>  > >  > On Mon, Nov 21, 2016 at 05:45:51PM +, Eads, Gage wrote:
>  > >  > > Hi Jerin,
>  > >  > >
>  > >  > > I did a quick review and overall this implementation looks good. I noticed
>  > >  > > just one issue in rte_event_queue_setup(): the check of
>  > >  > > nb_atomic_order_sequences is being applied to atomic-type queues, but that
>  > >  > > field applies to ordered-type queues.
>  > >  >
>  > >  > Thanks Gage. I will fix that in v2.
>  > >  >
>  > >  > >
>  > >  > > One open issue I noticed is the "typical workflow" description starting in
>  > >  > > rte_eventdev.h:204 conflicts with the centralized software PMD that Harry
>  > >  > > posted last week. Specifically, that PMD expects a single core to call the
>  > >  > > schedule function. We could extend the documentation to account for this
>  > >  > > alternative style of scheduler invocation, or discuss ways to make the software
>  > >  > > PMD work with the documented workflow. I prefer the former, but either way I
>  > >  > > think we ought to expose the scheduler's expected usage to the user -- perhaps
>  > >  > > through an RTE_EVENT_DEV_CAP flag?
>  > >  >
>  > >  > I prefer former too, you can propose the documentation change required for
>  > >  > software PMD.
>  >
>  > Sure, proposal follows. The "typical workflow" isn't the most optimal by
>  > having a conditional in the fast-path, of course, but it demonstrates the idea
>  > simply.
>  >
>  > (line 204)
>  >  * An event driven based application has following typical workflow on
>  >  * fastpath:
>  >  * \code{.c}
>  >  *  while (1) {
>  >  *
>  >  *  if (dev_info.event_dev_cap &
>  >  *  RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
>  >  *  rte_event_schedule(dev_id);
>  
>  Yes, I like the idea of RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED.
>  It can be input to application/subsystem to
>  launch separate core(s) for schedule functions.
>  But, I think, the "dev_info.event_dev_cap &
>  RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED"
>  check can be moved inside the implementation (to make better decisions
>  and avoid consuming cycles on HW based schedulers).

How would this check work? Wouldn't it prevent any core from running the 
software scheduler in the centralized case?

>  
>  >  *
>  >  *  rte_event_dequeue(...);
>  >  *
>  >  *  (event processing)
>  >  *
>  >  *  rte_event_enqueue(...);
>  >  *  }
>  >  * \endcode
>  >  *
>  >  * The *schedule* operation is intended to do event scheduling, and the
>  >  * *dequeue* operation returns the scheduled events. An implementation
>  >  * is free to define the semantics between *schedule* and *dequeue*. For
>  >  * example, a system based on a hardware scheduler can define its
>  >  * rte_event_schedule() to be an NOOP, whereas a software scheduler can use
>  >  * the *schedule* operation to schedule events. The
>  >  * RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability flag indicates whether
>  >  * rte_event_schedule() should be called by all cores or by a single (typically
>  >  * dedicated) core.
>  >
>  > (line 308)
>  > #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED (1ULL << 2)
>  > /**< Event scheduling implementation is distributed and all cores must execute
>  >  *  rte_event_schedule(). If unset, the implementation is centralized and
>  >  *  a single core must execute the schedule operation.
>  >  *
>  >  *  \see rte_event_schedule()
>  >  */
>  >
>  > >  >
>  > >  > On same note, If software PMD based workflow need  a separate core(s) for
>  > >  > schedule function then, Can we hide that from API specification and pass an
>  > >  > argument to SW pmd to define the scheduling core(s)?
>  > >  >
>  > >  > Something like --vdev=eventsw0,schedule_cmask=0x2
>  >
>  > An API for controlling the scheduler coremask instead of (or perhaps in
>  > addition to) the vdev argument would be good, to allow runtime control. I can
>  > imagine apps that scale the number of cores based on load, and in doing so
>  > may want to migrate the scheduler to a different core.
>  
>  Yes, an API for number of scheduler core looks OK. 

[dpdk-dev] [PATCH v2 3/8] drivers: Use ETH_DEV_PCI_DEV() helper

2016-11-22 Thread Shreyansh Jain
On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:
> The drivers should not directly access the rte_eth_dev->pci_dev but use
> a macro instead. This is a preparation for replacing the pci_dev with
> a struct rte_device member in the future.
>
> Signed-off-by: Jan Blunck 
> ---
>  drivers/net/bnxt/bnxt_ethdev.c   | 19 ++-
>  drivers/net/bnxt/bnxt_ring.c | 11 +++---
>  drivers/net/cxgbe/cxgbe_ethdev.c |  2 +-
>  drivers/net/e1000/em_ethdev.c| 20 ++-
>  drivers/net/e1000/igb_ethdev.c   | 50 +++
>  drivers/net/e1000/igb_pf.c   |  3 +-
>  drivers/net/ena/ena_ethdev.c |  2 +-
>  drivers/net/enic/enic_ethdev.c   |  2 +-
>  drivers/net/fm10k/fm10k_ethdev.c | 49 ++-

I found a couple of places in the fm10k_ethdev file where pci_dev usage 
can be replaced with ETH_DEV_PCI_DEV() macro. For example,
  - fm10k_dev_tx_init() +681,
  - fm10k_set_tx_function +2774

Can you please check once again?

>  drivers/net/i40e/i40e_ethdev.c   | 44 
>  drivers/net/i40e/i40e_ethdev.h   |  4 +++
>  drivers/net/i40e/i40e_ethdev_vf.c| 38 ++---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 65 +---
>  drivers/net/ixgbe/ixgbe_pf.c |  2 +-
>  drivers/net/qede/qede_ethdev.c   | 17 +-
>  drivers/net/vmxnet3/vmxnet3_ethdev.c |  4 +--
>  16 files changed, 185 insertions(+), 147 deletions(-)

Some changes in szedata2 are also on similar lines:
  - rte_szedata2_eth_dev_init +1419, +1420, ... 
  - rte_szedata2_eth_dev_uninit

Some changes in nicvf_ethdev.c also are missing, I think:
  - nicvf_eth_dev_init +1980
  - nicvf_dev_info_get +1350

and nfp/nfp_net.c
  - nfp_net_init(), +2333, +2403
  - nfp_net_close, +718, +737
  - nfp_net_dev_link_status_print
  - nfp_net_irq_unmask, +1161

and bnx2x_ethdev.c
  - bnx2x_common_dev_init
  - and access to intr_handle which can be replaced with
ETH_DEV_TO_INTR_HANDLE

Is there any specific reason these changes are not part of your patch?

[...]
> diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
> index 298cef4..9d4bea7 100644
> --- a/drivers/net/i40e/i40e_ethdev.h
> +++ b/drivers/net/i40e/i40e_ethdev.h
> @@ -671,6 +671,10 @@ i40e_get_vsi_from_adapter(struct i40e_adapter *adapter)
>  #define I40E_VF_TO_HW(vf) \
>   (&(((struct i40e_vf *)vf)->adapter->hw))
>
> +/* ETH_DEV_TO_INTR_HANDLE */
> +#define ETH_DEV_TO_INTR_HANDLE(ptr) \
> + (&(ETH_DEV_PCI_DEV(ptr)->intr_handle))
> +

Can this be in rte_ethdev.h just like ETH_DEV_PCI_DEV?
Or, as this is specific to rte_pci_device, probably in rte_pci.h?
Many drivers could use this macro for accessing intr_handle.
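
To show what a shared helper would buy, here is a minimal sketch with mock types. The mock_ structs and the ETH_DEV_PCI_DEV definition are assumptions for illustration; only the ETH_DEV_TO_INTR_HANDLE body is taken from the hunk quoted above.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-ins for the real structures. */
struct mock_intr_handle {
	int fd;
};

struct mock_pci_device {
	struct mock_intr_handle intr_handle;
};

struct mock_eth_dev {
	struct mock_pci_device *pci_dev;
};

/* Assumed shape of the helper introduced earlier in the series. */
#define ETH_DEV_PCI_DEV(ptr) ((ptr)->pci_dev)

/* Same body as in the quoted i40e hunk. */
#define ETH_DEV_TO_INTR_HANDLE(ptr) \
	(&(ETH_DEV_PCI_DEV(ptr)->intr_handle))

/* Driver code then never touches dev->pci_dev directly. */
static int
mock_intr_fd(struct mock_eth_dev *dev)
{
	return ETH_DEV_TO_INTR_HANDLE(dev)->fd;
}
```

If pci_dev is later replaced by a struct rte_device member, only the two macros would need to change, not every driver.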

-
Shreyansh


[dpdk-dev] [PATCH v2 1/8] eal: define container_of macro

2016-11-22 Thread Shreyansh Jain
On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:
> This macro is based on Jan Viktorin's original patch but also checks the
> type of the passed pointer against the type of the member.
>
> Signed-off-by: Jan Viktorin 
> Signed-off-by: Shreyansh Jain 
> [jblunck at infradead.org: add type checking and __extension__]
> Signed-off-by: Jan Blunck 
> ---
>  lib/librte_eal/common/include/rte_common.h | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
> index db5ac91..8dda3e2 100644
> --- a/lib/librte_eal/common/include/rte_common.h
> +++ b/lib/librte_eal/common/include/rte_common.h
> @@ -331,6 +331,26 @@ rte_bsf32(uint32_t v)
>  #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
>  #endif
>
> +/**
> + * Return pointer to the wrapping struct instance.
> + *
> + * Example:
> + *
> + *  struct wrapper {
> + *  ...
> + *  struct child c;
> + *  ...
> + *  };
> + *
> + *  struct child *x = obtain(...);
> + *  struct wrapper *w = container_of(x, struct wrapper, c);
> + */
> +#ifndef container_of
> +#define container_of(ptr, type, member)  __extension__ ({ \
> + typeof(((type *)0)->member) *_ptr = (ptr);  \
> + (type *)(((char *)_ptr) - offsetof(type, member)); })
> +#endif
> +
>  #define _RTE_STR(x) #x
>  /** Take a macro value and get a string version of it */
>  #define RTE_STR(x) _RTE_STR(x)
>

I will start using this in my patchset.

Acked-by: Shreyansh Jain 
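
For reference, a small self-contained usage sketch of the macro. It is spelled with __typeof__ instead of typeof so that it also builds in strict ISO modes with GCC or Clang; that spelling, the struct layouts, and the helper name are choices made for this note, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the macro in the patch. The typed temporary makes the
 * compiler complain if 'ptr' does not point to the member's type. */
#ifndef container_of
#define container_of(ptr, type, member) __extension__ ({		\
	__typeof__(((type *)0)->member) *_ptr = (ptr);			\
	(type *)(((char *)_ptr) - offsetof(type, member)); })
#endif

/* Hypothetical structs mirroring the example in the doc comment. */
struct child {
	int value;
};

struct wrapper {
	int id;
	struct child c;
};

/* Recover the enclosing wrapper from a pointer to its embedded child. */
static struct wrapper *
wrapper_from_child(struct child *x)
{
	return container_of(x, struct wrapper, c);
}
```

Passing a pointer of the wrong type, for example an int * where a struct child * is expected, triggers an incompatible-pointer diagnostic at the _ptr initialization rather than silently computing a bogus address.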


[dpdk-dev] [PATCH 0/2] l2fwd/l3fwd: rework long options parsing

2016-11-22 Thread Olivier Matz
Hi,

On 11/22/2016 02:52 PM, Olivier Matz wrote:
> These 2 patches were part of this RFC, which will not be integrated:
> http://dpdk.org/ml/archives/dev/2016-September/046974.html
> 
> It does not bring any functional change, it just reworks the way long
> options are parsed in l2fwd and l3fwd to avoid unneeded strcmp() calls
> and to ease the addition of a new long option in the future.
> 
> I send them in case maintainers think it is better this way, but I have
> no real need.
> 
> Olivier Matz (2):
>   l3fwd: rework long options parsing
>   l2fwd: rework long options parsing
> 
>  examples/l2fwd/main.c |  30 +++--
>  examples/l3fwd/main.c | 169 ++
>  2 files changed, 111 insertions(+), 88 deletions(-)
> 

Sorry, I missed some checkpatch issues. I'll fix them in v2.
I'll wait a bit for other comments, if any.


Olivier
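
The rework described in the cover letter can be sketched as follows. This is a reduced, standalone illustration: the option names, the app_opts struct, and parse_opts() are invented for this note and are not the l2fwd/l3fwd code.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical parsed-options container, for illustration only. */
struct app_opts {
	unsigned long portmask;
	int numa_on;
	const char *config;
};

enum {
	/* Long-only option ids start at 256 so they can never collide
	 * with a short option character. */
	OPT_MIN_NUM = 256,
	OPT_CONFIG_NUM,
	OPT_NO_NUMA_NUM,
};

static const char short_options[] = "p:"; /* portmask */

static const struct option long_options[] = {
	{"config",  required_argument, NULL, OPT_CONFIG_NUM},
	{"no-numa", no_argument,       NULL, OPT_NO_NUMA_NUM},
	{NULL, 0, NULL, 0}
};

/* Returns 0 on success, -1 on an unknown option or missing argument. */
static int
parse_opts(int argc, char **argv, struct app_opts *opts)
{
	int opt;

	opts->portmask = 0;
	opts->numa_on = 1;
	opts->config = NULL;
	optind = 1; /* restart scanning in case getopt was used before */

	while ((opt = getopt_long(argc, argv, short_options,
				  long_options, NULL)) != -1) {
		switch (opt) {
		case 'p':
			opts->portmask = strtoul(optarg, NULL, 16);
			break;
		/* long options share the switch with the short ones */
		case OPT_CONFIG_NUM:
			opts->config = optarg;
			break;
		case OPT_NO_NUMA_NUM:
			opts->numa_on = 0;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because getopt_long() returns the id stored in the fourth field of the matching entry, adding a long option means one enum value, one table row, and one case label, with no string comparison at parse time.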


[dpdk-dev] [PATCH 1/2] l3fwd: rework long options parsing

2016-11-22 Thread Olivier Matz
Avoid the use of several strncmp() calls, since getopt is able to
map a long option to an id, which can be matched in the
same switch/case as the short options.

Signed-off-by: Olivier Matz 
---
 examples/l3fwd/main.c | 169 ++
 1 file changed, 87 insertions(+), 82 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 7223e77..f84ef50 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -474,6 +474,13 @@ parse_eth_dest(const char *optarg)
 #define MAX_JUMBO_PKT_LEN  9600
 #define MEMPOOL_CACHE_SIZE 256

+static const char short_options[] =
+   "p:"  /* portmask */
+   "P"   /* promiscuous */
+   "L"   /* enable long prefix match */
+   "E"   /* enable exact match */
+   ;
+
 #define CMD_LINE_OPT_CONFIG "config"
 #define CMD_LINE_OPT_ETH_DEST "eth-dest"
 #define CMD_LINE_OPT_NO_NUMA "no-numa"
@@ -481,6 +488,31 @@ parse_eth_dest(const char *optarg)
 #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
 #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
 #define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
+enum {
+   /* long options mapped to a short option */
+
+   /* first long only option value must be >= 256, so that we won't
+* conflict with short options */
+   CMD_LINE_OPT_MIN_NUM = 256,
+   CMD_LINE_OPT_CONFIG_NUM,
+   CMD_LINE_OPT_ETH_DEST_NUM,
+   CMD_LINE_OPT_NO_NUMA_NUM,
+   CMD_LINE_OPT_IPV6_NUM,
+   CMD_LINE_OPT_ENABLE_JUMBO_NUM,
+   CMD_LINE_OPT_HASH_ENTRY_NUM_NUM,
+   CMD_LINE_OPT_PARSE_PTYPE_NUM,
+};
+
+static const struct option lgopts[] = {
+   {CMD_LINE_OPT_CONFIG, 1, 0, CMD_LINE_OPT_CONFIG_NUM},
+   {CMD_LINE_OPT_ETH_DEST, 1, 0, CMD_LINE_OPT_ETH_DEST_NUM},
+   {CMD_LINE_OPT_NO_NUMA, 0, 0, CMD_LINE_OPT_NO_NUMA_NUM},
+   {CMD_LINE_OPT_IPV6, 0, 0, CMD_LINE_OPT_IPV6_NUM},
+   {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, CMD_LINE_OPT_ENABLE_JUMBO_NUM},
+   {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, CMD_LINE_OPT_HASH_ENTRY_NUM_NUM},
+   {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, CMD_LINE_OPT_PARSE_PTYPE_NUM},
+   {NULL, 0, 0, 0}
+};

 /*
  * This expression is used to calculate the number of mbufs needed
@@ -504,16 +536,6 @@ parse_args(int argc, char **argv)
char **argvopt;
int option_index;
char *prgname = argv[0];
-   static struct option lgopts[] = {
-   {CMD_LINE_OPT_CONFIG, 1, 0, 0},
-   {CMD_LINE_OPT_ETH_DEST, 1, 0, 0},
-   {CMD_LINE_OPT_NO_NUMA, 0, 0, 0},
-   {CMD_LINE_OPT_IPV6, 0, 0, 0},
-   {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
-   {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
-   {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
-   {NULL, 0, 0, 0}
-   };

argvopt = argv;

@@ -534,7 +556,7 @@ parse_args(int argc, char **argv)
"L3FWD: LPM and EM are mutually exclusive, select only one";
const char *str13 = "L3FWD: LPM or EM none selected, default LPM on";

-   while ((opt = getopt_long(argc, argvopt, "p:PLE",
+   while ((opt = getopt_long(argc, argvopt, short_options,
 lgopts, &option_index)) != EOF) {

switch (opt) {
@@ -547,6 +569,7 @@ parse_args(int argc, char **argv)
return -1;
}
break;
+
case 'P':
printf("%s\n", str2);
promiscuous_on = 1;
@@ -563,89 +586,71 @@ parse_args(int argc, char **argv)
break;

/* long options */
-   case 0:
-   if (!strncmp(lgopts[option_index].name,
-   CMD_LINE_OPT_CONFIG,
-   sizeof(CMD_LINE_OPT_CONFIG))) {
-
-   ret = parse_config(optarg);
-   if (ret) {
-   printf("%s\n", str5);
-   print_usage(prgname);
-   return -1;
-   }
-   }
-
-   if (!strncmp(lgopts[option_index].name,
-   CMD_LINE_OPT_ETH_DEST,
-   sizeof(CMD_LINE_OPT_ETH_DEST))) {
-   parse_eth_dest(optarg);
-   }
-
-   if (!strncmp(lgopts[option_index].name,
-   CMD_LINE_OPT_NO_NUMA,
-   sizeof(CMD_LINE_OPT_NO_NUMA))) {
-   printf("%s\n", str6);
-   numa_on = 0;
+   case CMD_LINE_OPT_CONFIG_NUM:
+   ret = parse_config(optarg);
+   if (ret) {
+   printf("%s\n", str5);
+   

[dpdk-dev] Proposal for a new Committer model

2016-11-22 Thread Neil Horman
On Mon, Nov 21, 2016 at 09:52:41AM +0100, Thomas Monjalon wrote:
> 2016-11-18 13:09, Neil Horman:
> > A) Further promote subtree maintainership.  This was a conversation that I
> > proposed some time ago, but my proposed granularity was discarded in favor
> > of something that hasn't worked as well (in my opinion).  That is to say a
> > few driver pmds (i40e and fm10k come to mind) have their own tree that
> > send pull requests to Thomas.
> 
> Yes we tried this fine granularity and stated that it was not working well.
> We are now using the bigger granularity that you describe below.
> 
OK, that's good, but that must be _very_ new.  Looking at your git tree, I see no
merge commits.  How are you pulling from those subtrees?


> > We should be sharding that at a much higher
> > granularity and using it much more consistently.  That is to say, that we
> > should have a maintainer for all the ethernet pmds, and another for the
> > crypto pmds, another for the core eal layer, another for misc libraries
> > that have low patch volumes, etc.
> 
> Yes we could open a tree for EAL and another one for the core libraries.
> 
That could be worthwhile.  Let's see how the net and crypto subtrees work out
(assuming again that these trees are newly founded).


> > Each of those subdivisions should have
> > their own list to communicate on, and each should have a tree that
> > integrates patches for their own subsystem, and they should on a regular
> > cycle send pull requests to Thomas.
> 
> Yes I think it is now a good idea to split the mailing list traffic,
> at least for netdev and cryptodev.
> 
Agreed, that serves two purposes: it lowers the volume for people with a
specific interest (i.e. it's a rudimentary filter), and it avoids confusion
between you and the subtree maintainer (that is to say, you don't have to even
consider pulling patches that go to the crypto and net lists, you just have to
trust that they pull those patches in and send you appropriate pull requests).

> > Thomas in turn should by and large,
> > only be integrating pull requests.  This should address our high-
> > throughput issue, in that it will allow multiple maintainers to share the
> > workload, and integration should be relatively easy.
> 
> Yes in an ideal organization, the last committer does only a last check
> that technical plan and fairness are respected.
> So it gives more time to coordinate the plans :)
> 
Correct.  That's never 100% accurate of course; some things will still have to
come to you directly, simply by virtue of the fact that they don't completely
fit anywhere else, but that's OK. The goal is really just to get your total patch
volume lower, and replace it with pull requests that you can either trivially
merge or figure out with the help of the subtree maintainer.

> > B) Designate alternates to serve as backups for the maintainer when they
> > are unavailable.  This provides high availability, and sounds very much
> > like your proposal, but in the interests of clarity, there is still a
> > single maintainer at any one time, it just may change to ensure the
> > continued merging of patches, if the primary maintainer isn't available.
> > Ideally however, those backup alternates aren't needed, because most of the
> > primary maintainer's work is in merging pull requests, which are done based on
> > the trust of the submaintainer, and done during a very limited window of
> > time.  This also partially addresses multi-vendor fairness if your subtree
> > maintainers come from multiple participating companies.
> 
> About the merge window, I do not have a strong opinion about how it can be
> improved. However, I know that closing the window too early makes developers
> unhappy because it makes the wait - between development start and its release -
> longer.

This is a fair point, but I'm not talking about closing it early here. All
I'm suggesting is that, if you do proper pull requests from subtrees, your tree,
Thomas, will only need a reasonably small window of time to accept new features,
because you'll just merge the subtrees, rather than integrating individual
patches.  E.g. you won't be constantly merging patches over the course of a
development cycle, your tree's HEAD will mostly consist of merge commits as
subtree maintainers send you pull requests, and ideally they will send those
near the start of a window.  How long you keep your merge window open after that
is up to you.

Neil



> 


[dpdk-dev] [PATCH 0/2] l2fwd/l3fwd: rework long options parsing

2016-11-22 Thread Olivier Matz
These 2 patches were part of this RFC, which will not be integrated:
http://dpdk.org/ml/archives/dev/2016-September/046974.html

It does not bring any functional change; it just reworks the way long
options are parsed in l2fwd and l3fwd to avoid unneeded strcmp() calls
and to ease the addition of a new long option in the future.

I send them in case maintainers think it is better this way, but I have
no real need.
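
A minimal sketch of the parsing style described above, for readers who have
not seen it: each long option gets its own return value from getopt_long(),
so the switch dispatches on that value directly instead of comparing option
names with strcmp(). The option names, the app_opts structure and the
parse_opts() helper are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option names, for illustration only. */
#define OPT_CONFIG  "config"
#define OPT_NO_NUMA "no-numa"

enum {
	/* Start above the char range so long options never clash
	 * with short option characters. */
	OPT_MIN_NUM = 256,
	OPT_CONFIG_NUM,
	OPT_NO_NUMA_NUM,
};

static const struct option long_opts[] = {
	{OPT_CONFIG, required_argument, NULL, OPT_CONFIG_NUM},
	{OPT_NO_NUMA, no_argument, NULL, OPT_NO_NUMA_NUM},
	{NULL, 0, NULL, 0}
};

/* Hypothetical container for the parsed values. */
struct app_opts {
	const char *config;
	int numa_on;
	int promiscuous_on;
};

/* Returns 0 on success, -1 on an unknown option or a missing argument. */
static int
parse_opts(int argc, char **argv, struct app_opts *out)
{
	int opt;

	assert(argv != NULL && out != NULL);
	out->config = NULL;
	out->numa_on = 1;
	out->promiscuous_on = 0;
	optind = 1; /* restart scanning so the helper can be called again */

	/* With a NULL flag pointer, getopt_long() returns the 'val' field. */
	while ((opt = getopt_long(argc, argv, "P", long_opts, NULL)) != -1) {
		switch (opt) {
		case 'P':
			out->promiscuous_on = 1;
			break;
		case OPT_CONFIG_NUM:
			out->config = optarg;
			break;
		case OPT_NO_NUMA_NUM:
			out->numa_on = 0;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Adding a new long option in this style means one new enum value, one new
table entry and one new case label.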

Olivier Matz (2):
  l3fwd: rework long options parsing
  l2fwd: rework long options parsing

 examples/l2fwd/main.c |  30 +++--
 examples/l3fwd/main.c | 169 ++
 2 files changed, 111 insertions(+), 88 deletions(-)

-- 
2.8.1



[dpdk-dev] [PATCH 0/4] libeventdev API and northbound implementation

2016-11-22 Thread Shreyansh Jain
On Tuesday 22 November 2016 07:30 AM, Yuanhan Liu wrote:
> On Sat, Nov 19, 2016 at 12:57:15AM +0530, Jerin Jacob wrote:
>> On Fri, Nov 18, 2016 at 04:04:29PM +, Bruce Richardson wrote:
>>> +Thomas
>>>
>>> On Fri, Nov 18, 2016 at 03:25:18PM +, Bruce Richardson wrote:
 On Fri, Nov 18, 2016 at 11:14:58AM +0530, Jerin Jacob wrote:
> As previously discussed in RFC v1 [1], RFC v2 [2], with changes
> described in [3] (also pasted below), here is the first non-draft series
> for this new API.
>
> [1] http://dpdk.org/ml/archives/dev/2016-August/045181.html
> [2] http://dpdk.org/ml/archives/dev/2016-October/048592.html
> [3] http://dpdk.org/ml/archives/dev/2016-October/048196.html
>
> Changes since RFC v2:
>
> - Updated the documentation to define the need for this library[Jerin]
> - Added RTE_EVENT_QUEUE_CFG_*_ONLY configuration parameters in
>   struct rte_event_queue_conf to enable optimized sw implementation 
> [Bruce]
> - Introduced RTE_EVENT_OP* ops [Bruce]
> - Added nb_event_queue_flows,nb_event_port_dequeue_depth, 
> nb_event_port_enqueue_depth
>   in rte_event_dev_configure() like ethdev and crypto library[Jerin]
> - Removed rte_event_release() and replaced with RTE_EVENT_OP_RELEASE ops 
> to
>   reduce fast path APIs and it is redundant too[Jerin]
> - In the view of better application portability, Removed pin_event
>   from rte_event_enqueue as it is just hint and Intel/NXP can not support 
> it[Jerin]
> - Added rte_event_port_links_get()[Jerin]
> - Added rte_event_dev_dump[Harry]
>
> Notes:
>
> - This patch set is check-patch clean with an exception that
> 02/04 has one WARNING:MACRO_WITH_FLOW_CONTROL
> - Looking forward to getting additional maintainers for libeventdev
>
>
> Possible next steps:
> 1) Review this patch set
> 2) Integrate Intel's SW driver[http://dpdk.org/dev/patchwork/patch/17049/]
> 3) Review proposed examples/eventdev_pipeline 
> application[http://dpdk.org/dev/patchwork/patch/17053/]
> 4) Review proposed functional 
> tests[http://dpdk.org/dev/patchwork/patch/17051/]
> 5) Cavium's HW based eventdev driver
>
> I am planning to work on (3),(4) and (5)
>
 Thanks Jerin,

 we'll review and get back to you with any comments or feedback (1), and
 obviously start working on item (2) also! :-)

 I'm also wondering whether we should have a staging tree for this work to
 make interaction between us easier. Although this may not be
 finalised enough for 17.02 release, do you think having a
 dpdk-eventdev-next tree would be a help? My thinking is that once we get
 the eventdev library itself in reasonable shape following our review, we
 could commit that and make any changes thereafter as new patches, rather
 than constantly respinning the same set. It also gives us a clean git
 tree to base the respective driver implementations on from our two sides.

 Thomas, any thoughts here on your end - or from anyone else?
>>
>> I was thinking more or less along the same lines. To avoid re-spinning the
>> same set, it is better to have the libeventdev library marked as EXPERIMENTAL
>> and commit it somewhere on dpdk-eventdev-next or main tree
>>
>> I think, EXPERIMENTAL status can be changed only when
>> - At least two event drivers available
>> - Functional test applications fine with at least two drivers
>> - Portable example application to showcase the features of the library
>> - eventdev integration with another dpdk subsystem such as ethdev
>
> I'm wondering maybe we could have a staging tree, for all features like
> this one (and one branch for each feature)?
>
>   --yliu
>

+1

It would help a lot of 'experimental' stuff reach a wider audience 
without waiting for a complete cycle of upstreaming.
Though, I am not sure how we would limit the branches - or if that is 
even required.

-- 
-
Shreyansh


[dpdk-dev] [PATCH] virtio: tx with can_push when VERSION_1 is set

2016-11-22 Thread Maxime Coquelin
Hi Pierre,

On 11/22/2016 10:54 AM, Pierre Pfister (ppfister) wrote:
> Hello Maxime,
>
>> On 9 Nov 2016 at 15:51, Maxime Coquelin wrote:
>>
>> Hi Pierre,
>>
>> On 11/09/2016 01:42 PM, Pierre Pfister (ppfister) wrote:
>>> Hello Maxime,
>>>
>>> Sorry for the late reply.
>>>
>>>
 On 8 Nov 2016 at 10:44, Maxime Coquelin wrote:

 Hi Pierre,

 On 11/08/2016 10:31 AM, Pierre Pfister (ppfister) wrote:
> Current virtio driver advertises VERSION_1 support,
> but does not handle device's VERSION_1 support when
> sending packets (it looks for ANY_LAYOUT feature,
> which is absent).
>
> This patch enables 'can_push' in tx path when VERSION_1
> is advertised by the device.
>
> This significantly improves small packets forwarding rate
> towards devices advertising VERSION_1 feature.
 I think it depends whether offloading is enabled or not.
 If no offloading enabled, I measured significant drop.
 Indeed, when no offloading is enabled, the Tx path in Virtio
 does not access the virtio header before your patch, as the header is 
 memset to zero at device init time.
 With your patch, it gets memset to zero at every transmit in the hot
 path.
>>>
>>> Right. On the virtio side that is true, but on the device side, we have to 
>>> access the header anyway.
>> No more now, if no offload features have been negotiated.
>> I have done a patch that landed in v16.11 to skip header parsing in
>> this case.
>> That said, we still have to access its descriptor.
>>
>>> And accessing two descriptors (with the address resolution and memory fetch 
>>> which comes with it)
>>> is a costly operation compared to a single one.
>>> In the case indirect descriptors are used, this is 1 desc access instead of 
>>> 3.
>> I agree this is far from being optimal.
>>
>>> And in the case chained descriptors are used, this doubles the number of 
>>> packets that you can put in your queue.
>>>
>>> Those are the results in my PHY -> VM (testpmd) -> PHY setup
>>> Traffic is flowing bidirectionally. Numbers are for lossless-rates.
>>>
>>> When chained buffers are used for dpdk's TX: 2x2.13Mpps
>>> When indirect descriptors are used for dpdk's TX: 2x2.38Mpps
>>> When shallow buffers are used for dpdk's TX (with this patch): 2x2.42Mpps
>> When I tried it, I also did PVP 0% benchmark, and I got opposite results. 
>> Chained and indirect cases were significantly better.
>>
>> My PVP setup was using a single NIC and single Virtio PMD, and NIC2VM
>> forwarding was IO mode done with testpmd on host, and Rx->Tx forwarding
>> was macswap mode on guest side.
>>
>> I also saw some perf regression when running a simple testpmd test on both
>> ends.
>>
>> Yuanhan, did you run some benchmark with your series enabling
>> ANY_LAYOUT?
>
> It was enabled. But the specs specify that VERSION_1 includes ANY_LAYOUT.
> Therefore, Qemu removes ANY_LAYOUT when VERSION_1 is set.
>
> We can keep arguing about which is fastest. I guess we have different setups 
> and different results, so we probably are deadlocked here.
> But in any case, the current code is inconsistent, as it uses single 
> descriptor when ANY_LAYOUT is set, but not when VERSION_1 is set.
>
> I believe it makes sense to use single-descriptor any time it is possible, 
> but you are free to think otherwise.
> Please make a call and make the code consistent (remove single descriptors 
> altogether, or use them when VERSION_1 is set too). Otherwise it just 
> creates yet another testing headache.
I also think it makes sense to have a single descriptor, but I had to
highlight that I noticed a significant performance degradation when
using single descriptor on my setup.

I'm fine with taking your patch in virtio-next, so that more testing is 
conducted.
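
For readers following the thread, the consistency point can be sketched as
a single feature check. This is a simplified illustration and not the
driver code: the helper names are hypothetical, and the only facts relied
on are the feature bit numbers from the virtio specification (ANY_LAYOUT
is bit 27, VERSION_1 is bit 32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_F_ANY_LAYOUT 27
#define VIRTIO_F_VERSION_1  32

/* Hypothetical helper: test one bit of the negotiated feature set. */
static bool
has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (UINT64_C(1) << bit)) != 0;
}

/*
 * Simplified: the virtio header may share a descriptor with the packet
 * data when either feature was negotiated. A real driver would also have
 * to check per-packet conditions (headroom, alignment, segment count)
 * before pushing the header; those are left out here.
 */
static bool
tx_can_push(uint64_t negotiated_features)
{
	return has_feature(negotiated_features, VIRTIO_F_ANY_LAYOUT) ||
	       has_feature(negotiated_features, VIRTIO_F_VERSION_1);
}
```

With a check of this shape, a device that offers only VERSION_1 takes the
same single-descriptor path as one that offers ANY_LAYOUT.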

Thanks,
Maxime

>
> Thanks,
>
> - Pierre
>
>>
>>>
>>> I must also note that qemu 2.5 does not seem to deal with VERSION_1 and 
>>> ANY_LAYOUT correctly.
>>> The patch I am proposing here works for qemu 2.7, but with qemu 2.5, 
>>> testpmd still behaves as if ANY_LAYOUT (or VERSION_1) was not available. 
>>> This is not catastrophic. But just note that you will not see the 
>>> performance improvement in some cases with qemu 2.5.
>>
>> Thanks for the info.
>>
>> Regards,
>> Maxime
>


[dpdk-dev] [RFC PATCH 0/7] RFC: EventDev Software PMD

2016-11-22 Thread Richardson, Bruce


> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Monday, November 21, 2016 8:19 PM
> To: Richardson, Bruce 
> Cc: Van Haaren, Harry ; dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH 0/7] RFC: EventDev Software PMD
> 
> On Mon, Nov 21, 2016 at 09:48:56AM +, Bruce Richardson wrote:
> > On Sat, Nov 19, 2016 at 03:53:25AM +0530, Jerin Jacob wrote:
> > > On Thu, Nov 17, 2016 at 10:05:07AM +, Bruce Richardson wrote:
> > > > > 2) device stats API can be based on capability, HW
> > > > > implementations may not support all the stats
> > > >
> > > > Yes, this is something we were thinking about. It would be nice if
> > > > we could at least come up with a common set of stats - maybe even
> > > > ones tracked at an eventdev API level, e.g. nb enqueues/dequeues.
> > > > As well as that, we think the idea of an xstats API, like in
> > > > ethdev, might work well. For our software implementation, having
> > > > visibility into the scheduler behaviour can be important, so we'd
> > > > like a way to report out things like internal queue depths etc.
> > > >
> > >
> > > > Since this hardware is not very generic, I am not sure how much
> > > > sense it makes to have a generic stats API. But something similar to
> > > > ethdev's xstats (any capability-based scheme) works well. Look forward
> > > > to seeing an API proposal with common code.
> > >
> > > Jerin
> > >
> > Well, to start off with, some stats that could be tracked at the API
> > level could be common. What about counts of number of enqueues and
> > dequeues?
> >
> > I suppose the other way we can look at this is: once we get a few
> > implementations of the interface, we can look at the provided xstats
> > values from each one, and see if there is anything common between them.
> 
> That makes more sense to me as we don't have proposed counts. In that case,
> I think we should not use stats for functional tests as proposed. We could
> verify the functional test by embedding some value in the event object on
> enqueue and later checking the same value on dequeue.
> 
> Jerin
> 

Yes, that can work. Many of the unit tests we are looking at are likely
specific to our software implementation, so we may end up doing a separate
sw-eventdev specific unit test set, as well as a general eventdev set.
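
A rough sketch of the tag-and-check idea from Jerin's comment, to make the
scheme concrete. Everything here is hypothetical: toy_event and toy_queue
are stand-ins and not the eventdev structures, and the plain FIFO assumes
events come back in enqueue order, which a real scheduler only guarantees
under the ordering rules of the chosen schedule type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_DEPTH 8

/* Hypothetical event object; not the real struct rte_event. */
struct toy_event {
	uint32_t flow_id;
	uint64_t tag; /* test value written at enqueue time */
};

/* Hypothetical single-threaded FIFO standing in for an event device. */
struct toy_queue {
	struct toy_event ev[TOY_QUEUE_DEPTH];
	size_t head;
	size_t count;
};

static int
toy_enqueue(struct toy_queue *q, const struct toy_event *ev)
{
	assert(q != NULL && ev != NULL);
	if (q->count == TOY_QUEUE_DEPTH)
		return -1;
	q->ev[(q->head + q->count) % TOY_QUEUE_DEPTH] = *ev;
	q->count++;
	return 0;
}

static int
toy_dequeue(struct toy_queue *q, struct toy_event *ev)
{
	assert(q != NULL && ev != NULL);
	if (q->count == 0)
		return -1;
	*ev = q->ev[q->head];
	q->head = (q->head + 1) % TOY_QUEUE_DEPTH;
	q->count--;
	return 0;
}

/* The embedded value encodes the flow and a sequence number. */
static uint64_t
make_tag(uint32_t flow_id, uint32_t seq)
{
	return ((uint64_t)flow_id << 32) | seq;
}

/*
 * Enqueue n tagged events, dequeue them again and return the number of
 * events whose tag or flow does not match; -1 on a queue error.
 */
static int
check_roundtrip(struct toy_queue *q, uint32_t flow_id, uint32_t n)
{
	int mismatches = 0;
	uint32_t i;

	for (i = 0; i < n; i++) {
		struct toy_event ev = {
			.flow_id = flow_id,
			.tag = make_tag(flow_id, i),
		};
		if (toy_enqueue(q, &ev) != 0)
			return -1;
	}
	for (i = 0; i < n; i++) {
		struct toy_event ev;

		if (toy_dequeue(q, &ev) != 0)
			return -1;
		if (ev.flow_id != flow_id || ev.tag != make_tag(flow_id, i))
			mismatches++;
	}
	return mismatches;
}
```

A check of this kind depends only on the enqueue and dequeue paths, so it
does not need any statistics support from the device under test.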

/Bruce


[dpdk-dev] [PATCH 7/7] examples/eventdev_pipeline: adding example

2016-11-22 Thread Richardson, Bruce


> -Original Message-
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Tuesday, November 22, 2016 6:02 AM
> To: Van Haaren, Harry 
> Cc: dev at dpdk.org; Eads, Gage ; Richardson, Bruce
> 
> Subject: Re: [dpdk-dev] [PATCH 7/7] examples/eventdev_pipeline: adding
> example
> 
> On Wed, Nov 16, 2016 at 06:00:07PM +, Harry van Haaren wrote:
> > This patch adds a sample app to the examples/ directory, which can be
> > used as a reference application and for general testing. The
> > application requires two ethdev ports and expects traffic to be
> > flowing. The application must be run with the --vdev flags as follows
> > to indicate to EAL that a virtual eventdev device called "evdev_sw0" is
> available to be used:
> >
> > ./build/eventdev_pipeline --vdev evdev_sw0
> >
> > The general flow of the traffic is as follows:
> >
> > Rx core -> Atomic Queue => 4 worker cores => TX core
> >
> > A scheduler core is required to do the packet scheduling, making this
> > configuration require 7 cores (Rx, Tx, Scheduler, and 4 workers).
> > Finally a master core brings the core count to 8 for this
> > configuration. The
> 
> Thanks for the example application. I will try to share my views on ethdev
> integration and usability perspective. Hope we can converge.

Hi Jerin, 

thanks for the feedback. We'll take it on board for a subsequent version
we produce. Additional comments and queries on your feedback inline below.

/Bruce

> 
> Some of the high level details first before getting into exact details.
> 
> 1) From the HW and ethdev integration perspective, the integrated NIC
> controllers do not need producer core(s) to push the event/packets to
> event queue. So, I was thinking to use 6WIND rte_flow spec to create the
> "ethdev port to event queue wiring" connection by extending the output
> ACTION definition, which specifies the event queue it needs to be enqueued
> to for the given ethdev port (something you are doing in application).
> 
> I guess the producer part of this example can be created as common code,
> somewhere in rte_flow/ethdev to reuse. We would also need this scheme
> when we deal with the external NICs + HW event manager use case
> 
Yes. This is something to consider.

For the pure-software model, we also might want to look at the opposite
approach, where we register an ethdev with the scheduler for it to "pull"
new packets from. This would allow it to bypass the port logic for the new
packets. 

An alternative for this is to extend the schedule API to allow a burst of
packets to be passed in to be scheduled immediately as "NEW" packets. The end
results should be the same, saving cycles by bypassing unneeded processing
for the new packets.

> The complete event driven model can be verified and exercised without
> integrating with eventdev subsystem. So I think maybe we need to focus
> on functional applications without ethdev to verify the eventdev features
> like (automatic multicore scaling, dynamic load balancing, pipelining,
> packet ingress order maintenance and synchronization services) and then
> integrate with ethdev

Yes, comprehensive unit tests will be needed too. But I also think an example
app that pulls packets from an external NIC is needed to get a feel
for the performance of the scheduler with real traffic.

> 
> > +   const unsigned cores_needed = num_workers +
> > +   /*main*/1 +
> > +   /*sched*/1 +
> > +   /*TX*/1 +
> > +   /*RX*/1;
> > +
> 
> 2) One of the prime aims of the event driven model is to remove the fixed
> function core mappings and enable automatic multicore scaling, dynamic
> load balancing etc. I will try to use an example in review section to show
> the method for removing "consumer core" in this case.

Yes, I agree, but unfortunately, for some tasks, distributing those tasks
across multiple cores can hurt performance overall due to resource contention.

> 
> > application can be configured for various numbers of flows and worker
> > cores. Run the application with -h for details.
> >
> > Signed-off-by: Gage Eads 
> > Signed-off-by: Bruce Richardson 
> > Signed-off-by: Harry van Haaren 
> > ---
> >  examples/eventdev_pipeline/Makefile |  49 +++
> >  examples/eventdev_pipeline/main.c   | 718
> 
> >  2 files changed, 767 insertions(+)
> >  create mode 100644 examples/eventdev_pipeline/Makefile
> >  create mode 100644 examples/eventdev_pipeline/main.c
> >
> > +static int sched_type = RTE_SCHED_TYPE_ATOMIC;
> 
> RTE_SCHED_TYPE_ORDERED makes sense as a default. Most common case will
> have ORDERED at first stage so that it can scale.
> 
> > +
> > +
> > +static int
> > +worker(void *arg)
> > +{
> > +   struct rte_event rcv_events[BATCH_SIZE];
> > +
> > +   struct worker_data *data = (struct worker_data *)arg;
> > +   uint8_t event_dev_id = data->event_dev_id;
> > +   uint8_t event_port_id = data->event_port_id;
> > +   int32_t qid = 

[dpdk-dev] [PATCH v2] i40e: Fix eth_i40e_dev_init sequence on ThunderX

2016-11-22 Thread Bruce Richardson
On Tue, Nov 22, 2016 at 03:46:38AM +0530, Jerin Jacob wrote:
> On Sun, Nov 20, 2016 at 11:21:43PM +, Ananyev, Konstantin wrote:
> > Hi
> > > 
> > > i40e_asq_send_command: rd32 & wr32 under ThunderX gives unpredictable
> > >results. To solve this include rte memory barriers
> > > 
> > > Signed-off-by: Satha Rao 
> > > ---
> > >  drivers/net/i40e/base/i40e_osdep.h | 14 ++
> > >  1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/drivers/net/i40e/base/i40e_osdep.h 
> > > b/drivers/net/i40e/base/i40e_osdep.h
> > > index 38e7ba5..ffa3160 100644
> > > --- a/drivers/net/i40e/base/i40e_osdep.h
> > > +++ b/drivers/net/i40e/base/i40e_osdep.h
> > > @@ -158,7 +158,13 @@ do { 
> > >\
> > >   ((volatile uint32_t *)((char *)(a)->hw_addr + (reg)))
> > >  static inline uint32_t i40e_read_addr(volatile void *addr)
> > >  {
> > > +#if defined(RTE_ARCH_ARM64)
> > > + uint32_t val = rte_le_to_cpu_32(I40E_PCI_REG(addr));
> > > + rte_rmb();
> > > + return val;
> > 
> > If you really need an rmb/wmb with MMIO read/writes on ARM,
> > I think you can avoid #ifdefs here and use rte_smp_rmb/rte_smp_wmb.
> > BTW, I suppose if you need it for i40e, you would need it for other devices 
> > too.
> 
> Yes. ARM would need it for all devices (typically, the devices on an
> external PCI bus).
> I guess rte_smp_rmb may not be the correct abstraction. We need rte_rmb()
> instead, as we need only the non-SMP variant on the IO side. I guess it then
> makes sense to
> create a new abstraction in EAL with the following variants so that each arch
> gets the opportunity to do what makes sense for that specific platform
> 
> rte_readb_relaxed
> rte_readw_relaxed
> rte_readl_relaxed
> rte_readq_relaxed
> rte_writeb_relaxed
> rte_writew_relaxed
> rte_writel_relaxed
> rte_writeq_relaxed
> rte_readb
> rte_readw
> rte_readl
> rte_readq
> rte_writeb
> rte_writew
> rte_writel
> rte_writeq
> 
> Thoughts ?
> 

That seems like a lot of API calls!
Perhaps you can clarify - why would the rte_smp_rmb() not work for you?
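
To make the relaxed versus ordered distinction in the proposed list
concrete, here is a rough sketch for one access width. It is an
illustration only: the io_* names are hypothetical (the thread only lists
candidate names), and the C11 fences merely stand in for the barrier a
given architecture would use, such as rte_rmb() in the patch under
discussion. A C11 fence is not by itself guaranteed to order device
accesses on every platform. Endianness conversion is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Relaxed 32-bit read: no ordering beyond the volatile access itself. */
static inline uint32_t
io_read32_relaxed(const volatile void *addr)
{
	assert(addr != NULL);
	return *(const volatile uint32_t *)addr;
}

/* Relaxed 32-bit write: no ordering beyond the volatile access itself. */
static inline void
io_write32_relaxed(uint32_t value, volatile void *addr)
{
	assert(addr != NULL);
	*(volatile uint32_t *)addr = value;
}

/*
 * Ordered read: access first, barrier second, so that later loads are
 * not moved ahead of the register read. The fence is a placeholder for
 * the architecture-specific read barrier.
 */
static inline uint32_t
io_read32(const volatile void *addr)
{
	uint32_t val = io_read32_relaxed(addr);

	atomic_thread_fence(memory_order_acquire);
	return val;
}

/*
 * Ordered write: barrier first, access second, so that earlier stores
 * are not moved after the register write. The fence is a placeholder
 * for the architecture-specific write barrier.
 */
static inline void
io_write32(uint32_t value, volatile void *addr)
{
	atomic_thread_fence(memory_order_release);
	io_write32_relaxed(value, addr);
}
```

With this split, an architecture that needs no extra barrier for device
access could define the ordered variants as the relaxed ones, which keeps
the #ifdef out of the drivers.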

/Bruce


[dpdk-dev] [PATCH v2 1/8] eal: define container_of macro

2016-11-22 Thread Jan Viktorin
On Tue, 22 Nov 2016 11:26:50 +
Shreyansh Jain  wrote:

> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Tuesday, November 22, 2016 3:50 PM
> > To: Shreyansh Jain ; Jan Blunck
> > 
> > Cc: dev at dpdk.org; david.marchand at 6wind.com; Jan Viktorin
> > 
> > Subject: Re: [dpdk-dev] [PATCH v2 1/8] eal: define container_of macro
> > 
> > 2016-11-22 15:33, Shreyansh Jain:  
> > > On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:  
> > > > This macro is based on Jan Viktorin's original patch but also checks the
> > > > type of the passed pointer against the type of the member.
> > > >
> > > > Signed-off-by: Jan Viktorin 
> > > > Signed-off-by: Shreyansh Jain 
> > > > [jblunck at infradead.org: add type checking and __extension__]
> > > > Signed-off-by: Jan Blunck   
> > >
> > > I will start using this in my patchset.
> > >
> > > Acked-by: Shreyansh Jain   
> > 
> > It is a bit strange to have this patch in a series which does
> > not use it. I am in favor of getting it when it is used
> > (and included) in another series.  
> 
> I can add this patch to my series, if Jan is ok about this.

It's OK. Just merge it someday ;).

Jan

> 
> -
> Shreyansh



-- 
  Jan Viktorin                E-mail: Viktorin at RehiveTech.com
  System Architect            Web:    www.RehiveTech.com
  RehiveTech
  Brno, Czech Republic


[dpdk-dev] [PATCH v2 1/8] eal: define container_of macro

2016-11-22 Thread Shreyansh Jain
> -Original Message-
> From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> Sent: Tuesday, November 22, 2016 6:03 PM
> To: Shreyansh Jain 
> Cc: Thomas Monjalon ; Jan Blunck
> ; dev at dpdk.org; david.marchand at 6wind.com
> Subject: Re: [dpdk-dev] [PATCH v2 1/8] eal: define container_of macro
> 
> On Tue, 22 Nov 2016 11:26:50 +
> Shreyansh Jain  wrote:
> 
> > > -Original Message-
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > Sent: Tuesday, November 22, 2016 3:50 PM
> > > To: Shreyansh Jain ; Jan Blunck
> > > 
> > > Cc: dev at dpdk.org; david.marchand at 6wind.com; Jan Viktorin
> > > 
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/8] eal: define container_of macro
> > >
> > > 2016-11-22 15:33, Shreyansh Jain:
> > > > On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:
> > > > > This macro is based on Jan Viktorin's original patch but also checks
> the
> > > > > type of the passed pointer against the type of the member.
> > > > >
> > > > > Signed-off-by: Jan Viktorin 
> > > > > Signed-off-by: Shreyansh Jain 
> > > > > [jblunck at infradead.org: add type checking and __extension__]
> > > > > Signed-off-by: Jan Blunck 
> > > >
> > > > I will start using this in my patchset.
> > > >
> > > > Acked-by: Shreyansh Jain 
> > >
> > > It is a bit strange to have this patch in a series which does
> > > not use it. I am in favor of getting it when it is used
> > > (and included) in another series.
> >
> > I can add this patch to my series, if Jan is ok about this.
> 
> It's OK. Just merge it someday ;).

Actually, I meant Jan Blunck ;D
I have already been using your patch for a long time.

> 
> Jan
> 
> >
> > -
> > Shreyansh
> 
> 
> 
> --
>   Jan Viktorin                E-mail: Viktorin at RehiveTech.com
>   System Architect            Web:    www.RehiveTech.com
>   RehiveTech
>   Brno, Czech Republic


[dpdk-dev] [PATCH] mk: remove make target for examples

2016-11-22 Thread Ferruh Yigit
On 11/22/2016 9:38 AM, Thomas Monjalon wrote:
> 2016-11-22 00:34, Ferruh Yigit:
>> On 11/21/2016 11:47 PM, Thomas Monjalon wrote:
>>> The command
>>>   make examples
>>> works only if target directories have the exact name of configs.
>>>
>>> It is more flexible to use
>>>   make -C examples RTE_SDK=$(pwd) RTE_TARGET=build
>>>
>>> Signed-off-by: Thomas Monjalon 
>>
>> Instead of removing examples & examples_clean targets, what do you think
>> about keeping them as wrappers around the suggested usage, for backward
>> compatibility?
>>
>> Something like:
>> "
>> BUILDING_RTE_SDK :=
>> export BUILDING_RTE_SDK
>>
>> # Build directory is given with O=
>> O ?= $(RTE_SDK)/examples
>>
>> # Target for which examples should be built.
>> T ?= build
>>
>> .PHONY: examples
>> examples:
>> @echo == Build examples for $(T)
>> $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T);
>>
>> .PHONY: examples_clean
>> examples_clean:
>> @echo == Clean examples for $(T)
>> $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T) clean;
>> "
> 
> What is the benefit of this makefile? Just remove -C ?

To keep existing targets, in case somebody uses them.

> It is not compatible with the old behaviour, so I'm afraid it would be
> confusing for no real benefit.

Right, not fully compatible, but still can do:
make examples / make examples_clean
make examples T=x86_64-native-linuxapp-gcc

Overall, if you believe keeping them is confusing, I am OK with removing them;
we may just need to update doc/build-sdk-quick.txt to fix "make help" output.

Thanks,
ferruh


[dpdk-dev] [PATCH v2] drivers: advertise kmod dependencies in pmdinfo

2016-11-22 Thread Olivier Matz
Hi Adrien,

On 11/22/2016 11:27 AM, Adrien Mazarguil wrote:
> Hi Olivier,
> 
> Neither mlx4 nor mlx5 depend on igb/uio/vfio modules, please see below.
> 
> On Tue, Nov 22, 2016 at 10:50:57AM +0100, Olivier Matz wrote:
>> Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
>> declare the list of kernel modules required to run properly.
>>
>> Today, most PCI drivers require uio/vfio.
>>
>> Signed-off-by: Olivier Matz 
>> Acked-by: Fiona Trahe 
>> ---
> [...]
>> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
>> index da61a85..a0065bf 100644
>> --- a/drivers/net/mlx4/mlx4.c
>> +++ b/drivers/net/mlx4/mlx4.c
>> @@ -5937,3 +5937,4 @@ rte_mlx4_pmd_init(void)
>>  
>>  RTE_PMD_EXPORT_NAME(net_mlx4, __COUNTER__);
>>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx4, mlx4_pci_id_map);
>> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* igb_uio | uio_pci_generic | vfio");
> 
> RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* ib_uverbs & mlx4_en & mlx4_core & 
> mlx4_ib");
> 
>> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
>> index 90cc35e..b0343f3 100644
>> --- a/drivers/net/mlx5/mlx5.c
>> +++ b/drivers/net/mlx5/mlx5.c
>> @@ -759,3 +759,4 @@ rte_mlx5_pmd_init(void)
>>
>>  RTE_PMD_EXPORT_NAME(net_mlx5, __COUNTER__);
>>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx5, mlx5_pci_id_map);
>> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* igb_uio | uio_pci_generic | vfio");
> 
> RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* ib_uverbs & mlx5_core & mlx5_ib");
> 

Thank you for reviewing. I messed up in the rebase, the v1 was
closer to what you suggest, sorry. I'll send an update.

Olivier


[dpdk-dev] [PATCH v2] drivers: advertise kmod dependencies in pmdinfo

2016-11-22 Thread Adrien Mazarguil
Hi Olivier,

Neither mlx4 nor mlx5 depend on igb/uio/vfio modules, please see below.

On Tue, Nov 22, 2016 at 10:50:57AM +0100, Olivier Matz wrote:
> Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
> declare the list of kernel modules required to run properly.
> 
> Today, most PCI drivers require uio/vfio.
> 
> Signed-off-by: Olivier Matz 
> Acked-by: Fiona Trahe 
> ---
[...]
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index da61a85..a0065bf 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -5937,3 +5937,4 @@ rte_mlx4_pmd_init(void)
>  
>  RTE_PMD_EXPORT_NAME(net_mlx4, __COUNTER__);
>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx4, mlx4_pci_id_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* igb_uio | uio_pci_generic | vfio");

RTE_PMD_REGISTER_KMOD_DEP(net_mlx4, "* ib_uverbs & mlx4_en & mlx4_core & 
mlx4_ib");

> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 90cc35e..b0343f3 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -759,3 +759,4 @@ rte_mlx5_pmd_init(void)
> 
>  RTE_PMD_EXPORT_NAME(net_mlx5, __COUNTER__);
>  RTE_PMD_REGISTER_PCI_TABLE(net_mlx5, mlx5_pci_id_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* igb_uio | uio_pci_generic | vfio");

RTE_PMD_REGISTER_KMOD_DEP(net_mlx5, "* ib_uverbs & mlx5_core & mlx5_ib");

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH v2 1/8] eal: define container_of macro

2016-11-22 Thread Shreyansh Jain
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, November 22, 2016 3:50 PM
> To: Shreyansh Jain ; Jan Blunck
> 
> Cc: dev at dpdk.org; david.marchand at 6wind.com; Jan Viktorin
> 
> Subject: Re: [dpdk-dev] [PATCH v2 1/8] eal: define container_of macro
> 
> 2016-11-22 15:33, Shreyansh Jain:
> > On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:
> > > This macro is based on Jan Viktorin's original patch but also checks the
> > > type of the passed pointer against the type of the member.
> > >
> > > Signed-off-by: Jan Viktorin 
> > > Signed-off-by: Shreyansh Jain 
> > > [jblunck at infradead.org: add type checking and __extension__]
> > > Signed-off-by: Jan Blunck 
> >
> > I will start using this in my patchset.
> >
> > Acked-by: Shreyansh Jain 
> 
> It is a bit strange to have this patch in a series which does
> not use it. I am in favor of getting it when it is used
> (and included) in another series.

I can add this patch to my series, if Jan is OK with this.

-
Shreyansh


[dpdk-dev] [PATCH v2 1/8] eal: define container_of macro

2016-11-22 Thread Thomas Monjalon
2016-11-22 15:33, Shreyansh Jain:
> On Monday 21 November 2016 10:25 PM, Jan Blunck wrote:
> > This macro is based on Jan Viktorin's original patch but also checks the
> > type of the passed pointer against the type of the member.
> >
> > Signed-off-by: Jan Viktorin 
> > Signed-off-by: Shreyansh Jain 
> > [jblunck at infradead.org: add type checking and __extension__]
> > Signed-off-by: Jan Blunck 
> 
> I will start using this in my patchset.
> 
> Acked-by: Shreyansh Jain 

It is a bit strange to have this patch in a series which does
not use it. I am in favor of getting it when it is used
(and included) in another series.
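
For readers who have not seen the macro under discussion, here is a
minimal sketch of a type-checked container_of built on the GNU C
extensions named in the commit note (__extension__ with a statement
expression, plus typeof). It is not the patch itself: the macro name
demo_container_of and the demo_device/demo_driver types are made up for
illustration, and it assumes gcc or clang.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the macro discussed in this thread.
 * The first statement assigns ptr to a pointer of the member's type, so
 * the compiler diagnoses a caller that passes the wrong pointer type.
 * The second statement steps back from the member to the enclosing object.
 */
#define demo_container_of(ptr, type, member) __extension__ ({        \
	const __typeof__(((type *)0)->member) *ptr_ = (ptr);          \
	(type *)((uintptr_t)ptr_ - offsetof(type, member));           \
})

/* Illustrative types only; they do not exist in DPDK. */
struct demo_driver {
	int id;
};

struct demo_device {
	const char *name;
	struct demo_driver drv;   /* embedded member */
};

/* Recover the enclosing device from a pointer to its embedded driver. */
static struct demo_device *device_of(struct demo_driver *drv)
{
	return demo_container_of(drv, struct demo_device, drv);
}
```

Calling device_of(&dev.drv) on a struct demo_device named dev should
give back &dev. Passing a pointer of another type, such as an int *,
trips the assignment on the first line of the macro.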


[dpdk-dev] [PATCH] mempool: fix API documentation

2016-11-22 Thread Olivier Matz
A previous commit changed the local_cache table into a
pointer, reducing the size of the rte_mempool structure.

Fix the API comment of rte_mempool_create() related to
this modification.

Fixes: 213af31e0960 ("mempool: reduce structure size if no cache needed")

Signed-off-by: Olivier Matz 
---
 lib/librte_mempool/rte_mempool.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 440f3b1..956ce04 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -610,9 +610,7 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
  *   never be used. The access to the per-lcore table is of course
  *   faster than the multi-producer/consumer pool. The cache can be
  *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
+ *   avoid losing objects in cache.
  * @param private_data_size
  *   The size of the private data appended after the mempool
  *   structure. This is useful for storing some private data after the
-- 
2.8.1



[dpdk-dev] [PATCH v2] drivers: advertise kmod dependencies in pmdinfo

2016-11-22 Thread Olivier Matz
Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.

Today, most PCI drivers require uio/vfio.

Signed-off-by: Olivier Matz 
Acked-by: Fiona Trahe 
---

v1 -> v2:   
 
- do not advertise uio_pci_generic for vf drivers
- rebase on top of head: use new driver names and prefix
  macro with RTE_   


rfc -> v1:
- the kmod information can be per-device using a modalias-like
  pattern
- change syntax to use '&' and '|' instead of ',' and ':'
- remove useless prerequisites in kmod list: no need to
  specify both uio and uio_pci_generic, only the latter is
  required
- update kmod list in szedata2 driver
- remove kmod list in qat driver: it requires more than just loading
  a kmod, which is described in documentation


 buildtools/pmdinfogen/pmdinfogen.c  |  1 +
 buildtools/pmdinfogen/pmdinfogen.h  |  1 +
 drivers/net/bnx2x/bnx2x_ethdev.c|  2 ++
 drivers/net/bnxt/bnxt_ethdev.c  |  1 +
 drivers/net/cxgbe/cxgbe_ethdev.c|  1 +
 drivers/net/e1000/em_ethdev.c   |  1 +
 drivers/net/e1000/igb_ethdev.c  |  2 ++
 drivers/net/ena/ena_ethdev.c|  1 +
 drivers/net/enic/enic_ethdev.c  |  1 +
 drivers/net/fm10k/fm10k_ethdev.c|  1 +
 drivers/net/i40e/i40e_ethdev.c  |  1 +
 drivers/net/i40e/i40e_ethdev_vf.c   |  1 +
 drivers/net/ixgbe/ixgbe_ethdev.c|  2 ++
 drivers/net/mlx4/mlx4.c |  1 +
 drivers/net/mlx5/mlx5.c |  1 +
 drivers/net/nfp/nfp_net.c   |  1 +
 drivers/net/qede/qede_ethdev.c  |  2 ++
 drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
 drivers/net/thunderx/nicvf_ethdev.c |  1 +
 drivers/net/virtio/virtio_ethdev.c  |  1 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c|  1 +
 lib/librte_eal/common/include/rte_dev.h | 25 +
 tools/dpdk-pmdinfo.py   |  5 -
 23 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/buildtools/pmdinfogen/pmdinfogen.c b/buildtools/pmdinfogen/pmdinfogen.c
index 59ab956..5129c57 100644
--- a/buildtools/pmdinfogen/pmdinfogen.c
+++ b/buildtools/pmdinfogen/pmdinfogen.c
@@ -269,6 +269,7 @@ struct opt_tag {

 static const struct opt_tag opt_tags[] = {
{"_param_string_export", "params"},
+   {"_kmod_dep_export", "kmod"},
 };

 static int complete_pmd_entry(struct elf_info *info, struct pmd_driver *drv)
diff --git a/buildtools/pmdinfogen/pmdinfogen.h b/buildtools/pmdinfogen/pmdinfogen.h
index 1da2966..2fab2aa 100644
--- a/buildtools/pmdinfogen/pmdinfogen.h
+++ b/buildtools/pmdinfogen/pmdinfogen.h
@@ -85,6 +85,7 @@ else \

 enum opt_params {
PMD_PARAM_STRING = 0,
+   PMD_KMOD_DEP,
PMD_OPT_MAX
 };

diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 0eae433..0f1e4a2 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -643,5 +643,7 @@ static struct eth_driver rte_bnx2xvf_pmd = {

 RTE_PMD_REGISTER_PCI(net_bnx2x, rte_bnx2x_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2x, pci_id_bnx2x_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnx2x, "* igb_uio | uio_pci_generic | vfio");
 RTE_PMD_REGISTER_PCI(net_bnx2xvf, rte_bnx2xvf_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnx2xvf, pci_id_bnx2xvf_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnx2xvf, "* igb_uio | vfio");
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 035fe07..a24e153 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1173,3 +1173,4 @@ static struct eth_driver bnxt_rte_pmd = {

 RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnxt, bnxt_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnxt, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index b7f28eb..317598d 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1050,3 +1050,4 @@ static struct eth_driver rte_cxgbe_pmd = {

 RTE_PMD_REGISTER_PCI(net_cxgbe, rte_cxgbe_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_cxgbe, cxgb4_pci_tbl);
+RTE_PMD_REGISTER_KMOD_DEP(net_cxgbe, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index aee3d34..866a5cf 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1807,3 +1807,4 @@ eth_em_set_mc_addr_list(struct rte_eth_dev *dev,

 RTE_PMD_REGISTER_PCI(net_e1000_em, rte_em_pmd.pci_drv);
 RTE_PMD_REGISTER_PCI_TABLE(net_e1000_em, pci_id_em_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_e1000_em, "* igb_uio | uio_pci_generic | vfio");
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 2fddf0c..08f2a68 100644
--- 

[dpdk-dev] [PATCH] mk: remove make target for examples

2016-11-22 Thread Thomas Monjalon
2016-11-22 00:34, Ferruh Yigit:
> On 11/21/2016 11:47 PM, Thomas Monjalon wrote:
> > The command
> >   make examples
> > works only if target directories have the exact name of configs.
> > 
> > It is more flexible to use
> >   make -C examples RTE_SDK=$(pwd) RTE_TARGET=build
> > 
> > Signed-off-by: Thomas Monjalon 
> 
> Instead of removing the examples & examples_clean targets, what do you think about
> keeping them as wrappers around the suggested usage, for backward compatibility?
> 
> Something like:
> "
> BUILDING_RTE_SDK :=
> export BUILDING_RTE_SDK
> 
> # Build directory is given with O=
> O ?= $(RTE_SDK)/examples
> 
> # Target for which examples should be built.
> T ?= build
> 
> .PHONY: examples
> examples:
> @echo == Build examples for $(T)
> $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T);
> 
> .PHONY: examples_clean
> examples_clean:
> @echo == Clean examples for $(T)
> $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T) clean;
> "

What is the benefit of this makefile? Just to remove -C?
It is not compatible with the old behaviour, so I'm afraid it would be
confusing for no real benefit.


[dpdk-dev] [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error

2016-11-22 Thread Olivier Matz
Hi Hemant,

Back on this topic, please see some comments below.

On 11/07/2016 01:30 PM, Hemant Agrawal wrote:
> Hi Olivier,
>   
>> -Original Message-
>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>> Sent: Friday, October 14, 2016 5:41 PM
>>> On 9/22/2016 6:42 PM, Hemant Agrawal wrote:
 Hi Olivier

 On 9/19/2016 7:27 PM, Olivier Matz wrote:
> Hi Hemant,
>
> On 09/16/2016 06:46 PM, Hemant Agrawal wrote:
>> In the rte_pktmbuf_pool_create, if the default external mempool is
>> not available, the implementation can default to "ring_mp_mc",
>> which is a software implementation.
>>
>> Signed-off-by: Hemant Agrawal 
>> ---
>> Changes in V3:
>> * adding warning message to say that falling back to default sw
>> pool
>> ---
>>  lib/librte_mbuf/rte_mbuf.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/lib/librte_mbuf/rte_mbuf.c
>> b/lib/librte_mbuf/rte_mbuf.c index 4846b89..8ab0eb1 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name,
>> unsigned n,
>>
>>  rte_errno = rte_mempool_set_ops_byname(mp,
>>  RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL);
>> +
>> +/* on error, try falling back to the software based default
>> pool */
>> +if (rte_errno == -EOPNOTSUPP) {
>> +RTE_LOG(WARNING, MBUF, "Default HW Mempool not supported. "
>> +"falling back to sw mempool \"ring_mp_mc\"");
>> +rte_errno = rte_mempool_set_ops_byname(mp, "ring_mp_mc",
>> NULL);
>> +}
>> +
>>  if (rte_errno != 0) {
>>  RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
>>  return NULL;
>>
>
> Without adding a new method ".supported()", the first call to
> rte_mempool_populate() could return the same error ENOTSUP. In this
> case, it is still possible to fallback.
>
 It will be a bit late.

 On failure, then we have to set the default ops and do a goto before
 rte_pktmbuf_pool_init(mp, &mbp_priv);
>>
>> I still think we can do the job without adding the .supported() method.
>> The following code is just an (untested) example:
>>
>> struct rte_mempool *
>> rte_pktmbuf_pool_create(const char *name, unsigned n,
>> unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
>> int socket_id)
>> {
>> struct rte_mempool *mp;
>> struct rte_pktmbuf_pool_private mbp_priv;
>> unsigned elt_size;
>> int ret;
>> const char *ops[] = {
>> RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL,
>> };
>> const char **op;
>>
>> if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
>> RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
>> priv_size);
>> rte_errno = EINVAL;
>> return NULL;
>> }
>> elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
>> (unsigned)data_room_size;
>> mbp_priv.mbuf_data_room_size = data_room_size;
>> mbp_priv.mbuf_priv_size = priv_size;
>>
>> for (op = &ops[0]; *op != NULL; op++) {
>> mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
>> sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
>> if (mp == NULL)
>> return NULL;
>>
>> ret = rte_mempool_set_ops_byname(mp, *op, NULL);
>> if (ret != 0) {
>> RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
>> rte_mempool_free(mp);
>> if (ret == -ENOTSUP)
>> continue;
>> rte_errno = -ret;
>> return NULL;
>> }
>> rte_pktmbuf_pool_init(mp, &mbp_priv);
>>
>> ret = rte_mempool_populate_default(mp);
>> if (ret < 0) {
>> rte_mempool_free(mp);
>> if (ret == -ENOTSUP)
>> continue;
>> rte_errno = -ret;
>> return NULL;
>> }
>> }
>>
>> rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);
>>
>> return mp;
>> }
>>
>>
> [Hemant]  This looks fine to me. Please submit a patch for the same.
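
The fallback idea in the quoted example can be isolated from the mempool
API. The sketch below is a self-contained illustration, not DPDK code:
demo_set_ops_byname() is a hypothetical stub that pretends only
"ring_mp_mc" is available, and "hw_pool" in the usage note is a made-up
handler name. It shows the control flow only: try each name in order,
move on when the error is ENOTSUP, give up on any other error, and stop
at the first handler that works.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stub: pretend only the software handler is available. */
static int demo_set_ops_byname(const char *name)
{
	return strcmp(name, "ring_mp_mc") == 0 ? 0 : -ENOTSUP;
}

/*
 * Try each handler name in a NULL-terminated list, in order.
 * Returns the name that worked, or NULL with *err set to a positive
 * errno value describing why no handler could be selected.
 */
static const char *demo_pick_ops(const char *const *ops, int *err)
{
	const char *const *op;

	*err = ENOTSUP;   /* result if every candidate is unsupported */
	for (op = ops; *op != NULL; op++) {
		int ret = demo_set_ops_byname(*op);

		if (ret == 0) {
			*err = 0;
			return *op;    /* stop at the first handler that works */
		}
		if (ret != -ENOTSUP) {
			*err = -ret;   /* hard failure: do not try other handlers */
			return NULL;
		}
		/* -ENOTSUP: try the next candidate */
	}
	return NULL;
}
```

With the list { "hw_pool", "ring_mp_mc", NULL } the stub rejects the
first name and the function returns "ring_mp_mc". Returning from inside
the loop on success matters: a loop of this shape that keeps iterating
after a success would go on to set up the next handler as well.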
> 
> I've just submitted an RFC, which I think is quite linked:
> http://dpdk.org/ml/archives/dev/2016-September/046974.html
> Assuming a new parameter "mempool_ops" is added to
> rte_pktmbuf_pool_create(), would it make sense to fallback to
> "ring_mp_mc"? What about just returning ENOTSUP? The application
> could do the job and decide which sw fallback to use.

 We ran into this issue when trying to run the standard DPDK examples
 (l3fwd) in a VM. Do you think it is practical to add fallback handling
 in each of the DPDK examples?
>>
>> OK. What is still unclear to me is how the software is aware of the different
>> hardware-assisted handlers. Moreover, we could imagine more software
>> handlers, which could 

[dpdk-dev] [PATCH] virtio: tx with can_push when VERSION_1 is set

2016-11-22 Thread Pierre Pfister (ppfister)
Hello Maxime,

> On 9 Nov 2016, at 15:51, Maxime Coquelin wrote:
> 
> Hi Pierre,
> 
> On 11/09/2016 01:42 PM, Pierre Pfister (ppfister) wrote:
>> Hello Maxime,
>> 
>> Sorry for the late reply.
>> 
>> 
>>> On 8 Nov 2016, at 10:44, Maxime Coquelin wrote:
>>> 
>>> Hi Pierre,
>>> 
>>> On 11/08/2016 10:31 AM, Pierre Pfister (ppfister) wrote:
 Current virtio driver advertises VERSION_1 support,
 but does not handle device's VERSION_1 support when
 sending packets (it looks for ANY_LAYOUT feature,
 which is absent).
 
 This patch enables 'can_push' in tx path when VERSION_1
 is advertised by the device.
 
 This significantly improves small packets forwarding rate
 towards devices advertising VERSION_1 feature.
>>> I think it depends whether offloading is enabled or not.
>>> If no offloading enabled, I measured significant drop.
>>> Indeed, when no offloading is enabled, the Tx path in Virtio
>>> does not access the virtio header before your patch, as the header is 
>>> memset to zero at device init time.
>>> With your patch, it gets memset to zero at every transmit in the hot
>>> path.
>> 
>> Right. On the virtio side that is true, but on the device side, we have to 
>> access the header anyway.
> No more now, if no offload features have been negotiated.
> I have done a patch that landed in v16.11 to skip header parsing in
> this case.
> That said, we still have to access its descriptor.
> 
>> And accessing two descriptors (with the address resolution and memory fetch
>> which comes with it)
>> is a costly operation compared to a single one.
>> In the case indirect descriptors are used, this is 1 desc access instead of
>> 3.
> I agree this is far from being optimal.
> 
>> And in the case chained descriptors are used, this doubles the number of 
>> packets that you can put in your queue.
>> 
>> Those are the results in my PHY -> VM (testpmd) -> PHY setup
>> Traffic is flowing bidirectionally. Numbers are for lossless-rates.
>> 
>> When chained buffers are used for dpdk's TX: 2x2.13Mpps
>> When indirect descriptors are used for dpdk's TX: 2x2.38Mpps
>> When shallow buffers are used for dpdk's TX (with this patch): 2x2.42Mpps
> When I tried it, I also did PVP 0% benchmark, and I got opposite results. 
> Chained and indirect cases were significantly better.
> 
> My PVP setup was using a single NIC and single Virtio PMD, and NIC2VM
> forwarding was IO mode done with testpmd on host, and Rx->Tx forwarding
> was macswap mode on guest side.
> 
> I also saw some perf regression when running simple testpmd test on both
> ends.
> 
> Yuanhan, did you run some benchmark with your series enabling
> ANY_LAYOUT?

It was enabled. But the specs specify that VERSION_1 includes ANY_LAYOUT.
Therefore, QEMU removes ANY_LAYOUT when VERSION_1 is set.

We can keep arguing about which is fastest. I guess we have different setups 
and different results, so we probably are deadlocked here.
But in any case, the current code is inconsistent, as it uses single descriptor 
when ANY_LAYOUT is set, but not when VERSION_1 is set.

I believe it makes sense to use single-descriptor any time it is possible, but 
you are free to think otherwise.
Please make a call and make the code consistent (remove single descriptors
altogether, or use them when VERSION_1 is set too). Otherwise it just creates
yet another testing headache.
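
To make the requested consistency concrete, here is a small sketch of a
push decision that treats VERSION_1 the same way as ANY_LAYOUT. It is an
illustration under assumptions, not the driver code: the demo_ names are
hypothetical, and the single-segment and headroom checks are a
simplified stand-in for whatever conditions the real Tx path applies.
Only the two feature bit numbers (27 and 32) come from the virtio
specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define DEMO_VIRTIO_F_ANY_LAYOUT 27
#define DEMO_VIRTIO_F_VERSION_1  32

/* True if the given feature bit is set in the negotiated feature mask. */
static bool demo_has_feature(uint64_t features, unsigned int bit)
{
	return (features & (UINT64_C(1) << bit)) != 0;
}

/*
 * Decide whether the virtio-net header may be placed in the packet
 * headroom so that header and data share a single descriptor.
 * Either ANY_LAYOUT or VERSION_1 is accepted as permission to do so.
 */
static bool demo_can_push(uint64_t features, unsigned int nb_segs,
			  unsigned int headroom, unsigned int hdr_size)
{
	bool any_layout =
		demo_has_feature(features, DEMO_VIRTIO_F_ANY_LAYOUT) ||
		demo_has_feature(features, DEMO_VIRTIO_F_VERSION_1);

	return any_layout && nb_segs == 1 && headroom >= hdr_size;
}
```

With this shape, a device that advertises only VERSION_1 (the QEMU case
described above) takes the same single-descriptor path as one that
advertises ANY_LAYOUT.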

Thanks,

- Pierre 

> 
>> 
>> I must also note that qemu 2.5 does not seem to deal with VERSION_1 and 
>> ANY_LAYOUT correctly.
>> The patch I am proposing here works for qemu 2.7, but with qemu 2.5, testpmd 
>> still behaves as if ANY_LAYOUT (or VERSION_1) was not available. This is not 
>> catastrophic. But just note that you will not see performance in some cases 
>> with qemu 2.5.
> 
> Thanks for the info.
> 
> Regards,
> Maxime



[dpdk-dev] [PATCH v2] examples/ethtool: fix bug in drvinfo callback

2016-11-22 Thread Qiming Yang
Function pcmd_drvinfo_callback uses struct info to get
the ethtool information of each port. Struct info keeps
the information of the previous port until it is
overwritten. This patch fixes the issue by clearing struct info before each query.

Fixes: bda68ab9d1e7 ("examples/ethtool: add user-space ethtool sample application")

Signed-off-by: Qiming Yang 
---
v2 changes:
* fixed the spelling mistake in commit log
---
---
 examples/ethtool/ethtool-app/ethapp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/ethtool/ethtool-app/ethapp.c b/examples/ethtool/ethtool-app/ethapp.c
index 9b77385..192d941 100644
--- a/examples/ethtool/ethtool-app/ethapp.c
+++ b/examples/ethtool/ethtool-app/ethapp.c
@@ -177,6 +177,7 @@ pcmd_drvinfo_callback(__rte_unused void *ptr_params,
int id_port;

for (id_port = 0; id_port < rte_eth_dev_count(); id_port++) {
+   memset(&info, 0, sizeof(info));
if (rte_ethtool_get_drvinfo(id_port, &info)) {
printf("Error getting info for port %i\n", id_port);
return;
-- 
2.7.4
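
The failure mode this patch addresses is easy to reproduce outside DPDK.
The sketch below is self-contained and uses made-up names (demo_drvinfo,
demo_get_drvinfo): the query function fills in a firmware version for
port 0 only, the way a real driver might leave a field untouched. When
the struct is reused across loop iterations without clearing it, port 1
appears to have port 0's firmware version.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative info structure; not the ethtool one. */
struct demo_drvinfo {
	char driver[32];
	char fw_version[32];
};

/* Hypothetical query: port 0 reports a firmware version, port 1 does not. */
static int demo_get_drvinfo(int port, struct demo_drvinfo *info)
{
	snprintf(info->driver, sizeof(info->driver), "drv%d", port);
	if (port == 0)
		snprintf(info->fw_version, sizeof(info->fw_version), "1.2.3");
	return 0;
}

/*
 * Query ports 0..last_port in order, reusing one struct, and copy out
 * what was seen for last_port. With clear_first set, the struct is
 * zeroed before every query, which is the approach the patch takes.
 */
static void demo_query_last(int last_port, int clear_first,
			    struct demo_drvinfo *out)
{
	struct demo_drvinfo info;
	int port;

	memset(&info, 0, sizeof(info));
	for (port = 0; port <= last_port; port++) {
		if (clear_first)
			memset(&info, 0, sizeof(info));
		demo_get_drvinfo(port, &info);
	}
	*out = info;
}
```

Calling demo_query_last(1, 0, &out) leaves the stale "1.2.3" in
out.fw_version, while demo_query_last(1, 1, &out) leaves it empty.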



[dpdk-dev] [RFC PATCH 0/7] RFC: EventDev Software PMD

2016-11-22 Thread Jerin Jacob
On Mon, Nov 21, 2016 at 09:48:56AM +, Bruce Richardson wrote:
> On Sat, Nov 19, 2016 at 03:53:25AM +0530, Jerin Jacob wrote:
> > On Thu, Nov 17, 2016 at 10:05:07AM +, Bruce Richardson wrote:
> > > > 2) device stats API can be based on capability, HW implementations may not
> > > > support all the stats
> > > 
> > > Yes, this is something we were thinking about. It would be nice if we
> > > could at least come up with a common set of stats - maybe even ones
> > > tracked at an eventdev API level, e.g. nb enqueues/dequeues. As well as
> > > that, we think the idea of an xstats API, like in ethdev, might work
> > > well. For our software implementation, having visibility into the
> > > scheduler behaviour can be important, so we'd like a way to report out
> > > things like internal queue depths etc.
> > >
> > 
> > Since this is not very generic hardware, I am not sure how much sense it makes
> > to have a generic stats API. But something similar to ethdev's xstats (any
> > capability-based scheme) works well. I look forward to seeing an API proposal
> > with common code.
> > 
> > Jerin
> > 
> Well, to start off with, some stats that could be tracked at the API
> level could be common. What about counts of number of enqueues and
> dequeues?
> 
> I suppose the other way we can look at this is: once we get a few
> implementations of the interface, we can look at the provided xstats
> values from each one, and see if there is anything common between them.

That makes more sense to me, as we don't have proposed counts. In that case,
I think we should not use stats for functional tests as proposed. We could
verify functional tests by embedding some value in the event object on
enqueue and checking for the same value later on dequeue.
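
As a rough illustration of that scheme, the sketch below stamps each
event with a known value before enqueue and verifies it on dequeue. It
does not use the eventdev API: demo_queue is a trivial single-threaded
FIFO standing in for a device, and all names are hypothetical. A real
eventdev need not preserve FIFO order across flows, so an actual test
would match stamps per flow rather than globally as done here.

```c
#include <stdint.h>

#define DEMO_RING_SIZE 8   /* must be a power of two */
#define DEMO_STAMP     UINT64_C(0xC0FFEE00)

struct demo_event {
	uint64_t cookie;   /* value stamped by the test before enqueue */
};

struct demo_queue {
	struct demo_event slots[DEMO_RING_SIZE];
	unsigned int head;   /* next slot to dequeue */
	unsigned int tail;   /* next slot to fill */
};

/* Returns 0 on success, -1 if the queue is full. */
static int demo_enqueue(struct demo_queue *q, struct demo_event ev)
{
	if (q->tail - q->head == DEMO_RING_SIZE)
		return -1;
	q->slots[q->tail++ % DEMO_RING_SIZE] = ev;
	return 0;
}

/* Returns 0 on success, -1 if the queue is empty. */
static int demo_dequeue(struct demo_queue *q, struct demo_event *ev)
{
	if (q->head == q->tail)
		return -1;
	*ev = q->slots[q->head++ % DEMO_RING_SIZE];
	return 0;
}

/*
 * Enqueue n stamped events, then dequeue n and compare each stamp.
 * Returns the number of failed operations or mismatched stamps.
 */
static unsigned int demo_check_roundtrip(struct demo_queue *q, unsigned int n)
{
	struct demo_event ev;
	unsigned int i, bad = 0;

	for (i = 0; i < n; i++) {
		ev.cookie = DEMO_STAMP + i;
		if (demo_enqueue(q, ev) != 0)
			bad++;
	}
	for (i = 0; i < n; i++) {
		if (demo_dequeue(q, &ev) != 0 || ev.cookie != DEMO_STAMP + i)
			bad++;
	}
	return bad;
}
```

The point of the design is that the check needs no counters from the
device, so it works the same way whether or not a given implementation
exposes stats.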

Jerin



> 
> /Bruce


[dpdk-dev] [PATCH 0/4] libeventdev API and northbound implementation

2016-11-22 Thread Thomas Monjalon
2016-11-21 09:57, Bruce Richardson:
> On Mon, Nov 21, 2016 at 10:40:50AM +0100, Thomas Monjalon wrote:
> > Are you asking for a temporary tree?
> > If yes, please tell its name and its committers, it will be done.
> 
> Yes, we are asking for a new tree, but I would not assume it is
> temporary - it might be, but it also might not be, given how other
> threads are discussing having an increasing number of subtrees giving
> pull requests. :-)
> 
> Name: dpdk-eventdev-next

Named dpdk-next-eventdev for consistency.

> Committers: Bruce Richardson & Jerin Jacob

Access granted. Jerin could you send me a public SSH key please?


[dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs

2016-11-22 Thread Jerin Jacob
On Tue, Nov 22, 2016 at 12:43:58AM +0530, Jerin Jacob wrote:
> On Mon, Nov 21, 2016 at 05:45:51PM +, Eads, Gage wrote:
> > Hi Jerin,
> > 
> > I did a quick review and overall this implementation looks good. I noticed 
> > just one issue in rte_event_queue_setup(): the check of 
> > nb_atomic_order_sequences is being applied to atomic-type queues, but that 
> > field applies to ordered-type queues.
> 
> Thanks Gage. I will fix that in v2.
> 
> > 
> > One open issue I noticed is the "typical workflow" description starting in 
> > rte_eventdev.h:204 conflicts with the centralized software PMD that Harry 
> > posted last week. Specifically, that PMD expects a single core to call the 
> > schedule function. We could extend the documentation to account for this 
> > alternative style of scheduler invocation, or discuss ways to make the 
> > software PMD work with the documented workflow. I prefer the former, but 
> > either way I think we ought to expose the scheduler's expected usage to the 
> > user -- perhaps through an RTE_EVENT_DEV_CAP flag?
> 
> I prefer the former too; you can propose the documentation change required for
> the software PMD.
> 
> On the same note, if the software PMD based workflow needs separate core(s) for
> the schedule function, can we hide that from the API specification and pass an
> argument to the SW PMD to define the scheduling core(s)?
> 
> Something like --vdev=eventsw0,schedule_cmask=0x2

Just a thought,

Perhaps we could introduce a generic "service" cores concept to DPDK to hide the
requirement where the implementation needs a dedicated core to do certain
work. I guess it would be useful for other NPU integrations in DPDK.

> 
> > 
> > Thanks,
> > Gage
> > 
> > >  -Original Message-
> > >  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > >  Sent: Thursday, November 17, 2016 11:45 PM
> > >  To: dev at dpdk.org
> > >  Cc: Richardson, Bruce ; Van Haaren, Harry
> > >  ; hemant.agrawal at nxp.com; Eads, Gage
> > >  ; Jerin Jacob 
> > >  Subject: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
> > >  
> > >  This patch set defines the southbound driver interface
> > >  and implements the common code required for northbound
> > >  eventdev API interface.
> > >  
> > >  Signed-off-by: Jerin Jacob 
> > >  ---
> > >   config/common_base   |6 +
> > >   lib/Makefile |1 +
> > >   lib/librte_eal/common/include/rte_log.h  |1 +
> > >   lib/librte_eventdev/Makefile |   57 ++
> > >   lib/librte_eventdev/rte_eventdev.c   | 1211
> > >  ++
> > >   lib/librte_eventdev/rte_eventdev_pmd.h   |  504 +++
> > >   lib/librte_eventdev/rte_eventdev_version.map |   39 +
> > >   mk/rte.app.mk|1 +
> > >   8 files changed, 1820 insertions(+)
> > >   create mode 100644 lib/librte_eventdev/Makefile
> > >   create mode 100644 lib/librte_eventdev/rte_eventdev.c
> > >   create mode 100644 lib/librte_eventdev/rte_eventdev_pmd.h
> > >   create mode 100644 lib/librte_eventdev/rte_eventdev_version.map
> > >  
> > >  diff --git a/config/common_base b/config/common_base
> > >  index 4bff83a..7a8814e 100644
> > >  --- a/config/common_base
> > >  +++ b/config/common_base
> > >  @@ -411,6 +411,12 @@ CONFIG_RTE_LIBRTE_PMD_ZUC_DEBUG=n
> > >   CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> > >  
> > >   #
> > >  +# Compile generic event device library
> > >  +#
> > >  +CONFIG_RTE_LIBRTE_EVENTDEV=y
> > >  +CONFIG_RTE_LIBRTE_EVENTDEV_DEBUG=n
> > >  +CONFIG_RTE_EVENT_MAX_DEVS=16
> > >  +CONFIG_RTE_EVENT_MAX_QUEUES_PER_DEV=64
> > >   # Compile librte_ring
> > >   #
> > >   CONFIG_RTE_LIBRTE_RING=y
> > >  diff --git a/lib/Makefile b/lib/Makefile
> > >  index 990f23a..1a067bf 100644
> > >  --- a/lib/Makefile
> > >  +++ b/lib/Makefile
> > >  @@ -41,6 +41,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_CFGFILE) += librte_cfgfile
> > >   DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += librte_cmdline
> > >   DIRS-$(CONFIG_RTE_LIBRTE_ETHER) += librte_ether
> > >   DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += librte_cryptodev
> > >  +DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += librte_eventdev
> > >   DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> > >   DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> > >   DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
> > >  diff --git a/lib/librte_eal/common/include/rte_log.h
> > >  b/lib/librte_eal/common/include/rte_log.h
> > >  index 29f7d19..9a07d92 100644
> > >  --- a/lib/librte_eal/common/include/rte_log.h
> > >  +++ b/lib/librte_eal/common/include/rte_log.h
> > >  @@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
> > >   #define RTE_LOGTYPE_PIPELINE 0x8000 /**< Log related to pipeline. */
> > >   #define RTE_LOGTYPE_MBUF0x0001 /**< Log related to mbuf. */
> > >   #define RTE_LOGTYPE_CRYPTODEV 0x0002 /**< Log related to
> > >  cryptodev. */
> > >  +#define RTE_LOGTYPE_EVENTDEV 0x0004 /**< Log related to eventdev.
> > >  */
> > >  
> > >   /* 

[dpdk-dev] [PATCH] mk: remove make target for examples

2016-11-22 Thread Thomas Monjalon
The command
  make examples
works only if target directories have the exact name of configs.

It is more flexible to use
  make -C examples RTE_SDK=$(pwd) RTE_TARGET=build

Signed-off-by: Thomas Monjalon 
---
 mk/rte.sdkexamples.mk | 77 ---
 mk/rte.sdkroot.mk |  4 ---
 2 files changed, 81 deletions(-)
 delete mode 100644 mk/rte.sdkexamples.mk

diff --git a/mk/rte.sdkexamples.mk b/mk/rte.sdkexamples.mk
deleted file mode 100644
index 111ce91..000
--- a/mk/rte.sdkexamples.mk
+++ /dev/null
@@ -1,77 +0,0 @@
-#   BSD LICENSE
-#
-#   Copyright(c) 2014 6WIND S.A.
-#
-#   Redistribution and use in source and binary forms, with or without
-#   modification, are permitted provided that the following conditions
-#   are met:
-#
-# * Redistributions of source code must retain the above copyright
-#   notice, this list of conditions and the following disclaimer.
-# * Redistributions in binary form must reproduce the above copyright
-#   notice, this list of conditions and the following disclaimer in
-#   the documentation and/or other materials provided with the
-#   distribution.
-# * Neither the name of 6WIND S.A. nor the names of its
-#   contributors may be used to endorse or promote products derived
-#   from this software without specific prior written permission.
-#
-#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-# examples application are seen as external applications which are
-# not part of SDK.
-BUILDING_RTE_SDK :=
-export BUILDING_RTE_SDK
-
-# Build directory is given with O=
-O ?= $(RTE_SDK)/examples
-
-# Target for which examples should be built.
-T ?= *
-
-# list all available configurations
-EXAMPLES_CONFIGS := $(patsubst $(RTE_SRCDIR)/config/defconfig_%,%,\
-   $(wildcard $(RTE_SRCDIR)/config/defconfig_$(T)))
-EXAMPLES_TARGETS := $(addsuffix _examples,\
-   $(filter-out %~,$(EXAMPLES_CONFIGS)))
-
-.PHONY: examples
-examples: $(EXAMPLES_TARGETS)
-
-%_examples:
-   @echo == Build examples for $*
-   $(Q)if [ ! -d "${RTE_SDK}/${*}" ]; then \
-   echo "Target ${*} does not exist in ${RTE_SDK}/${*}." ; \
-   echo -n "Please install DPDK first (make install) or use another " ; \
-   echo "target argument (T=target)." ; \
-   false ; \
-   else \
-   $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(*); \
-   fi
-
-EXAMPLES_CLEAN_TARGETS := $(addsuffix _examples_clean,\
-   $(filter-out %~,$(EXAMPLES_CONFIGS)))
-
-.PHONY: examples_clean
-examples_clean: $(EXAMPLES_CLEAN_TARGETS)
-
-%_examples_clean:
-   @echo == Clean examples for $*
-   $(Q)if [ ! -d "${RTE_SDK}/${*}" ]; then \
-   echo "Target ${*} does not exist in ${RTE_SDK}/${*}." ; \
-   echo -n "Please install DPDK first (make install) or use another " ; \
-   echo "target argument (T=target)." ; \
-   false ; \
-   else \
-   $(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(*) clean; \
-   fi
diff --git a/mk/rte.sdkroot.mk b/mk/rte.sdkroot.mk
index 04ad523..81233ed 100644
--- a/mk/rte.sdkroot.mk
+++ b/mk/rte.sdkroot.mk
@@ -117,10 +117,6 @@ depdirs depgraph:
 gcov gcovclean:
$(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkgcov.mk $@

-.PHONY: examples examples_clean
-examples examples_clean:
-   $(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkexamples.mk $@
-
 # all other build targets
 %:
$(Q)$(MAKE) -f $(RTE_SDK)/mk/rte.sdkconfig.mk checkconfig
-- 
2.7.0



[dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs

2016-11-22 Thread Jerin Jacob
On Mon, Nov 21, 2016 at 05:45:51PM +, Eads, Gage wrote:
> Hi Jerin,
> 
> I did a quick review and overall this implementation looks good. I noticed 
> just one issue in rte_event_queue_setup(): the check of 
> nb_atomic_order_sequences is being applied to atomic-type queues, but that 
> field applies to ordered-type queues.

Thanks Gage. I will fix that in v2.

> 
> One open issue I noticed is the "typical workflow" description starting in 
> rte_eventdev.h:204 conflicts with the centralized software PMD that Harry 
> posted last week. Specifically, that PMD expects a single core to call the 
> schedule function. We could extend the documentation to account for this 
> alternative style of scheduler invocation, or discuss ways to make the 
> software PMD work with the documented workflow. I prefer the former, but 
> either way I think we ought to expose the scheduler's expected usage to the 
> user -- perhaps through an RTE_EVENT_DEV_CAP flag?

I prefer the former too; you can propose the documentation change required for
the software PMD.

On the same note, if the software PMD based workflow needs separate core(s) for
the schedule function, can we hide that from the API specification and pass an
argument to the SW PMD to define the scheduling core(s)?

Something like --vdev=eventsw0,schedule_cmask=0x2
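
A possible shape for handling such an argument is sketched below. Both
the schedule_cmask key and the helper names are hypothetical (the key is
only the suggestion above, not an existing option), and the lookup is
deliberately naive: it uses a plain substring search rather than a
proper key/value parser.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "schedule_cmask=<hex>" in a comma-separated argument string and
 * store the mask in *mask. Returns 0 on success, -1 if the key is
 * missing or the value is not a valid hexadecimal number.
 */
static int demo_parse_cmask(const char *args, uint64_t *mask)
{
	static const char key[] = "schedule_cmask=";
	const char *p = strstr(args, key);
	unsigned long long v;
	char *end;

	if (p == NULL)
		return -1;
	p += sizeof(key) - 1;   /* skip past the key to the value */
	errno = 0;
	v = strtoull(p, &end, 16);   /* base 16 accepts an optional 0x prefix */
	if (errno != 0 || end == p || (*end != '\0' && *end != ','))
		return -1;
	*mask = v;
	return 0;
}

/* Nonzero if lcore_id is one of the scheduling cores named by mask. */
static int demo_is_sched_core(uint64_t mask, unsigned int lcore_id)
{
	return lcore_id < 64 && ((mask >> lcore_id) & 1);
}
```

For the string "eventsw0,schedule_cmask=0x2" the mask is 0x2, so lcore 1
would be the scheduling core and lcore 0 would not.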

> 
> Thanks,
> Gage
> 
> >  -Original Message-
> >  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> >  Sent: Thursday, November 17, 2016 11:45 PM
> >  To: dev at dpdk.org
> >  Cc: Richardson, Bruce ; Van Haaren, Harry
> >  ; hemant.agrawal at nxp.com; Eads, Gage
> >  ; Jerin Jacob 
> >  Subject: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
> >  
> >  This patch set defines the southbound driver interface
> >  and implements the common code required for northbound
> >  eventdev API interface.
> >  
> >  Signed-off-by: Jerin Jacob 
> >  ---
> >   config/common_base   |6 +
> >   lib/Makefile |1 +
> >   lib/librte_eal/common/include/rte_log.h  |1 +
> >   lib/librte_eventdev/Makefile |   57 ++
> >   lib/librte_eventdev/rte_eventdev.c   | 1211
> >  ++
> >   lib/librte_eventdev/rte_eventdev_pmd.h   |  504 +++
> >   lib/librte_eventdev/rte_eventdev_version.map |   39 +
> >   mk/rte.app.mk|1 +
> >   8 files changed, 1820 insertions(+)
> >   create mode 100644 lib/librte_eventdev/Makefile
> >   create mode 100644 lib/librte_eventdev/rte_eventdev.c
> >   create mode 100644 lib/librte_eventdev/rte_eventdev_pmd.h
> >   create mode 100644 lib/librte_eventdev/rte_eventdev_version.map
> >  
> >  diff --git a/config/common_base b/config/common_base
> >  index 4bff83a..7a8814e 100644
> >  --- a/config/common_base
> >  +++ b/config/common_base
> >  @@ -411,6 +411,12 @@ CONFIG_RTE_LIBRTE_PMD_ZUC_DEBUG=n
> >   CONFIG_RTE_LIBRTE_PMD_NULL_CRYPTO=y
> >  
> >   #
> >  +# Compile generic event device library
> >  +#
> >  +CONFIG_RTE_LIBRTE_EVENTDEV=y
> >  +CONFIG_RTE_LIBRTE_EVENTDEV_DEBUG=n
> >  +CONFIG_RTE_EVENT_MAX_DEVS=16
> >  +CONFIG_RTE_EVENT_MAX_QUEUES_PER_DEV=64
> >   # Compile librte_ring
> >   #
> >   CONFIG_RTE_LIBRTE_RING=y
> >  diff --git a/lib/Makefile b/lib/Makefile
> >  index 990f23a..1a067bf 100644
> >  --- a/lib/Makefile
> >  +++ b/lib/Makefile
> >  @@ -41,6 +41,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_CFGFILE) += librte_cfgfile
> >   DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += librte_cmdline
> >   DIRS-$(CONFIG_RTE_LIBRTE_ETHER) += librte_ether
> >   DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += librte_cryptodev
> >  +DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += librte_eventdev
> >   DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
> >   DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
> >   DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
> >  diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
> >  index 29f7d19..9a07d92 100644
> >  --- a/lib/librte_eal/common/include/rte_log.h
> >  +++ b/lib/librte_eal/common/include/rte_log.h
> >  @@ -79,6 +79,7 @@ extern struct rte_logs rte_logs;
> >   #define RTE_LOGTYPE_PIPELINE 0x8000 /**< Log related to pipeline. */
> >   #define RTE_LOGTYPE_MBUF0x0001 /**< Log related to mbuf. */
> >   #define RTE_LOGTYPE_CRYPTODEV 0x0002 /**< Log related to cryptodev. */
> >  +#define RTE_LOGTYPE_EVENTDEV 0x0004 /**< Log related to eventdev. */
> >  
> >   /* these log types can be used in an application */
> >   #define RTE_LOGTYPE_USER1   0x0100 /**< User-defined log type 1. */
> >  diff --git a/lib/librte_eventdev/Makefile b/lib/librte_eventdev/Makefile
> >  new file mode 100644
> >  index 000..dac0663
> >  --- /dev/null
> >  +++ b/lib/librte_eventdev/Makefile
> >  @@ -0,0 +1,57 @@
> >  +#   BSD LICENSE
> >  +#
> >  +#   Copyright(c) 2016 Cavium networks. All rights reserved.
> >  +#
> >  +#   Redistribution and use in source and binary forms, with 

[dpdk-dev] [PATCH] mk: remove make target for examples

2016-11-22 Thread Ferruh Yigit
On 11/21/2016 11:47 PM, Thomas Monjalon wrote:
> The command
>   make examples
> works only if target directories have the exact name of configs.
> 
> It is more flexible to use
>   make -C examples RTE_SDK=$(pwd) RTE_TARGET=build
> 
> Signed-off-by: Thomas Monjalon 

Instead of removing the examples and examples_clean targets, what do you
think about keeping them as wrappers around the suggested usage, for
backward compatibility?

Something like:
"
BUILDING_RTE_SDK :=
export BUILDING_RTE_SDK

# Build directory is given with O=
O ?= $(RTE_SDK)/examples

# Target for which examples should be built.
T ?= build

.PHONY: examples
examples:
	@echo == Build examples for $(T)
	$(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T)

.PHONY: examples_clean
examples_clean:
	@echo == Clean examples for $(T)
	$(MAKE) -C examples O=$(abspath $(O)) RTE_TARGET=$(T) clean
"