date:20160715

[dpdk-dev] [PATCH] net/fm10k: fix FTAG mode on multi-queue

2016-07-15 Thread Thomas Monjalon

2016-07-15 18:38, Xiao Wang:
> In multi-queue + FTAG use case, we need to turn tx_ftag_en on for all
> the Tx queues.
> 
> Fixes: 7958b1310d5e ("fm10k: enable FTAG based forwarding")
> 
> Reported-by: Ricky Li 
> Signed-off-by: Xiao Wang 

Applied, thanks

[dpdk-dev] [PATCH] net/i40e: revert VLAN filtering fix

2016-07-15 Thread Thomas Monjalon

> > This reverts commit 4761f57d58c6f52543738dbe299f846d62d75895.
> > Introducing VLAN table by adding VLAN adminq command will cause NIC's
> > throughput drop obviously. It's a hardware issue.
> > With this revert, VLAN filtering can only work when promiscuous mode is
> > disabled.
> > 
> > Reverts: 4761f57d58c6 ("net/i40e: fix VLAN filtering in promiscuous mode")
> > 
> > Reported-by: Jeffrey Shaw 
> > Signed-off-by: Jingjing Wu 
> Acked-by : Jing Chen 

Applied, thanks

[dpdk-dev] [PATCH 0/2] Add ptype and xsum handling in i40e rx vpmd

2016-07-15 Thread Thomas Monjalon

2016-07-14 09:59, Jeff Shaw:
> Our testing suggests minimal (in some cases zero) impact to core-bound
> forwarding throughput as measured by testpmd. Throughput increase is
> observed in l3fwd as now the vpmd can be used with hw_ip_checksum
> enabled and without needing '--parse-ptype'.
> 
> The benefits to applications using this functionality is realized when
> Ethernet processing and L3/L4 checksum validation can be skipped.
> 
> We hope others can also test performance in their applications while
> conducting a review of this series.

Thanks for the patches. They need some careful review and are a bit late
for an integration in 16.07. Thus they are pending for 16.11.

[dpdk-dev] [PATCH] net/virtio-user: Fix missing brackets in if condition

2016-07-15 Thread Thomas Monjalon

2016-07-12 20:11, Yuanhan Liu:
> On Tue, Jul 12, 2016 at 11:30:25AM +0200, Maxime Coquelin wrote:
> > The error is reported using test build script:
> > 
> > $ scripts/test-build.sh x86_64-native-linuxapp-gcc
> > ...
> > drivers/net/virtio/virtio_user_ethdev.c: In function 
> > ?virtio_user_pmd_devinit?:
> > drivers/net/virtio/virtio_user_ethdev.c:345:2: error: this ?if? clause does 
> > not guard... [-Werror=misleading-indentation]
> >   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1)
> > ^~
> > 
> > Fixes: 404bd6bfe360 ("net/virtio-user: fix return value not checked")
> > 
> > Cc: Jianfeng Tan 
> > Signed-off-by: Maxime Coquelin 
> 
> Acked-by: Yuanhan Liu 
> 
> Thanks for the fix.

Applied, thanks

[dpdk-dev] [PATCH v4 00/10] Fix build errors related to exported headers

2016-07-15 Thread Bruce Richardson

On Wed, Jul 13, 2016 at 03:02:37PM +0200, Adrien Mazarguil wrote:
> DPDK uses GNU C language extensions in most of its code base. This is fine
> for internal source files whose compilation flags are controlled by DPDK,
> however user applications that use exported "public" headers may experience
> compilation failures when enabling strict error/standard checks (-std and
> -pedantic for instance).
> 
> Exported headers are installed system-wide and must be as clean as possible
> so applications do not have to resort to workarounds.
> 
> This patchset affects exported headers only, compilation problems are
> addressed as follows:
> 
> - Adding the __extension__ keyword to nonstandard constructs (same method
>   as existing libraries when there is no other choice).
> - Adding the __extension__ keyword to C11 constructs to remain compatible
>   with pure C99.
> - Adding missing includes so exported files can be included out of order
>   and on their own.
> - Fixing GNU printf-like variadic macros as there is no magic keyword for
>   these.
> 

Having upgraded to Fedora 24, I'm seeing quite a few errors compiling with gcc
6.1.1 in debug mode. Applying this patchset seems to really cut down on those
errors, so may need to be applied for 16.07 release.

The remaining error I'm seeing is, in mlx drivers, complaints about the
pedantic flag (the flag which I think was causing all the other errors to be
triggered too):

error: `-pedantic' is not an option that controls warnings

For this set though, I don't see any new errors introduced into gcc or clang
builds for the libs or drivers, and a number of errors cleared, so:

Tested-by: Bruce Richardson 

Regards,
/Bruce

[dpdk-dev] [PATCH v2] vhost_user: avoid crash when exeeding file descriptors

2016-07-15 Thread Thomas Monjalon

2016-07-12 16:37, Yuanhan Liu:
> On Wed, Jul 06, 2016 at 02:24:58PM +0200, Christian Ehrhardt wrote:
> > *update in v2*
> > - refreshing for DPDK 16.07
> > - Close fd on vserver->listenfd as suggested in discussion
> > 
> > Original From:
> > From: Patrik Andersson 
> > 
> > Protect against DPDK crash when allocation of listen fd >= 1023.
> > For events on fd:s >1023, the current implementation will trigger
> > an abort due to access outside of allocated bit mask.
> 
> Hmmm, I have no idea why I missed this email in the beginning,
> otherwise, it would have been in rc2 release.
> 
> Thanks for the re-posting, and we should have had it in last
> release.
> 
> Acked-by: Yuanhan Liu 

Applied, thanks

[dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Thomas Monjalon

2016-07-15 20:14, Yuanhan Liu:
> On Fri, Jul 15, 2016 at 02:15:03PM +0300, Ilya Maximets wrote:
> > Version 3:
> > * Patch splitted in two.
> > * Applied workaround from Rich Lane and added comment about
> >   performance issue with some compilers and 'unlikely' macro.
> 
> Thanks a lot for the patches.
> 
> Acked-by: Yuanhan Liu 

Applied, thanks

[dpdk-dev] [RFC] remove vhost-cuse

2016-07-15 Thread Yuanhan Liu

On Fri, Jul 15, 2016 at 11:14:52AM +0200, Thomas Monjalon wrote:
> 2016-07-15 16:42, Yuanhan Liu:
> > You see that no body cares it :)
> > 
> > So I will make a patch to mark vhost-cuse as deprecated shortly.
> > Thomas, works to you?
> 
> Perfect. Thanks Yuanhan

Great! deprecate note has been sent out.

--yliu

[dpdk-dev] [PATCH] doc: deprecate vhost-cuse

2016-07-15 Thread Yuanhan Liu

Vhost-cuse was invented before vhost-user exist. The both are actually
doing the same thing: a vhost-net implementation in user space. But they
are not exactly the same thing.

Firstly, vhost-cuse is harder for use; no one seems to care it, either.
Furthermore, since v2.1, a large majority of development effort has gone
to vhost-user. For example, we extended the vhost-user spec to add the
multiple queue support. We also added the vhost-user live migration at
v16.04 and the latest one, vhost-user reconnect that allows vhost app
restart without restarting the guest. Both of them are very important
features for product usage and none of them works for vhost-cuse.

You now see that the difference between vhost-user and vhost-cuse is
big (and will be bigger and bigger as time moves forward), that you
should never use vhost-cuse, that we should drop it completely.

The remove would also result to a much cleaner code base, allowing us
to do all kinds of extending easier.

So here to mark vhost-cuse as deprecated in this release and will be
removed in the next release (v16.11).

Signed-off-by: Yuanhan Liu 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index f502f86..ee99558 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -41,3 +41,7 @@ Deprecation Notices
 * The mempool functions for single/multi producer/consumer are deprecated and
   will be removed in 16.11.
   It is replaced by rte_mempool_generic_get/put functions.
+
+* The vhost-cuse will be removed in 16.11. Since v2.1, a large majority of
+  development effort has gone to vhost-user, such as multiple-queue, live
+  migration, reconnect etc. Therefore, vhost-user should be used instead.
-- 
1.9.0

[dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Yuanhan Liu

On Fri, Jul 15, 2016 at 02:15:03PM +0300, Ilya Maximets wrote:
> Version 3:
>   * Patch splitted in two.
>   * Applied workaround from Rich Lane and added comment about
> performance issue with some compilers and 'unlikely' macro.

Thanks a lot for the patches.

Acked-by: Yuanhan Liu 

--yliu
> 
> Version 2:
>   * Rebased on top of current master.
>   * host's address now checked in meargeable case,
> because needed refactoring already done.
>   * Commit-message changed because old issue with
> virtio reload accidentially fixed by commit
> 0823c1cb0a73.
> 
> Ilya Maximets (2):
>   vhost: fix using of bad return value on mergeable enqueue
>   vhost: do sanity check for ring descriptor address
> 
>  lib/librte_vhost/vhost_rxtx.c | 33 ++---
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> -- 
> 2.7.4

[dpdk-dev] virtio PMD issue

2016-07-15 Thread Harish Patil

>
>it is fixed in 16.07-rc2 that I checked

Great, thanks.

>
>Vincent
>
>On Fri, Jul 15, 2016 at 11:48 AM, Harish Patil 
>wrote:
>> Hi Huawie/Yuanhan,
>> I encounter segfault issue in virtio PMD driver when I run any DPDK
>> application in the VM. The virtio devices should not have been probed in
>> the first place since they are not attached to igb_uio driver (and
>>managed
>> by kernel driver).
>>
>>
>> Network devices using DPDK-compatible driver
>> 
>> :00:09.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio
>>unused=
>> :00:0a.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio
>>unused=
>>
>> Network devices using kernel driver
>> ===
>> :00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio
>>
>>
>>
>> #0 0x004e5392 in vtpci_init (dev=0xb61970, hw=0x7fff76146600)
>> at /home2/hpatil/dpdk/drivers/net/virtio/virtio_pci.c:653
>> #1 0x004effc4 in eth_virtio_dev_init (eth_dev=0xacb7a0
>> )
>> at /home2/hpatil/dpdk/drivers/net/virtio/virtio_ethdev.c:1060
>> #2 0x004b730f in rte_eth_dev_init ()
>> #3 0x004cb940 in pci_probe_all_drivers ()
>> #4 0x004cbe88 in rte_eal_pci_probe ()
>> #5 0x004c0336 in rte_eal_init ()
>> #6 0x004404e2 in main ()
>> (gdb) p dev->devargs
>> $1 = (struct rte_devargs *) 0x0
>>
>>
>> Looks like its a known issue based on a old thread:
>> http://dpdk.org/dev/patchwork/patch/12462/
>>
>> Do we know a fix now?
>>
>> Thanks,
>> harish
>>
>>
>>
>>
>

[dpdk-dev] [PATCH v2] vfio: fix pci_vfio_map_resource

2016-07-15 Thread Thomas Monjalon

2016-07-15 18:42, Thomas Monjalon:
> 2016-07-15 17:32, Thomas Monjalon:
> > 2016-07-14 17:15, Yong Wang:
> > > - void *second_addr = RTE_PTR_ADD(bar_addr, 
> > > memreg[1].offset);
> > > + void *second_addr = RTE_PTR_ADD(bar_addr,
> > > + 
> > > memreg[1].offset - reg.offset);
> > 
> > There is an error for 32-bit:
> > error: cast to pointer from integer of different size
> > note: in expansion of macro ?RTE_PTR_ADD?
> 
> It can fixed like this:
> -memreg[1].offset - reg.offset);
> +memreg[1].offset -
> +(uintptr_t)reg.offset);
> 

Applied with above change, thanks

[dpdk-dev] virtio PMD issue

2016-07-15 Thread Harish Patil

Hi Huawie/Yuanhan,
I encounter segfault issue in virtio PMD driver when I run any DPDK
application in the VM. The virtio devices should not have been probed in
the first place since they are not attached to igb_uio driver (and managed
by kernel driver).


Network devices using DPDK-compatible driver

:00:09.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio unused=
:00:0a.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio unused=

Network devices using kernel driver
===
:00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio



#0 0x004e5392 in vtpci_init (dev=0xb61970, hw=0x7fff76146600)
at /home2/hpatil/dpdk/drivers/net/virtio/virtio_pci.c:653
#1 0x004effc4 in eth_virtio_dev_init (eth_dev=0xacb7a0
)
at /home2/hpatil/dpdk/drivers/net/virtio/virtio_ethdev.c:1060
#2 0x004b730f in rte_eth_dev_init ()
#3 0x004cb940 in pci_probe_all_drivers ()
#4 0x004cbe88 in rte_eal_pci_probe ()
#5 0x004c0336 in rte_eal_init ()
#6 0x004404e2 in main ()
(gdb) p dev->devargs
$1 = (struct rte_devargs *) 0x0


Looks like its a known issue based on a old thread:
http://dpdk.org/dev/patchwork/patch/12462/

Do we know a fix now?

Thanks,
harish

[dpdk-dev] [PATCH v6 12/17] pci: add a helper for device name

2016-07-15 Thread Shreyansh Jain

On Friday 15 July 2016 03:09 PM, Shreyansh jain wrote:
> On Thursday 14 July 2016 10:25 PM, Jan Viktorin wrote:
>> On Tue, 12 Jul 2016 11:31:17 +0530
>> Shreyansh Jain  wrote:
>>
>>> eal is a better place than crypto / ethdev for naming resources.
>>
>> s/for naming/to name/
> 
> OK.
> 
>>
>> What is meant by "resources" here?
> 
> This has historic context (from earlier version of this patch). 
> But I could relate the word 'resources' to EAL representation of devices - 
> whether PCI or Crypto.
> Or, Resource == Device.
> 

If I go by what Thomas is proposing for meaning of 'resource' [1], and the fact 
that all methods in this patchset refer to 'devices', I will change the patch 
context to 'EAL is a better place than crypto / ethdev to name devices'.

[1] http://dpdk.org/ml/archives/dev/2016-July/044056.html

>>
>>> Add a helper in eal and make use of it in crypto / ethdev.
>>>
>>> Signed-off-by: David Marchand 
>>> Signed-off-by: Shreyansh Jain 
>>> ---
>>>  lib/librte_cryptodev/rte_cryptodev.c| 27 ---
>>>  lib/librte_eal/common/include/rte_pci.h | 25 +
>>>  lib/librte_ether/rte_ethdev.c   | 24 
>>>  3 files changed, 33 insertions(+), 43 deletions(-)
>>>
>>> diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
>>> b/lib/librte_cryptodev/rte_cryptodev.c
>>> index d7be111..60c6384 100644
>>> --- a/lib/librte_cryptodev/rte_cryptodev.c
>>> +++ b/lib/librte_cryptodev/rte_cryptodev.c
>>> @@ -367,23 +367,6 @@ rte_cryptodev_pmd_allocate(const char *name, int 
>>> socket_id)
>>> return cryptodev;
>>>  }
>>>  
>>> -static inline int
>>> -rte_cryptodev_create_unique_device_name(char *name, size_t size,
>>> -   struct rte_pci_device *pci_dev)
>>> -{
>>> -   int ret;
>>> -
>>> -   if ((name == NULL) || (pci_dev == NULL))
>>> -   return -EINVAL;
>>> -
>>> -   ret = snprintf(name, size, "%d:%d.%d",
>>> -   pci_dev->addr.bus, pci_dev->addr.devid,
>>> -   pci_dev->addr.function);
>>> -   if (ret < 0)
>>> -   return ret;
>>> -   return 0;
>>> -}
>>> -
>>>  int
>>>  rte_cryptodev_pmd_release_device(struct rte_cryptodev *cryptodev)
>>>  {
>>> @@ -446,9 +429,8 @@ rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
>>> if (cryptodrv == NULL)
>>> return -ENODEV;
>>>  
>>> -   /* Create unique Crypto device name using PCI address */
>>> -   rte_cryptodev_create_unique_device_name(cryptodev_name,
>>> -   sizeof(cryptodev_name), pci_dev);
>>> +   rte_eal_pci_device_name(_dev->addr, cryptodev_name,
>>> +   sizeof(cryptodev_name));
>>>  
>>> cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, rte_socket_id());
>>> if (cryptodev == NULL)
>>> @@ -503,9 +485,8 @@ rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
>>> if (pci_dev == NULL)
>>> return -EINVAL;
>>>  
>>> -   /* Create unique device name using PCI address */
>>> -   rte_cryptodev_create_unique_device_name(cryptodev_name,
>>> -   sizeof(cryptodev_name), pci_dev);
>>> +   rte_eal_pci_device_name(_dev->addr, cryptodev_name,
>>> +   sizeof(cryptodev_name));
>>>  
>>> cryptodev = rte_cryptodev_pmd_get_named_dev(cryptodev_name);
>>> if (cryptodev == NULL)
>>> diff --git a/lib/librte_eal/common/include/rte_pci.h 
>>> b/lib/librte_eal/common/include/rte_pci.h
>>> index 3027adf..06508fa 100644
>>> --- a/lib/librte_eal/common/include/rte_pci.h
>>> +++ b/lib/librte_eal/common/include/rte_pci.h
>>> @@ -82,6 +82,7 @@ extern "C" {
>>>  #include 
>>>  #include 
>>>  
>>> +#include 
>>>  #include 
>>>  
>>>  TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked 
>>> Q. */
>>> @@ -95,6 +96,7 @@ const char *pci_get_sysfs_path(void);
>>>  
>>>  /** Formatting string for PCI device identifier: Ex: :00:01.0 */
>>>  #define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
>>> +#define PCI_PRI_STR_SIZE sizeof(":XX:XX.X")
>>>  
>>>  /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
>>>  #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
>>> @@ -308,6 +310,29 @@ eal_parse_pci_DomBDF(const char *input, struct 
>>> rte_pci_addr *dev_addr)
>>>  }
>>>  #undef GET_PCIADDR_FIELD
>>>  
>>> +/**
>>> + * Utility function to write a pci device name, this device name can later 
>>> be
>>> + * used to retrieve the corresponding rte_pci_addr using above functions.
>>
>> What about saying "using functions eal_parse_pci_*BDF"? The
>> specification "above" is quite uncertain...
> 
> Agree that 'above' is positional word and should be avoided.
> I will change that to "... using eal_parse_pci_* BDF helpers". OK?
> 
>>
>>> + *
>>> + * @param addr
>>> + * The PCI Bus-Device-Function address
>>> + * @param output
>>> + * The output buffer string
>>> + * @param size
>>> + * The output buffer size
>>> + * @return
>>> + *  0 on success, negative on error.
>>> + */
>>> +static

[dpdk-dev] [PATCH v2] vfio: fix pci_vfio_map_resource

2016-07-15 Thread Thomas Monjalon

2016-07-15 17:32, Thomas Monjalon:
> 2016-07-14 17:15, Yong Wang:
> > -   void *second_addr = RTE_PTR_ADD(bar_addr, 
> > memreg[1].offset);
> > +   void *second_addr = RTE_PTR_ADD(bar_addr,
> > +   
> > memreg[1].offset - reg.offset);
> 
> There is an error for 32-bit:
>   error: cast to pointer from integer of different size
>   note: in expansion of macro ?RTE_PTR_ADD?

It can fixed like this:
-memreg[1].offset - reg.offset);
+memreg[1].offset -
+(uintptr_t)reg.offset);

[dpdk-dev] [PATCH] net/fm10k: fix FTAG mode on multi-queue

2016-07-15 Thread Xiao Wang

In multi-queue + FTAG use case, we need to turn tx_ftag_en on for all
the Tx queues.

Fixes: 7958b1310d5e ("fm10k: enable FTAG based forwarding")

Reported-by: Ricky Li 
Signed-off-by: Xiao Wang 
---
 drivers/net/fm10k/fm10k_ethdev.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 217853f..144b2de 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2738,10 +2738,8 @@ fm10k_set_tx_function(struct rte_eth_dev *dev)
txq = dev->data->tx_queues[i];
txq->tx_ftag_en = tx_ftag_en;
/* Check if Vector Tx is satisfied */
-   if (fm10k_tx_vec_condition_check(txq)) {
+   if (fm10k_tx_vec_condition_check(txq))
use_sse = 0;
-   break;
-   }
}

if (use_sse) {
-- 
1.9.3

[dpdk-dev] [PATCH v1 00/15] rte_driver/device infrastructure

2016-07-15 Thread Jan Viktorin

On Fri, 15 Jul 2016 15:19:14 +0200
Thomas Monjalon  wrote:

> 2016-07-08 21:09, Jan Viktorin:
> > Hello,
> > 
> > based on the discussions with Shreyansh, I propose a patchset with
> > the important EAL changes. It is incomplete and I suppose to extend
> > and change certain things in the foreseeable future.
> > 
> > Important notes:
> > 
> > * pmd_type is removed
> > * introduced rte_vdev_driver inheriting rte_driver
> > * PMD_REGISTER_DRIVER is replaced by RTE_EAL_VDRV_REGISTER
> > * rte_driver/device integrated into rte_pci_driver/device
> > * all drivers and devices are in 2 lists - general and bus-specific
> > 
> > Shreyansh, I hope I do not duplicate your work. I tried to avoid touching
> > pmd_type but it quite complicated... There is also an initial generalization
> > of rte_pci_resource. More such generalizations are to be done.
> > 
> > The init/uninit functions cannot be generalized easily, I think. Both PCI
> > and VDEV have different requirements.
> > 
> > No idea about hotplug...  
> 
> Please could you give a clear overview of how you split the work in
> your respective series?

Yes, it's a bit messy... My quick summary follows.

> 
> I take the opportunity to put my notes about some initial targets of
> this refactoring:
> 
> module/drivers
>   attached to 1 bus: pci_driver or vdev_driver

pci_driver is done by Shreyansh/David
vdev_driver is done by myself

>   rte_device = device resources / embedded in pci/vdev_driver

Extraction of rte_device is done by myself

>   attached to n device interfaces (ethdev, crypto)
>   1 device resource -> n device interfaces

I am not sure, what those two points means exactly.

>   hotplug resource -> lookup drivers list
>   per-bus lists or 1 list?

Not sure. At least partially done by Shreyansh/David.

>   devinit/devuninit generalized and moved to rte_driver
>   -> rename to probe/remove  

Some renames done by Shreyansh/David.

There will be no move. The vdev_driver has different kind of init/uninit
then PCI. So I cannot see a simple way how to generalize this.

I'll do the final init->probe, uninit->remove renames.

>   crypto.dev_type could be dropped

I am not sure about this.

>   drv_flags should be moved to rte_driver

I'll do.

>   intr_init can be moved earlier in init, before affinity set
> devices

Done by Shreyansh/David.

>   unique_device_name -> standard naming?

Done for PCI: rte_eal_pci_device_name by Shreyansh/David.

>   difference with port_id ? -> device id for ethdev

Not sure.

>   should be unique_resource_name -> 1 EAL resource may match 
> several devices

Not sure.

>   ethdev manage an interface, eal manage a hardware resource, device 
> object in between?

Not sure about the question.

>   need for bus object?

I don't think so at this stage.

>   no need of driver object, module object?

The current rte_driver will not exist. Same for any kind of module.

>   devargs, intr_handle, numa_node should be moved to rte_device

Yes, done by myself.

> hotplug notification to do
>   notify free-able ressource?
>   remove blacklist at EAL level and let application handle it
>   devargs still in hotplug function, must be moved in separate API
> devargs
>   new API
>   new command line parameters
> 



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v2] vfio: fix pci_vfio_map_resource

2016-07-15 Thread Thomas Monjalon

2016-07-14 17:15, Yong Wang:
> - void *second_addr = RTE_PTR_ADD(bar_addr, 
> memreg[1].offset);
> + void *second_addr = RTE_PTR_ADD(bar_addr,
> + 
> memreg[1].offset - reg.offset);

There is an error for 32-bit:
error: cast to pointer from integer of different size
note: in expansion of macro ?RTE_PTR_ADD?

[dpdk-dev] [PATCH 2/2] app/test: filter out unavailable tests

2016-07-15 Thread Thomas Monjalon

> > Some tests can fail to run because they are not compiled.
> > It has been more visible recently when the PCI test became disabled
> > in the default configuration because of dependency on libarchive:
> > PCI autotest:Fail [Not found]
> >
> > The autotest script catch them and do not count them as an error anymore:
> > PCI autotest:Skipped [Not Available]
> >
> > The commands dump_ring and dump_mempool are removed from the autotest list
> > because they are some internal commands but not some registered tests.
> > Thus they cannot be parsed as available test commands.
> >
> > Signed-off-by: Thomas Monjalon 
> 
> Suggested-by: David Marchand 

Series applied

[dpdk-dev] [PATCH] mk: fix default rule of test subdirectory

2016-07-15 Thread Thomas Monjalon

> > When using "make -C app/test" (with RTE_SDK/RTE_TARGET adjusted) without
> > specifying the rule (all), the build is not done.
> > Indeed the default rule is not "all" anymore since there are some rules 
> > added for
> > external resources link.
> > 
> > It is fixed by adding a reference to "all" at the top of the file which 
> > makes it the
> > default rule.
> > 
> > Note that make app/test_sub (without environment variable) is preffered.
> > 
> > Fixes: ab64f5df8004 ("app/test: support resources externally linked")
> > 
> > Reported-by: Reshma Pattan 
> > Signed-off-by: Thomas Monjalon 
> 
> Acked-by: Reshma Pattan 

Applied

[dpdk-dev] [PATCH 2/2] mk: fix dependency on toolchain libraries

2016-07-15 Thread Thomas Monjalon

2016-07-13 09:42, Thomas Monjalon:
> The -l options specifying libraries to link with are in LDLIBS.
> But it can happen to have some libraries in other variables.
> In case of a low level dependency specified in some environments
> via EXTRA_LDFLAGS, there can be an unresolved issue due to a
> wrong linking order. Indeed the libraries must be specified from
> the higher level (dependency consumers) to the lower level (dependencies).
> 
> It is fixed by moving LDLIBS before LDFLAGS variables in the link
> command line.
> 
> Signed-off-by: Thomas Monjalon 
> Tested-by: Raslan Darawsheh 

Series applied

[dpdk-dev] [PATCH] scripts: fix libnuma dependency in build test

2016-07-15 Thread Thomas Monjalon

2016-07-12 17:55, Thomas Monjalon:
> The option CONFIG_RTE_LIBRTE_VHOST_NUMA depends on availability of
> libnuma in the system.
> The configuration option DPDK_DEP_NUMA can be set if available for
> the DPDK_TARGET being built.
> 
> Fixes: cd31ca579c0d ("scripts: add build tests")
> 
> Signed-off-by: Thomas Monjalon 

Applied

[dpdk-dev] [PATCH] scripts: remove old build option

2016-07-15 Thread Thomas Monjalon

2016-07-12 17:56, Thomas Monjalon:
> The config option CONFIG_RTE_PCI_CONFIG does not exist anymore.
> 
> Fixes: 7d619406f31d ("pci: remove deprecated specific config")
> 
> Signed-off-by: Thomas Monjalon 

Applied

[dpdk-dev] [PATCH] scripts: print failed directory when testing build

2016-07-15 Thread Thomas Monjalon

2016-07-15 15:55, Thomas Monjalon:
> The script test-build.sh can be used to test building several
> targets with different configurations. The directory name reflects
> the target and the customized configuration.
> When there is a failure, it is convenient to print this build
> directory to quickly know which case is failing without scrolling
> the build log history.
> 
> Signed-off-by: Thomas Monjalon 

Applied

[dpdk-dev] [PATCH] mempool: fix empty structure definition

2016-07-15 Thread Thomas Monjalon

> > This commit addresses the following warning reported by clang, which
> > happens by default, as long as CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is disabled:
> >
> >   warning: empty struct has size 0 in C, size 1 in C++
> >
> > C and C++ must use the same size for objects to avoid corruption during run
> > time.
> >
> > Fixes: 97e7e685bfcd ("mempool: add structure for object trailers")
> >
> > Signed-off-by: Adrien Mazarguil 
> 
> Acked-by: Olivier Matz 

Applied, thanks

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Thomas Monjalon

2016-07-15 16:37, Thomas Monjalon:
> I will apply it with trivial changes suggested by Jan and
> the small needed changes that I describe below:
> 
> 2016-07-14 20:03, Jan Viktorin:
> > On Thu, 14 Jul 2016 15:27:29 +0200
> > damarion at cisco.com wrote:
> > > --- /dev/null
> > > +++ b/lib/librte_eal/common/arch/x86/rte_spinlock.c
> [...]
> > > +uint8_t rte_rtm_supported; /* cache the flag to avoid the overhead
> > > +   of the rte_cpu_get_flag_enabled function */
> 
> This variable must be exported in the .map file.
> 
> > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > @@ -106,6 +106,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += 
> > > rte_keepalive.c
> > >  
> > >  # from arch dir
> > >  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_cpuflags.c
> > > +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_spinlock.c
> > 
> > This is not good, you provide rte_spinlock.c only for x86. Building
> > for any other arch would fail to find this file.
> 
> This change do the trick:
> 
> -SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_spinlock.c
> +SRCS-$(CONFIG_RTE_ARCH_X86) += rte_spinlock.c
> 
> (Note that CONFIG_RTE_EXEC_ENV_LINUXAPP check is not needed inside linuxapp 
> EAL)
> 
> > Moreover, the bsdapp/eal/Makefile should reflect this situation as
> > well.
> 
> Yes

Applied with above changes, thanks

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-15 Thread Adrien Mazarguil

On Fri, Jul 15, 2016 at 09:23:26AM +, Chandran, Sugesh wrote:
> Thank you Adrien,
> Please find below for some more comments/inputs
> 
> Let me know your thoughts on this.

Thanks, stripping again non relevant parts.

[...]
> > > > > [Sugesh] Is it a limitation to use only 32 bit ID? Is it possible
> > > > > to have a
> > > > > 64 bit ID? So that application can use the control plane flow
> > > > > pointer Itself as an ID. Does it make sense?
> > > >
> > > > I've specified a 32 bit ID for now because this is what FDIR
> > > > supports and also what existing devices can report today AFAIK (i40e and
> > mlx5).
> > > >
> > > > We could use 64 bit for future-proofness in a separate action like 
> > > > "ID64"
> > > > when at least one device supports it.
> > > >
> > > > To PMD maintainers: please comment if you know devices that support
> > > > tagging matching packets with more than 32 bits of user-provided
> > > > data!
> > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet 
> > > says so.
> > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with rss
> > > hash. This can be a software driver limitation that expose only 32
> > > bit. Possibly because of cache alignment issues? Since the hardware
> > > can support 64 bit, I feel it make sense to support 64 bit as well.
> > 
> > I agree we need 64 bit support, but then we also need a solution for devices
> > that support only 32 bit. Possible methods I can think of:
> > 
> > - A separate "ID64" action (or a "ID32" one, perhaps with a better name).
> > 
> > - A single ID action with an unlimited number of bytes to return with
> >   packets (would actually be a string). PMDs can then refuse to create flow
> >   rules requesting an unsupported number of bytes. Devices supporting
> > fewer
> >   than 32 bits are also included this way without the need for yet another
> >   action.
> > 
> > Thoughts?
> [Sugesh] I feel the single ID approach is much better. But I would say a 
> fixed size ID
> is easy to handle at upper layers. Say PMD returns 64bit ID in which MSBs 
> are masked out, based on how many bits the hardware can support. 
> PMD can refuse the unsupported number of bytes when requested. So the size
> of ID going to be a parameter to program the flow.
> What do you think?

What you suggest if I am not mistaken is:

 struct rte_flow_action_id {
 uint64_t id;
 uint64_t mask; /* either a bit-mask or a prefix/suffix length? */
 };

I think in this case a mask is more versatile than a prefix/suffix length as
the value itself comes in an unknown endian (from PMD's POV). It also allows
specific bits to be taken into account, like when HW only supports 32 bit,
with some black magic the full original 64 bit value can be restored as long
as the application only cares about at most 32 bits anywhere.

However I do not think many applications "won't care" about specific bits in
a given value and having to provide a properly crafted mask will be a
hassle, they will just fill it with ones and hope for the best. As a result
they won't take advantage of this feature or will stick to 32 bits all the
time, or whatever happens to be the least common denominator.

My previous suggestion was:

 struct rte_flow_action_id {
 uint8_t size; /* number of bytes in id[] */
 uint8_t id[];
 };

It does not solve the issue if an application requests more bytes than
supported, however as a string, there is no endianness ambiguity and these
bytes are copied as-is to the related mbuf field as if done through memcpy()
possibly with some padding to fill the entire 64 bit field (copied bytes
thus starting from MSB for big-endian machines, LSB for little-endian
ones). The value itself remains opaque to the PMD.

One issue is the flexible array approach makes static initialization more
complicated. Maybe it is not worth the trouble since according to Andrey,
even X710 reports at most 32 bits of user data.

So what should we do? Fixed 32 bits ID for now to keep things simple, then
another action for 64 bits later when necessary?

> > [...]
> > > > > [Sugesh] Another concern is the cost and time of installing these
> > > > > rules in the hardware. Can we make these APIs time bound(or at
> > > > > least an option
> > > > to
> > > > > set the time limit to execute these APIs), so that Application
> > > > > doesn?t have to wait so long when installing and deleting flows
> > > > with
> > > > > slow hardware/NIC. What do you think? Most of the datapath flow
> > > > installations are
> > > > > dynamic and triggered only when there is an ingress traffic. Delay
> > > > > in flow insertion/deletion have unpredictable
> > > > consequences.
> > > >
> > > > This API is (currently) aimed at the control path only, and must
> > > > indeed be assumed to be slow. Creating million of rules may take
> > > > quite long as it may involve syscalls and other time-consuming
> > > > synchronization things on the PMD side.
> > > >
> > > > So currently there is no plan

[dpdk-dev] [PATCH 02/10] bnx2x: Remove unused preprocessor symbols and code

2016-07-15 Thread Bruce Richardson

On Tue, Jul 12, 2016 at 09:38:06AM -0400, Chas Williams wrote:
> ELINK_INCLUDE_EMUL and ELINK_INCLUDE_FPGA are never defined.  Remove them
> along with enumeration constants dependent on their inclusion.
> 
> Fixes: 540a211084a7 ("bnx2x: driver core")
> 
> Signed-off-by: Chas Williams <3chas3 at gmail.com>

Hi Chas,

the threading on this submission is very awkward - poor patch 1 got somehow
separated from the rest of it's patch family. :-)

Besides this, given where we are in the release cycle for 16.07, these cleanups
are being deferred for possible inclusion in 16.11 (once reviewed and acked,
obviously). To get cleanup like this in in future release, please submit by
the integration deadline date.

BTW: If there are any critical bug fixes in this set, please
sent them as separate patches clearing calling them out as fixes, so they
can get the attention and review they deserve.

Thanks,
/Bruce

[dpdk-dev] [RFC] remove vhost-cuse

2016-07-15 Thread Yuanhan Liu

You see that no body cares it :)

So I will make a patch to mark vhost-cuse as deprecated shortly.
Thomas, works to you?

--yliu

On Mon, Jul 11, 2016 at 11:59:55AM +0800, Yuanhan Liu wrote:
> It's something echoed around in my mind for a long while, and here I'm
> gonna make it public: a proposal to remove vhost-cuse.
> 
> Vhost-cuse was invented before vhost-user exist. The both are actually
> doing the same thing: a vhost-net implementation in user space. But they
> are not exactly the same thing.
> 
> Firstly, vhost-cuse is harder for use; no one seems to care it, either.
> Furthermore, since v2.1, a large majority of development effort has gone
> to vhost-user. For example, we extended the vhost-user spec to add the
> multiple queue support. We also added the vhost-user live migration at
> v16.04 and the latest one, vhost-user reconnect that allows vhost app
> restart without restarting the guest. Both of them are very important
> features for product usage and none of them works for vhost-cuse.
> 
> You now see that the difference between vhost-user and vhost-cuse is
> big (and will be bigger and bigger as time moves forward), that you
> should never use vhost-cuse, that we should drop it completely.
> 
> The remove would also result to a much cleaner code base, allowing us
> to do all kinds of extending easier.
> 
> A talk with Huawei offline showed that he backs this proposal. I was
> also told by Ciara that she actually had the same idea: she has already
> cooked a patch to remove vhost-cuse support from OVS:
> 
> http://openvswitch.org/pipermail/dev/2016-July/074696.html
> 
> So I'm proposing to mark vhost-cuse as deprecated in this release and
> remove it completely at the next release (v16.11).
> 
> Comments/thoughts, or objections?
> 
>   --yliu

[dpdk-dev] [PATCH v2] i40: fix the VXLAN TSO issue

2016-07-15 Thread Bruce Richardson

Since we are now heading to RC3 for 16.07 and there are quite a number of
open comments on this patch unresolved, it's going to be deferred till
16.11, when it can have more review and discussion.

/Bruce

On Thu, Jul 07, 2016 at 12:24:43PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Thursday, July 07, 2016 11:51 AM
> > To: Tao, Zhe; dev at dpdk.org
> > Cc: Tao, Zhe; Wu, Jingjing
> > Subject: Re: [dpdk-dev] [PATCH v2] i40: fix the VXLAN TSO issue
> > 
> > 
> > Hi Tao,
> > 
> > Sorry hit send button too early by accident :)
> > 
> > > >
> > > > Problem:
> > > > When using the TSO + VXLAN feature in i40e, the outer UDP length fields 
> > > > in
> > > > the multiple UDP segments which are TSOed by the i40e will have the
> > > > wrong value.
> > > >
> > > > Fix this problem by adding the tunnel type field in the i40e descriptor
> > > > which was missed before.
> > > >
> > > > Fixes: 77b8301733c3 ("i40e: VXLAN Tx checksum offload")
> > > >
> > > > Signed-off-by: Zhe Tao 
> > > > ---
> > > > V2: Edited some comments for mbuf structure and i40e driver.
> > > >
> > > >  app/test-pmd/csumonly.c  | 26 +++---
> > > >  drivers/net/i40e/i40e_rxtx.c | 12 +---
> > > >  lib/librte_mbuf/rte_mbuf.h   | 16 +++-
> > > >  3 files changed, 43 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> > > > index ac4bd8f..d423c20 100644
> > > > --- a/app/test-pmd/csumonly.c
> > > > +++ b/app/test-pmd/csumonly.c
> > > > @@ -204,7 +204,8 @@ parse_ethernet(struct ether_hdr *eth_hdr, struct 
> > > > testpmd_offload_info *info)
> > > >  static void
> > > >  parse_vxlan(struct udp_hdr *udp_hdr,
> > > > struct testpmd_offload_info *info,
> > > > -   uint32_t pkt_type)
> > > > +   uint32_t pkt_type,
> > > > +   uint64_t *ol_flags)
> > > >  {
> > > > struct ether_hdr *eth_hdr;
> > > >
> > > > @@ -215,6 +216,7 @@ parse_vxlan(struct udp_hdr *udp_hdr,
> > > > RTE_ETH_IS_TUNNEL_PKT(pkt_type) == 0)
> > > > return;
> > > >
> > > > +   *ol_flags |= PKT_TX_TUNNEL_VXLAN;
> 
> Do we always have to setup tunnelling flags here?
> Obviously it would mean an extra CTD per packet and might slowdown things.
> In fact, I think current patch wouldn't work correctly if
> TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM is not set.
> So, can we do it only when TSO is enabled or outer IP checksum is enabled?
> 
> > > > info->is_tunnel = 1;
> > > > info->outer_ethertype = info->ethertype;
> > > > info->outer_l2_len = info->l2_len;
> > > > @@ -231,7 +233,9 @@ parse_vxlan(struct udp_hdr *udp_hdr,
> > > >
> > > >  /* Parse a gre header */
> > > >  static void
> > > > -parse_gre(struct simple_gre_hdr *gre_hdr, struct testpmd_offload_info 
> > > > *info)
> > > > +parse_gre(struct simple_gre_hdr *gre_hdr,
> > > > + struct testpmd_offload_info *info,
> > > > + uint64_t *ol_flags)
> > > >  {
> > > > struct ether_hdr *eth_hdr;
> > > > struct ipv4_hdr *ipv4_hdr;
> > > > @@ -242,6 +246,8 @@ parse_gre(struct simple_gre_hdr *gre_hdr, struct 
> > > > testpmd_offload_info *info)
> > > > if ((gre_hdr->flags & _htons(~GRE_SUPPORTED_FIELDS)) != 0)
> > > > return;
> > > >
> > > > +   *ol_flags |= PKT_TX_TUNNEL_GRE;
> > > > +
> > > > gre_len += sizeof(struct simple_gre_hdr);
> > > >
> > > > if (gre_hdr->flags & _htons(GRE_KEY_PRESENT))
> > > > @@ -417,7 +423,7 @@ process_inner_cksums(void *l3_hdr, const struct 
> > > > testpmd_offload_info *info,
> > > >   * packet */
> > > >  static uint64_t
> > > >  process_outer_cksums(void *outer_l3_hdr, struct testpmd_offload_info 
> > > > *info,
> > > > -   uint16_t testpmd_ol_flags)
> > > > +   uint16_t testpmd_ol_flags, uint64_t orig_ol_flags)
> > > >  {
> > > > struct ipv4_hdr *ipv4_hdr = outer_l3_hdr;
> > > > struct ipv6_hdr *ipv6_hdr = outer_l3_hdr;
> > > > @@ -442,6 +448,9 @@ process_outer_cksums(void *outer_l3_hdr, struct 
> > > > testpmd_offload_info *info,
> > > >  * hardware supporting it today, and no API for it. */
> > > >
> > > > udp_hdr = (struct udp_hdr *)((char *)outer_l3_hdr + 
> > > > info->outer_l3_len);
> > > > +   if ((orig_ol_flags & PKT_TX_TCP_SEG) &&
> > > > +   ((orig_ol_flags & PKT_TX_TUNNEL_MASK) == 
> > > > PKT_TX_TUNNEL_VXLAN))
> > > > +   udp_hdr->dgram_cksum = 0;
> > > > /* do not recalculate udp cksum if it was 0 */
> > > > if (udp_hdr->dgram_cksum != 0) {
> > > > udp_hdr->dgram_cksum = 0;
> > > > @@ -705,15 +714,18 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> > > > if (info.l4_proto == IPPROTO_UDP) {
> > > > struct udp_hdr *udp_hdr;
> > > >

[dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Yuanhan Liu

On Fri, Jul 15, 2016 at 10:23:12AM +0300, Ilya Maximets wrote:
> On 15.07.2016 09:17, Yuanhan Liu wrote:
> > On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
> >> In current implementation vhost will crash with segmentation fault
> >> if malicious or buggy virtio application breaks addresses of descriptors.
> >>
> >> Before commit 0823c1cb0a73 this crash was reproducible even with
> >> normal DPDK application that tries to change number of virtqueues
> >> dynamically inside VM.
> >>
> >> Fix that by checking addresses of descriptors before using.
> >>
> >> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
> >> from '-1' to '0' because it returns unsigned value and it means
> >> number of used descriptors.
> > 
> > Yeah, that's a good fix. Thanks.
> > 
> > Maybe you'd better make it a standalone patch.
> 
> Ok. Maybe I should split this patch in two:
> 1. Fix return value + using of this value (vq->last_used_idx += nr_used;)
> 2. Check addresses of descriptors.
> What do you think?

Good to me.

> >> Signed-off-by: Ilya Maximets 
> >> ---
> >> Version 2:
> >>* Rebased on top of current master.
> >>* host's address now checked in meargeable case,
> >>  because needed refactoring already done.
> >>* Commit-message changed because old issue with
> >>  virtio reload accidentially fixed by commit
> >>  0823c1cb0a73.
> >>
> >>  lib/librte_vhost/vhost_rxtx.c | 28 +---
> >>  1 file changed, 21 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> >> index 15ca956..31e8b58 100644
> >> --- a/lib/librte_vhost/vhost_rxtx.c
> >> +++ b/lib/librte_vhost/vhost_rxtx.c
> >> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
> >> vhost_virtqueue *vq,
> >>struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> >>  
> >>desc = >desc[desc_idx];
> >> -  if (unlikely(desc->len < dev->vhost_hlen))
> >> +  desc_addr = gpa_to_vva(dev, desc->addr);
> >> +  if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
> >>return -1;
> > 
> > So, you discards the workaround from Rich?
> 
> I can apply it, if you wish. Should I?

Yeah, it's hard to tell. The performace regression is weird after all.
I'm thinking we should appy it anyway: it saves 10% regression, which
is worthwhile. I think we should also add comments there.

--yliu

[dpdk-dev] [PATCH 1/2] bnx2x: disable fast path interrupts

2016-07-15 Thread Rasesh Mody

> From: Chas Williams [mailto:3chas3 at gmail.com]
> Sent: Friday, July 15, 2016 3:35 AM
> 
> On Thu, 2016-07-14 at 10:03 -0700, Rasesh Mody wrote:
> > Disable fastpath interrupts and remove unneeded delay in
> > bnx2x_interrupt_action(). This patch fixes and prevents performance
> > degradation for BNX2X PMD.
> >
> > Fixes: 540a2110 ("bnx2x: driver core")
> >
> > Signed-off-by: Rasesh Mody 
> >
> > diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c
> > b/drivers/net/bnx2x/bnx2x_ethdev.c
> > index c8d2bf2..f97dd5b 100644
> > --- a/drivers/net/bnx2x/bnx2x_ethdev.c
> > +++ b/drivers/net/bnx2x/bnx2x_ethdev.c
> > @@ -107,8 +107,8 @@ bnx2x_interrupt_action(struct rte_eth_dev *dev)
> >
> > ?   PMD_DEBUG_PERIODIC_LOG(INFO, "Interrupt handled");
> >
> > -   if (bnx2x_intr_legacy(sc, 0))
> > -   DELAY_MS(250);
> > +   bnx2x_intr_legacy(sc, 0)
> > +
> 
> Trailing semicolon?

My bad, sorry, I didn?t send the latest patch, will send v2.

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Thomas Monjalon

I will apply it with trivial changes suggested by Jan and
the small needed changes that I describe below:

2016-07-14 20:03, Jan Viktorin:
> On Thu, 14 Jul 2016 15:27:29 +0200
> damarion at cisco.com wrote:
> > --- /dev/null
> > +++ b/lib/librte_eal/common/arch/x86/rte_spinlock.c
[...]
> > +uint8_t rte_rtm_supported; /* cache the flag to avoid the overhead
> > + of the rte_cpu_get_flag_enabled function */

This variable must be exported in the .map file.

> > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > @@ -106,6 +106,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
> >  
> >  # from arch dir
> >  SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_cpuflags.c
> > +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_spinlock.c
> 
> This is not good, you provide rte_spinlock.c only for x86. Building
> for any other arch would fail to find this file.

This change do the trick:

-SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_spinlock.c
+SRCS-$(CONFIG_RTE_ARCH_X86) += rte_spinlock.c

(Note that CONFIG_RTE_EXEC_ENV_LINUXAPP check is not needed inside linuxapp EAL)

> Moreover, the bsdapp/eal/Makefile should reflect this situation as
> well.

Yes

[dpdk-dev] [PATCH 2/2] app/test: filter out unavailable tests

2016-07-15 Thread Thomas Monjalon

2016-07-15 16:23, David Marchand:
> On Wed, Jul 13, 2016 at 11:24 PM, Thomas Monjalon
>  wrote:
> > Some tests can fail to run because they are not compiled.
> > It has been more visible recently when the PCI test became disabled
> > in the default configuration because of dependency on libarchive:
> > PCI autotest:Fail [Not found]
> >
> > The autotest script catch them and do not count them as an error anymore:
> > PCI autotest:Skipped [Not Available]
> >
> > The commands dump_ring and dump_mempool are removed from the autotest list
> > because they are some internal commands but not some registered tests.
> > Thus they cannot be parsed as available test commands.
> >
> > Signed-off-by: Thomas Monjalon 
> 
> Suggested-by: David Marchand 

Thanks for giving me the idea of the nm trick.

> and I trust you for the python ;-)

You should not ;)

[dpdk-dev] [PATCH 2/2] app/test: filter out unavailable tests

2016-07-15 Thread David Marchand

On Wed, Jul 13, 2016 at 11:24 PM, Thomas Monjalon
 wrote:
> Some tests can fail to run because they are not compiled.
> It has been more visible recently when the PCI test became disabled
> in the default configuration because of dependency on libarchive:
> PCI autotest:Fail [Not found]
>
> The autotest script catch them and do not count them as an error anymore:
> PCI autotest:Skipped [Not Available]
>
> The commands dump_ring and dump_mempool are removed from the autotest list
> because they are some internal commands but not some registered tests.
> Thus they cannot be parsed as available test commands.
>
> Signed-off-by: Thomas Monjalon 

Suggested-by: David Marchand 

and I trust you for the python ;-)

-- 
David Marchand

[dpdk-dev] [PATCH v6 05/17] eal: introduce init macros

2016-07-15 Thread Shreyansh jain

On Thursday 14 July 2016 09:27 PM, Jan Viktorin wrote:
> On Thu, 14 Jul 2016 10:57:55 +0530
> Shreyansh jain  wrote:
> 
>> Hi Jan,
>>
>> On Wednesday 13 July 2016 11:04 PM, Jan Viktorin wrote:
>>> On Wed, 13 Jul 2016 11:20:43 +0200
>>> Jan Viktorin  wrote:
>>>   
 Hello Shreyansh,

 On Tue, 12 Jul 2016 11:31:10 +0530
 Shreyansh Jain  wrote:
  
> Introduce a RTE_INIT macro used to mark an init function as a constructor.
> Current eal macros have been converted to use this (no functional impact).
> DRIVER_REGISTER_PCI is added as a helper for pci drivers.
>
> Suggested-by: Jan Viktorin 
> Signed-off-by: David Marchand 
> Signed-off-by: Shreyansh Jain 
> ---

 [...]
  
> +#define RTE_INIT(func) \
> +static void __attribute__((constructor, used)) func(void)
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/common/include/rte_pci.h 
> b/lib/librte_eal/common/include/rte_pci.h
> index fa74962..3027adf 100644
> --- a/lib/librte_eal/common/include/rte_pci.h
> +++ b/lib/librte_eal/common/include/rte_pci.h
> @@ -470,6 +470,14 @@ void rte_eal_pci_dump(FILE *f);
>   */
>  void rte_eal_pci_register(struct rte_pci_driver *driver);
>  
> +/** Helper for PCI device registeration from driver (eth, crypto) 
> instance */
> +#define DRIVER_REGISTER_PCI(nm, drv) \
> +RTE_INIT(pciinitfn_ ##nm); \
> +static void pciinitfn_ ##nm(void) \
> +{ \

 You are missing setting the name here like PMD_REGISTER_DRIVER does
 now. Or should I include it in my patch set?

(drv).name = RTE_STR(nm);  
>>
>> That is a miss from my side.
>> I will publish v7 with this. You want this right away or should I wait a 
>> little while (more reviews, or any pending additions as per Thomas's notes) 
>> before publishing?
> 
> Please. The time is almost gone. 18/7/2016 is the release (according
> to the roadmap)... I have to fix it in my patchset, otherwise it
> does not build (after moving the .name from rte_pci_driver to
> rte_driver).
> 

I didn't consider 18/Jul.
Please go ahead. I will continue to send v7 _without_ the above change so that 
your patchset doesn't break. This way you will not get blocked because of me.

>>
>>>
>>> Moreover, it should accept the rte_pci_driver *, shouldn't it? Here, it
>>> expects a wrapper around it (eth_driver)... I now, my SoC patches were
>>> supposing the some... but I think it is wrong.
>>>
>>> The original David's patch set contains calls like this:
>>>
>>>   RTE_EAL_PCI_REGISTER(bnx2xvf, rte_bnx2xvf_pmd.pci_drv);
>>>
>>> So, I think, we should go the original way.  
>>
>> I have a slightly different opinion of the above.
>> IMO, aim of the helpers is to hide the PCI details and continue to make 
>> driver consider itself as a generic ETH driver. In that case, dereferencing 
>> pci_drv would be done by macro.
> 
> In this case, I'd prefer to see DRIVER_REGISTER_PCI_ETH.
> 
> At first, this was also my way of thinking. But I've changed my mind. I
> find it to be a bit overdesigned.

There is:

DRIVER_REGISTER_PCI(...)
DRIVER_REGISTER_PCI_TABLE(...)

Wouldn't DRIVER_REGISTER_PCI_ETH look out-of-place?

> 
>>
>> Also, considering that in future pci_drv would also have soc_drv, the 
>> helpers can effectively hide the intra-structure naming of these. It would 
>> help when more such device types (would there be?) are introduced - in which 
>> case, driver framework has a consistent coding convention.
> 
> Hide? I am afraid, I don't understand clearly what you mean.

DRIVER_REGISTER_PCI(eth_driver)
DRIVER_REGISTER_SOC(eth_driver)
DRIVER_REGISTER_XXX(eth_driver)
...

In either case, the caller always creates the eth_driver and populates internal 
specific driver structure (pci_drv) as a sub-part of eth_driver specification. 
Macro 'hides' the internal structure name (pci_drv, soc_drv...).
But again, nothing critical. Just a way of usage. We might not even have a 
'XXX' in near future.

> 
>>
>> But, I am ok switching back to David's way as well - I don't have any strong 
>> argument against that.
> 
> I'd like to preserve the clear semantics. That is DRIVER_REGISTER_PCI
> -> give a pci device.
> 
> Has anybody a different opinion? David? Thomas?

Yes please.
Or else, if nothing comes up soon, I will simply go ahead and change to 
DRIVER_REGISTER_PCI(eth_driver.pci_drv) as this trivial issue shouldn't hold 
back this series.

> 
>>
>>>
>>> Jan
>>>   
  
> + rte_eal_pci_register(_drv); \
> +}
> +
>  /**
>   * Unregister a PCI driver.
>   *  
>> [...]
>>
>> -
>> Shreyansh
>>
> 
> 
>

[dpdk-dev] [PATCH v6 16/17] ethdev: convert to eal hotplug

2016-07-15 Thread Shreyansh jain

On Thursday 14 July 2016 10:21 PM, Jan Viktorin wrote:
> On Tue, 12 Jul 2016 11:31:21 +0530
> Shreyansh Jain  wrote:
> 
>> Remove bus logic from ethdev hotplug by using eal for this.
>>
>> Current api is preserved:
>> - the last port that has been created is tracked to return it to the
>>   application when attaching,
>> - the internal device name is reused when detaching.
>>
>> We can not get rid of ethdev hotplug yet since we still need some mechanism
>> to inform applications of port creation/removal to substitute for ethdev
>> hotplug api.
>>
>> dev_type field in struct rte_eth_dev and rte_eth_dev_allocate are kept as
>> is, but this information is not needed anymore and is removed in the 
>> following
>> commit.
>>
>> Signed-off-by: David Marchand 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_ether/rte_ethdev.c | 207 
>> +++---
>>  1 file changed, 33 insertions(+), 174 deletions(-)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> index a667012..8d14fd7 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -72,6 +72,7 @@
>>  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
>>  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
>>  static struct rte_eth_dev_data *rte_eth_dev_data;
>> +static uint8_t eth_dev_last_created_port;
>>  static uint8_t nb_ports;
>>  
> 
> [...]
> 
>> -
>>  /* attach the new device, then store port_id of the device */
>>  int
>>  rte_eth_dev_attach(const char *devargs, uint8_t *port_id)
>>  {
>> -struct rte_pci_addr addr;
>>  int ret = -1;
>> +int current = eth_dev_last_created_port;
>> +char *name = NULL;
>> +char *args = NULL;
>>  
>>  if ((devargs == NULL) || (port_id == NULL)) {
>>  ret = -EINVAL;
>>  goto err;
>>  }
>>  
>> -if (eal_parse_pci_DomBDF(devargs, ) == 0) {
>> -ret = rte_eth_dev_attach_pdev(, port_id);
>> -if (ret < 0)
>> -goto err;
>> -} else {
>> -ret = rte_eth_dev_attach_vdev(devargs, port_id);
>> -if (ret < 0)
>> -goto err;
>> +/* parse devargs, then retrieve device name and args */
>> +if (rte_eal_parse_devargs_str(devargs, , ))
>> +goto err;
>> +
>> +ret = rte_eal_dev_attach(name, args);
>> +if (ret < 0)
>> +goto err;
>> +
>> +/* no point looking at eth_dev_last_created_port if no port exists */
> 
> I am not sure about this comment. What is "no point"?
> 
> Isn't this also a potential bug? (Like the one below.) How could it
> happen there is no port after a successful attach?

Yes, even searching through code path I couldn't find a positive case where 
control would reach here without nb_ports>0.
Though, i am not sure if some rough application attempts to mix-up calls - and 
that, in my opinion, is not worth checking.
Should I remove it?

> 
>> +if (!nb_ports) {
>> +ret = -1;
>> +goto err;
>> +}
>> +/* if nothing happened, there is a bug here, since some driver told us
>> + * it did attach a device, but did not create a port */
>> +if (current == eth_dev_last_created_port) {
>> +ret = -1;
>> +goto err;
> 
> Should we log this? Or call some kind panic?

I will place a log because applications should have a chance to ignore a device 
which cannot be attached for whatever reason.

> 
>>  }
>> +*port_id = eth_dev_last_created_port;
>> +ret = 0;
>>  
>> -return 0;
>>  err:
>> -RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
>> +free(name);
>> +free(args);
>>  return ret;
>>  }
>>  
>> @@ -590,7 +464,6 @@ err:
>>  int
>>  rte_eth_dev_detach(uint8_t port_id, char *name)
>>  {
>> -struct rte_pci_addr addr;
>>  int ret = -1;
>>  
>>  if (name == NULL) {
>> @@ -598,33 +471,19 @@ rte_eth_dev_detach(uint8_t port_id, char *name)
>>  goto err;
>>  }
>>  
>> -/* check whether the driver supports detach feature, or not */
>> +/* FIXME: move this to eal, once device flags are relocated there */
>>  if (rte_eth_dev_is_detachable(port_id))
>>  goto err;
>>  
>> -if (rte_eth_dev_get_device_type(port_id) == RTE_ETH_DEV_PCI) {
>> -ret = rte_eth_dev_get_addr_by_port(port_id, );
>> -if (ret < 0)
>> -goto err;
>> -
>> -ret = rte_eth_dev_detach_pdev(port_id, );
>> -if (ret < 0)
>> -goto err;
>> -
>> -snprintf(name, RTE_ETH_NAME_MAX_LEN,
>> -"%04x:%02x:%02x.%d",
>> -addr.domain, addr.bus,
>> -addr.devid, addr.function);
>> -} else {
>> -ret = rte_eth_dev_detach_vdev(port_id, name);
>> -if (ret < 0)
>> -goto err;
>> -}
>> +snprintf(name, sizeof(rte_eth_devices[port_id].data->name),
>> +

[dpdk-dev] [PATCH] scripts: print failed directory when testing build

2016-07-15 Thread Thomas Monjalon

The script test-build.sh can be used to test building several
targets with different configurations. The directory name reflects
the target and the customized configuration.
When there is a failure, it is convenient to print this build
directory to quickly know which case is failing without scrolling
the build log history.

Signed-off-by: Thomas Monjalon 
---
 scripts/test-build.sh | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/scripts/test-build.sh b/scripts/test-build.sh
index 30dfdf5..d2cafc1 100755
--- a/scripts/test-build.sh
+++ b/scripts/test-build.sh
@@ -96,12 +96,13 @@ configs=${*:-$DPDK_BUILD_TEST_CONFIGS}
 success=false
 on_exit ()
 {
-   if [ "$DPDK_NOTIFY" = notify-send ] ; then
-   if $success ; then
+   if $success ; then
+   [ "$DPDK_NOTIFY" != notify-send ] || \
notify-send -u low --icon=dialog-information 'DPDK 
build' 'finished'
-   elif [ -z "$signal" ] ; then
+   elif [ -z "$signal" ] ; then
+   [ -z "$dir" ] || echo "failed to build $dir" >&2
+   [ "$DPDK_NOTIFY" != notify-send ] || \
notify-send -u low --icon=dialog-error 'DPDK build' 
'failed'
-   fi
fi
 }
 # catch manual interrupt to ignore notification
@@ -232,6 +233,7 @@ for conf in $configs ; do
O=$(readlink -m $dir/examples/performance-thread)
unset RTE_TARGET
echo "## $dir done."
+   unset dir
 done

 if ! $short ; then
-- 
2.7.0

[dpdk-dev] [PATCH v6 15/17] eal: add hotplug operations for pci and vdev

2016-07-15 Thread Shreyansh jain

On Thursday 14 July 2016 10:20 PM, Jan Viktorin wrote:
> On Tue, 12 Jul 2016 11:31:20 +0530
> Shreyansh Jain  wrote:
> 
>> Hotplug which deals with resources should come from the layer that already
>> handles them, i.e. EAL.
>>
>> For both attach and detach operations, 'name' is used to select the bus
>> that will handle the request.
>>
>> Signed-off-by: David Marchand 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  2 ++
>>  lib/librte_eal/common/eal_common_dev.c  | 47 
>> +
>>  lib/librte_eal/common/include/rte_dev.h | 25 +
>>  lib/librte_eal/linuxapp/eal/rte_eal_version.map |  2 ++
>>  4 files changed, 76 insertions(+)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
>> b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> index 1852c4a..6f9324f 100644
>> --- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> +++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
>> @@ -159,5 +159,7 @@ DPDK_16.07 {
>>  rte_keepalive_mark_sleep;
>>  rte_keepalive_register_relay_callback;
>>  rte_thread_setname;
>> +rte_eal_dev_attach;
>> +rte_eal_dev_detach;
>>  
>>  } DPDK_16.04;
>> diff --git a/lib/librte_eal/common/eal_common_dev.c 
>> b/lib/librte_eal/common/eal_common_dev.c
>> index a8a4146..14c6cf1 100644
>> --- a/lib/librte_eal/common/eal_common_dev.c
>> +++ b/lib/librte_eal/common/eal_common_dev.c
>> @@ -150,3 +150,50 @@ rte_eal_vdev_uninit(const char *name)
>>  RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
>>  return -EINVAL;
>>  }
>> +
>> +int rte_eal_dev_attach(const char *name, const char *devargs)
>> +{
>> +struct rte_pci_addr addr;
>> +int ret = -1;
>> +
>> +if (name == NULL || devargs == NULL) {
>> +RTE_LOG(ERR, EAL, "Invalid device arguments provided\n");
>> +return ret;
>> +}
>> +
>> +if (eal_parse_pci_DomBDF(name, ) == 0) {
>> +if (rte_eal_pci_probe_one() < 0)
>> +goto err;
>> +
>> +} else {
>> +if (rte_eal_vdev_init(name, devargs))
>> +goto err;
>> +}
>> +
>> +return 0;
>> +
>> +err:
>> +RTE_LOG(ERR, EAL, "Driver cannot attach the device\n");
> 
> I think this log can be more specific. We have the name of the device.
> It might be confusing to the user when the attach fails and there is
> no simple way how to determine which one.

Agree. I will update with 'name'.

> 
>> +return ret;
>> +}
>> +
>> +int rte_eal_dev_detach(const char *name)
>> +{
>> +struct rte_pci_addr addr;
>> +
>> +if (name == NULL)
>> +goto err;
>> +
>> +if (eal_parse_pci_DomBDF(name, ) == 0) {
>> +if (rte_eal_pci_detach() < 0)
>> +goto err;
>> +} else {
>> +if (rte_eal_vdev_uninit(name))
>> +goto err;
>> +}
>> +return 0;
>> +
>> +err:
>> +RTE_LOG(ERR, EAL, "Driver cannot detach the device\n");
> 
> Same as for attach.

OK.

> 
>> +return -1;
>> +}
>> diff --git a/lib/librte_eal/common/include/rte_dev.h 
>> b/lib/librte_eal/common/include/rte_dev.h
>> index 994650b..2f0579c 100644
>> --- a/lib/librte_eal/common/include/rte_dev.h
>> +++ b/lib/librte_eal/common/include/rte_dev.h
>> @@ -178,6 +178,31 @@ int rte_eal_vdev_init(const char *name, const char 
>> *args);
>>   */
>>  int rte_eal_vdev_uninit(const char *name);
>>  
>> +/**
>> + * Attach a resource to a registered driver.
> 
> From my POV, the term "resource" is not good here. We have
> rte_pci_resource but we refer to a device here. The function is called
> rte_eal_DEV_attach. No idea why to call it a resource.

Your point of 'resource' for 'rte_eal_*dev*_attach' makes sense.
Ideally I wouldn't have given much heed to this - for me 'resource' is just a 
device representation within EAL.
But, now that you have highlighted, name<=>comment is certainly mismatch.

I will change.

> 
>> + *
>> + * @param name
>> + *   The resource name, that refers to a pci resource or some private
>> + *   way of designating a resource for vdev drivers. Based on this
>> + *   resource name, eal will identify a driver capable of handling
>> + *   this resource and pass this resource to the driver probing
>> + *   function.
>> + * @param devargs
>> + *   Device arguments to be passed to the driver.
>> + * @return
>> + *   0 on success, negative on error.
>> + */
>> +int rte_eal_dev_attach(const char *name, const char *devargs);
>> +
>> +/**
>> + * Detach a resource from its driver.
> 
> Same here for resource.

I will change to "Detach a device from its driver".

> 
> Jan
> 
>> + *
>> + * @param name
>> + *   Same description as for rte_eal_dev_attach().
>> + *   Here, eal will call the driver detaching function.
>> + */
>> +int rte_eal_dev_detach(const char *name);
>> +
>>  #define DRIVER_EXPORT_NAME_ARRAY(n, idx) n##idx[]
>>  
>>  #define DRIVER_EXPORT_NAME(name, idx) \
>> diff --git

[dpdk-dev] [PATCH v1 00/15] rte_driver/device infrastructure

2016-07-15 Thread Thomas Monjalon

2016-07-08 21:09, Jan Viktorin:
> Hello,
> 
> based on the discussions with Shreyansh, I propose a patchset with
> the important EAL changes. It is incomplete and I suppose to extend
> and change certain things in the foreseeable future.
> 
> Important notes:
> 
> * pmd_type is removed
> * introduced rte_vdev_driver inheriting rte_driver
> * PMD_REGISTER_DRIVER is replaced by RTE_EAL_VDRV_REGISTER
> * rte_driver/device integrated into rte_pci_driver/device
> * all drivers and devices are in 2 lists - general and bus-specific
> 
> Shreyansh, I hope I do not duplicate your work. I tried to avoid touching
> pmd_type but it quite complicated... There is also an initial generalization
> of rte_pci_resource. More such generalizations are to be done.
> 
> The init/uninit functions cannot be generalized easily, I think. Both PCI
> and VDEV have different requirements.
> 
> No idea about hotplug...

Please could you give a clear overview of how you split the work in
your respective series?

I take the opportunity to put my notes about some initial targets of
this refactoring:

module/drivers
attached to 1 bus: pci_driver or vdev_driver
rte_device = device resources / embedded in pci/vdev_driver
attached to n device interfaces (ethdev, crypto)
1 device resource -> n device interfaces
hotplug resource -> lookup drivers list
per-bus lists or 1 list?
devinit/devuninit generalized and moved to rte_driver
-> rename to probe/remove
crypto.dev_type could be dropped
drv_flags should be moved to rte_driver
intr_init can be moved earlier in init, before affinity set
devices
unique_device_name -> standard naming?
difference with port_id ? -> device id for ethdev
should be unique_resource_name -> 1 EAL resource may match 
several devices
ethdev manage an interface, eal manage a hardware resource, device 
object in between?
need for bus object?
no need of driver object, module object?
devargs, intr_handle, numa_node should be moved to rte_device
hotplug notification to do
notify free-able ressource?
remove blacklist at EAL level and let application handle it
devargs still in hotplug function, must be moved in separate API
devargs
new API
new command line parameters

[dpdk-dev] [PATCH v6 13/17] pci: add a helper to update a device

2016-07-15 Thread Shreyansh jain

On Thursday 14 July 2016 10:31 PM, Jan Viktorin wrote:
> On Tue, 12 Jul 2016 11:31:18 +0530
> Shreyansh Jain  wrote:
> 
>> This helper updates a pci device object with latest information it can
>> find.
>> It will be used mainly for hotplug code.
>>
>> Signed-off-by: David Marchand 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_eal/bsdapp/eal/eal_pci.c | 49 
>> +
>>  lib/librte_eal/common/eal_common_pci.c  |  2 --
>>  lib/librte_eal/common/eal_private.h | 13 +
>>  lib/librte_eal/common/include/rte_pci.h |  3 ++
>>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 13 +
>>  5 files changed, 78 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
>> b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> index a73cbb0..1d91c78 100644
>> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
>> @@ -406,6 +406,55 @@ error:
>>  return -1;
>>  }
>>  
>> +int
>> +pci_update_device(const struct rte_pci_addr *addr)
>> +{
>> +int fd;
>> +struct pci_conf matches[2];
>> +struct pci_match_conf match = {
>> +.pc_sel = {
>> +.pc_domain = addr->domain,
>> +.pc_bus = addr->bus,
>> +.pc_dev = addr->devid,
>> +.pc_func = addr->function,
>> +},
>> +};
>> +struct pci_conf_io conf_io = {
>> +.pat_buf_len = 0,
>> +.num_patterns = 1,
>> +.patterns = ,
>> +.match_buf_len = sizeof(matches),
>> +.matches = [0],
>> +};
>> +
>> +fd = open("/dev/pci", O_RDONLY);
>> +if (fd < 0) {
>> +RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", __func__);
>> +goto error;
>> +}
>> +
>> +if (ioctl(fd, PCIOCGETCONF, _io) < 0) {
>> +RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n",
>> +__func__, strerror(errno));
>> +goto error;
>> +}
>> +
>> +if (conf_io.num_matches != 1)
>> +goto error;
>> +
>> +if (pci_scan_one(fd, [0]) < 0)
>> +goto error;
>> +
>> +close(fd);
>> +
>> +return 0;
>> +
>> +error:
>> +if (fd >= 0)
>> +close(fd);
>> +return -1;
>> +}
>> +
>>  /* Read PCI config space. */
>>  int rte_eal_pci_read_config(const struct rte_pci_device *dev,
>>  void *buf, size_t len, off_t offset)
>> diff --git a/lib/librte_eal/common/eal_common_pci.c 
>> b/lib/librte_eal/common/eal_common_pci.c
>> index 6a0f6ac..58f0c74 100644
>> --- a/lib/librte_eal/common/eal_common_pci.c
>> +++ b/lib/librte_eal/common/eal_common_pci.c
>> @@ -87,8 +87,6 @@ struct pci_driver_list pci_driver_list =
>>  struct pci_device_list pci_device_list =
>>  TAILQ_HEAD_INITIALIZER(pci_device_list);
>>  
>> -#define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"
>> -
> 
> Is this delete correct? The SYSFS_PCI_DEVICES was moved here from
> rte_pci.h. A relict of some rebase?

My miss - I had though of using pci_get_sysfs_path post v5, (as you have 
mentioned below) but while re-creating the patches I missed out.

> 
>>  const char *pci_get_sysfs_path(void)
>>  {
>>  const char *path = NULL;
>> diff --git a/lib/librte_eal/common/eal_private.h 
>> b/lib/librte_eal/common/eal_private.h
>> index 06a68f6..b8ff9c5 100644
>> --- a/lib/librte_eal/common/eal_private.h
>> +++ b/lib/librte_eal/common/eal_private.h
>> @@ -152,6 +152,19 @@ struct rte_pci_driver;
>>  struct rte_pci_device;
>>  
>>  /**
>> + * Update a pci device object by asking the kernel for the latest 
>> information.
>> + *
>> + * This function is private to EAL.
>> + *
>> + * @param addr
>> + *  The PCI Bus-Device-Function address to look for
>> + * @return
>> + *   - 0 on success.
>> + *   - negative on error.
>> + */
>> +int pci_update_device(const struct rte_pci_addr *addr);
>> +
>> +/**
>>   * Unbind kernel driver for this device
>>   *
>>   * This function is private to EAL.
>> diff --git a/lib/librte_eal/common/include/rte_pci.h 
>> b/lib/librte_eal/common/include/rte_pci.h
>> index 06508fa..5c2062c 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -107,6 +107,9 @@ const char *pci_get_sysfs_path(void);
>>  /** Nb. of values in PCI resource format. */
>>  #define PCI_RESOURCE_FMT_NVAL 3
>>  
>> +/** Default sysfs path for PCI device search. */
>> +#define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"
>> +
> 
> Why adding SYSFS_PCI_DEVICES here? The pci_get_sysfs_path is used for
> this purpose. Again, a rebase issue?

Once pci_get_sysfs_path is used, this change would be redundant.

> 
>>  /**
>>   * A structure describing a PCI resource.
>>   */
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
>> b/lib/librte_eal/linuxapp/eal/eal_pci.c
>> index f0215ee..8f3ef20 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
>> @@ -417,6

[dpdk-dev] [PATCH v6 12/17] pci: add a helper for device name

2016-07-15 Thread Shreyansh jain

On Thursday 14 July 2016 10:25 PM, Jan Viktorin wrote:
> On Tue, 12 Jul 2016 11:31:17 +0530
> Shreyansh Jain  wrote:
> 
>> eal is a better place than crypto / ethdev for naming resources.
> 
> s/for naming/to name/

OK.

> 
> What is meant by "resources" here?

This has historic context (from earlier version of this patch). 
But I could relate the word 'resources' to EAL representation of devices - 
whether PCI or Crypto.
Or, Resource == Device.

> 
>> Add a helper in eal and make use of it in crypto / ethdev.
>>
>> Signed-off-by: David Marchand 
>> Signed-off-by: Shreyansh Jain 
>> ---
>>  lib/librte_cryptodev/rte_cryptodev.c| 27 ---
>>  lib/librte_eal/common/include/rte_pci.h | 25 +
>>  lib/librte_ether/rte_ethdev.c   | 24 
>>  3 files changed, 33 insertions(+), 43 deletions(-)
>>
>> diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
>> b/lib/librte_cryptodev/rte_cryptodev.c
>> index d7be111..60c6384 100644
>> --- a/lib/librte_cryptodev/rte_cryptodev.c
>> +++ b/lib/librte_cryptodev/rte_cryptodev.c
>> @@ -367,23 +367,6 @@ rte_cryptodev_pmd_allocate(const char *name, int 
>> socket_id)
>>  return cryptodev;
>>  }
>>  
>> -static inline int
>> -rte_cryptodev_create_unique_device_name(char *name, size_t size,
>> -struct rte_pci_device *pci_dev)
>> -{
>> -int ret;
>> -
>> -if ((name == NULL) || (pci_dev == NULL))
>> -return -EINVAL;
>> -
>> -ret = snprintf(name, size, "%d:%d.%d",
>> -pci_dev->addr.bus, pci_dev->addr.devid,
>> -pci_dev->addr.function);
>> -if (ret < 0)
>> -return ret;
>> -return 0;
>> -}
>> -
>>  int
>>  rte_cryptodev_pmd_release_device(struct rte_cryptodev *cryptodev)
>>  {
>> @@ -446,9 +429,8 @@ rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
>>  if (cryptodrv == NULL)
>>  return -ENODEV;
>>  
>> -/* Create unique Crypto device name using PCI address */
>> -rte_cryptodev_create_unique_device_name(cryptodev_name,
>> -sizeof(cryptodev_name), pci_dev);
>> +rte_eal_pci_device_name(_dev->addr, cryptodev_name,
>> +sizeof(cryptodev_name));
>>  
>>  cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, rte_socket_id());
>>  if (cryptodev == NULL)
>> @@ -503,9 +485,8 @@ rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
>>  if (pci_dev == NULL)
>>  return -EINVAL;
>>  
>> -/* Create unique device name using PCI address */
>> -rte_cryptodev_create_unique_device_name(cryptodev_name,
>> -sizeof(cryptodev_name), pci_dev);
>> +rte_eal_pci_device_name(_dev->addr, cryptodev_name,
>> +sizeof(cryptodev_name));
>>  
>>  cryptodev = rte_cryptodev_pmd_get_named_dev(cryptodev_name);
>>  if (cryptodev == NULL)
>> diff --git a/lib/librte_eal/common/include/rte_pci.h 
>> b/lib/librte_eal/common/include/rte_pci.h
>> index 3027adf..06508fa 100644
>> --- a/lib/librte_eal/common/include/rte_pci.h
>> +++ b/lib/librte_eal/common/include/rte_pci.h
>> @@ -82,6 +82,7 @@ extern "C" {
>>  #include 
>>  #include 
>>  
>> +#include 
>>  #include 
>>  
>>  TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked 
>> Q. */
>> @@ -95,6 +96,7 @@ const char *pci_get_sysfs_path(void);
>>  
>>  /** Formatting string for PCI device identifier: Ex: :00:01.0 */
>>  #define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
>> +#define PCI_PRI_STR_SIZE sizeof(":XX:XX.X")
>>  
>>  /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
>>  #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
>> @@ -308,6 +310,29 @@ eal_parse_pci_DomBDF(const char *input, struct 
>> rte_pci_addr *dev_addr)
>>  }
>>  #undef GET_PCIADDR_FIELD
>>  
>> +/**
>> + * Utility function to write a pci device name, this device name can later 
>> be
>> + * used to retrieve the corresponding rte_pci_addr using above functions.
> 
> What about saying "using functions eal_parse_pci_*BDF"? The
> specification "above" is quite uncertain...

Agree that 'above' is positional word and should be avoided.
I will change that to "... using eal_parse_pci_* BDF helpers". OK?

> 
>> + *
>> + * @param addr
>> + *  The PCI Bus-Device-Function address
>> + * @param output
>> + *  The output buffer string
>> + * @param size
>> + *  The output buffer size
>> + * @return
>> + *  0 on success, negative on error.
>> + */
>> +static inline void
>> +rte_eal_pci_device_name(const struct rte_pci_addr *addr,
>> +char *output, size_t size)
>> +{
>> +RTE_VERIFY(size >= PCI_PRI_STR_SIZE);
>> +RTE_VERIFY(snprintf(output, size, PCI_PRI_FMT,
>> +addr->domain, addr->bus,
>> +addr->devid, addr->function) >= 0);
>> +}
>> +
>>  /* Compare two PCI device addresses. */
>>  /**
> 
> [...]
> 
>

[dpdk-dev] [PATCH v6 05/17] eal: introduce init macros

2016-07-15 Thread Thomas Monjalon

Hi guys,

2016-07-15 16:18, Shreyansh jain:
> On Thursday 14 July 2016 09:27 PM, Jan Viktorin wrote:
> > Please. The time is almost gone. 18/7/2016 is the release (according
> > to the roadmap)... I have to fix it in my patchset, otherwise it
> > does not build (after moving the .name from rte_pci_driver to
> > rte_driver).
> 
> I didn't consider 18/Jul.
> Please go ahead. I will continue to send v7 _without_ the above change
> so that your patchset doesn't break. This way you will not get blocked
> because of me.
[...]
> > I'd like to preserve the clear semantics. That is DRIVER_REGISTER_PCI
> > -> give a pci device.
> > 
> > Has anybody a different opinion? David? Thomas?
> 
> Yes please.
> Or else, if nothing comes up soon, I will simply go ahead and change to
> DRIVER_REGISTER_PCI(eth_driver.pci_drv) as this trivial issue shouldn't
> hold back this series.

I'm sorry I have no time to review these series shortly.

I think it is too late to consider an integration in 16.07 unfortunately.
I feel you are doing a great job and we can target to have these changes
in the early days of 16.11 (in August).
As this is a big refactoring, it is probably a good idea to integrate it
in the beginning of the next release integration cycle.

[dpdk-dev] [PATCH 02/10] bnx2x: Remove unused preprocessor symbols and code

2016-07-15 Thread Chas Williams

On Fri, 2016-07-15 at 16:56 +0100, Bruce Richardson wrote:
> On Tue, Jul 12, 2016 at 09:38:06AM -0400, Chas Williams wrote:
> > ELINK_INCLUDE_EMUL and ELINK_INCLUDE_FPGA are never defined.??Remove them
> > along with enumeration constants dependent on their inclusion.
> >?
> > Fixes: 540a211084a7 ("bnx2x: driver core")
> >?
> > Signed-off-by: Chas Williams <3chas3 at gmail.com>
> 
> Hi Chas,
> 
> the threading on this submission is very awkward - poor patch 1 got somehow
> separated from the rest of it's patch family. :-)

Yeah, I don't know what happened there.

> Besides this, given where we are in the release cycle for 16.07, these 
> cleanups
> are being deferred for possible inclusion in 16.11 (once reviewed and acked,
> obviously). To get cleanup like this in in future release, please submit by
> the integration deadline date.

Not a problem. ?These are not urgent changes. ?There are some
real issues fixed but I suspect no one runs into them.

[dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Yuanhan Liu

On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
> In current implementation vhost will crash with segmentation fault
> if malicious or buggy virtio application breaks addresses of descriptors.
> 
> Before commit 0823c1cb0a73 this crash was reproducible even with
> normal DPDK application that tries to change number of virtqueues
> dynamically inside VM.
> 
> Fix that by checking addresses of descriptors before using.
> 
> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
> from '-1' to '0' because it returns unsigned value and it means
> number of used descriptors.

Yeah, that's a good fix. Thanks.

Maybe you'd better make it a standalone patch.

> Signed-off-by: Ilya Maximets 
> ---
> Version 2:
>   * Rebased on top of current master.
>   * host's address now checked in meargeable case,
> because needed refactoring already done.
>   * Commit-message changed because old issue with
> virtio reload accidentially fixed by commit
> 0823c1cb0a73.
> 
>  lib/librte_vhost/vhost_rxtx.c | 28 +---
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 15ca956..31e8b58 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
> vhost_virtqueue *vq,
>   struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>  
>   desc = >desc[desc_idx];
> - if (unlikely(desc->len < dev->vhost_hlen))
> + desc_addr = gpa_to_vva(dev, desc->addr);
> + if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
>   return -1;

So, you discards the workaround from Rich?

>  
> - desc_addr = gpa_to_vva(dev, desc->addr);
>   rte_prefetch0((void *)(uintptr_t)desc_addr);
>  
>   virtio_enqueue_offload(m, _hdr.hdr);
> @@ -182,7 +182,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
> vhost_virtqueue *vq,
>   return -1;
>  
>   desc = >desc[desc->next];
> - desc_addr   = gpa_to_vva(dev, desc->addr);
> + desc_addr = gpa_to_vva(dev, desc->addr);
> + if (unlikely(!desc_addr))
> + return -1;
> +
>   desc_offset = 0;
>   desc_avail  = desc->len;
>   }
> @@ -387,10 +390,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
> struct vhost_virtqueue *vq,
>   LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
>   dev->vid, cur_idx, end_idx);
>  
> - if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
> - return -1;
> -
>   desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
> + if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
> + return 0;
> +
>   rte_prefetch0((void *)(uintptr_t)desc_addr);
>  
>   virtio_hdr.num_buffers = end_idx - start_idx;
> @@ -425,6 +428,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
> struct vhost_virtqueue *vq,
>  
>   vec_idx++;
>   desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
> + if (unlikely(!desc_addr))
> + return 0;
>  
>   /* Prefetch buffer address. */
>   rte_prefetch0((void *)(uintptr_t)desc_addr);
> @@ -507,7 +512,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
> queue_id,
>   *(volatile uint16_t *)>used->idx += nr_used;
>   vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>   sizeof(vq->used->idx));
> - vq->last_used_idx = end;
> + vq->last_used_idx += nr_used;

Ditto, this may deserve another patch, too.

--yliu

[dpdk-dev] [PATCH v3 2/2] vhost: do sanity check for ring descriptor address

2016-07-15 Thread Ilya Maximets

In current implementation vhost will crash with segmentation fault
if malicious or buggy virtio application breaks addresses of descriptors.

Before commit 0823c1cb0a73 ("vhost: workaround stale vring base")
this crash was reproducible even with normal DPDK application that tries
to change number of virtqueues dynamically inside VM.

Fix that by checking addresses of descriptors before using.

Signed-off-by: Ilya Maximets 
---
 lib/librte_vhost/vhost_rxtx.c | 29 -
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index a9b04df..bc00518 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -147,10 +147,15 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};

desc = >desc[desc_idx];
-   if (unlikely(desc->len < dev->vhost_hlen))
+   desc_addr = gpa_to_vva(dev, desc->addr);
+   /*
+* Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
+* performance issue with some versions of gcc (4.8.4 and 5.3.0) which
+* otherwise stores offset on the stack instead of in a register.
+*/
+   if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr)
return -1;

-   desc_addr = gpa_to_vva(dev, desc->addr);
rte_prefetch0((void *)(uintptr_t)desc_addr);

virtio_enqueue_offload(m, _hdr.hdr);
@@ -182,7 +187,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
return -1;

desc = >desc[desc->next];
-   desc_addr   = gpa_to_vva(dev, desc->addr);
+   desc_addr = gpa_to_vva(dev, desc->addr);
+   if (unlikely(!desc_addr))
+   return -1;
+
desc_offset = 0;
desc_avail  = desc->len;
}
@@ -387,10 +395,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
struct vhost_virtqueue *vq,
LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
dev->vid, cur_idx, end_idx);

-   if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
+   desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+   if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
return 0;

-   desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
rte_prefetch0((void *)(uintptr_t)desc_addr);

virtio_hdr.num_buffers = end_idx - start_idx;
@@ -425,6 +433,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct 
vhost_virtqueue *vq,

vec_idx++;
desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+   if (unlikely(!desc_addr))
+   return 0;

/* Prefetch buffer address. */
rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -688,6 +698,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
return -1;

desc_addr = gpa_to_vva(dev, desc->addr);
+   if (unlikely(!desc_addr))
+   return -1;
+
hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
rte_prefetch0(hdr);

@@ -701,6 +714,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
desc = >desc[desc->next];

desc_addr = gpa_to_vva(dev, desc->addr);
+   if (unlikely(!desc_addr))
+   return -1;
+
rte_prefetch0((void *)(uintptr_t)desc_addr);

desc_offset = 0;
@@ -737,6 +753,9 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
desc = >desc[desc->next];

desc_addr = gpa_to_vva(dev, desc->addr);
+   if (unlikely(!desc_addr))
+   return -1;
+
rte_prefetch0((void *)(uintptr_t)desc_addr);

desc_offset = 0;
-- 
2.7.4

[dpdk-dev] [PATCH v3 1/2] vhost: fix using of bad return value on mergeable enqueue

2016-07-15 Thread Ilya Maximets

Return value on error changed from '-1' to '0' because it returns
unsigned value and it means number of used descriptors.

Also fixed updating of 'last_used_idx' by using actual number of
used descriptors.

Fixes: 623bc47054d0 ("vhost: do sanity check for ring descriptor length")

Signed-off-by: Ilya Maximets 
---
 lib/librte_vhost/vhost_rxtx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 15ca956..a9b04df 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -388,7 +388,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
dev->vid, cur_idx, end_idx);

if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
-   return -1;
+   return 0;

desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
rte_prefetch0((void *)(uintptr_t)desc_addr);
@@ -507,7 +507,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
*(volatile uint16_t *)>used->idx += nr_used;
vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
sizeof(vq->used->idx));
-   vq->last_used_idx = end;
+   vq->last_used_idx += nr_used;
}

if (likely(pkt_idx)) {
-- 
2.7.4

[dpdk-dev] [PATCH v3 0/2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Ilya Maximets

Version 3:
* Patch splitted in two.
* Applied workaround from Rich Lane and added comment about
  performance issue with some compilers and 'unlikely' macro.

Version 2:
* Rebased on top of current master.
* host's address now checked in meargeable case,
  because needed refactoring already done.
* Commit-message changed because old issue with
  virtio reload accidentially fixed by commit
  0823c1cb0a73.

Ilya Maximets (2):
  vhost: fix using of bad return value on mergeable enqueue
  vhost: do sanity check for ring descriptor address

 lib/librte_vhost/vhost_rxtx.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

-- 
2.7.4

[dpdk-dev] capture packets on VM

2016-07-15 Thread Pattan, Reshma



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Raja Jayapal
> Sent: Friday, July 15, 2016 6:55 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] capture packets on VM
> 
> Hi All,
> 
> I have installed dpdk on VM and would like to know how to capture the packets
> on dpdk ports.
> I am sending traffic from host? and want to know how to confirm whether the
> packets are flowing via dpdk ports.
> I tried with tcpdump and wireshark but could not capture the packets inside 
> VM.
> setup : bridge1(Host)--- VM(Guest with DPDK) - bridge2(Host)
> 

Hi,

On DPDK you can capture packets with app/pdump/ tool. This tool is available 
for use from 16.07RC1. 
What you can do is run testpmd and see if packets are seen in testpmd, that 
confirms if packets are landing on dpdk ports or not.
If you also want to capture packet for analysis, you need to run app/pdump/ 
tool along with testpmd. 
The pdump tool captures the packet to pcap file, so you can use tcpdump -ni 
 to view the packets. 
More about the tool usage can be found under doc/guides/sample_app_ug/pdump.rst

Let me know if you need further help on this.

Thanks,
Reshma

[dpdk-dev] [PATCH v6 12/17] pci: add a helper for device name

2016-07-15 Thread Jan Viktorin

On Fri, 15 Jul 2016 11:56:38 +0200
Thomas Monjalon  wrote:

> 2016-07-15 15:09, Shreyansh jain:
> > On Thursday 14 July 2016 10:25 PM, Jan Viktorin wrote:  
> > > What is meant by "resources" here?  
> > 
> > This has historic context (from earlier version of this patch). 
> > But I could relate the word 'resources' to EAL representation of devices - 
> > whether PCI or Crypto.
> > Or, Resource == Device.  
> 
> We need to define a precise wording, especially for the words
>   - interface
>   - resource
>   - device
>   - driver
>   - module
> Following are my thoughts:
> 
> An interface is a view of a resource through an API (e.g. an ethdev port).
> A resource is a well identified hardware (e.g. a PCI device id).
> A device is a hardware function (e.g. a networking port).
> Note that some PCI NICs have only one PCI device id for several ports
> (Chelsio and Mellanox cases). That's why we need to distinguish device
> and resource.
> 
> A driver is the code handling a device.
> I have read the word module in some patch proposals but I don't really know
> what it refers to. Is it a group of drivers in the same file/directory?

Well, yes, initially there was proposed a kind of module. However, after David
as sent the patchset prepare for rte_device/driver, you can see that there
is no need for such object (probably). A module might be helpful when thinking
about the metadata (as done by Neil Horman). But as far as I know there was no
need for this... So a module has no data abstraction now and I am not going to
make any.

Jan

[dpdk-dev] [PATCH v6 16/17] ethdev: convert to eal hotplug

2016-07-15 Thread Jan Viktorin

On Fri, 15 Jul 2016 16:06:25 +0530
Shreyansh jain  wrote:

> On Thursday 14 July 2016 10:21 PM, Jan Viktorin wrote:
> > On Tue, 12 Jul 2016 11:31:21 +0530
> > Shreyansh Jain  wrote:
> >   
> >> Remove bus logic from ethdev hotplug by using eal for this.
> >>
> >> Current api is preserved:
> >> - the last port that has been created is tracked to return it to the
> >>   application when attaching,
> >> - the internal device name is reused when detaching.
> >>
> >> We can not get rid of ethdev hotplug yet since we still need some mechanism
> >> to inform applications of port creation/removal to substitute for ethdev
> >> hotplug api.
> >>
> >> dev_type field in struct rte_eth_dev and rte_eth_dev_allocate are kept as
> >> is, but this information is not needed anymore and is removed in the 
> >> following
> >> commit.
> >>
> >> Signed-off-by: David Marchand 
> >> Signed-off-by: Shreyansh Jain 
> >> ---
> >>  lib/librte_ether/rte_ethdev.c | 207 
> >> +++---
> >>  1 file changed, 33 insertions(+), 174 deletions(-)
> >>
> >> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> >> index a667012..8d14fd7 100644
> >> --- a/lib/librte_ether/rte_ethdev.c
> >> +++ b/lib/librte_ether/rte_ethdev.c
> >> @@ -72,6 +72,7 @@
> >>  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> >>  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> >>  static struct rte_eth_dev_data *rte_eth_dev_data;
> >> +static uint8_t eth_dev_last_created_port;
> >>  static uint8_t nb_ports;
> >>
> > 
> > [...]
> >   
> >> -
> >>  /* attach the new device, then store port_id of the device */
> >>  int
> >>  rte_eth_dev_attach(const char *devargs, uint8_t *port_id)
> >>  {
> >> -  struct rte_pci_addr addr;
> >>int ret = -1;
> >> +  int current = eth_dev_last_created_port;
> >> +  char *name = NULL;
> >> +  char *args = NULL;
> >>  
> >>if ((devargs == NULL) || (port_id == NULL)) {
> >>ret = -EINVAL;
> >>goto err;
> >>}
> >>  
> >> -  if (eal_parse_pci_DomBDF(devargs, ) == 0) {
> >> -  ret = rte_eth_dev_attach_pdev(, port_id);
> >> -  if (ret < 0)
> >> -  goto err;
> >> -  } else {
> >> -  ret = rte_eth_dev_attach_vdev(devargs, port_id);
> >> -  if (ret < 0)
> >> -  goto err;
> >> +  /* parse devargs, then retrieve device name and args */
> >> +  if (rte_eal_parse_devargs_str(devargs, , ))
> >> +  goto err;
> >> +
> >> +  ret = rte_eal_dev_attach(name, args);
> >> +  if (ret < 0)
> >> +  goto err;
> >> +
> >> +  /* no point looking at eth_dev_last_created_port if no port exists */  
> > 
> > I am not sure about this comment. What is "no point"?
> > 
> > Isn't this also a potential bug? (Like the one below.) How could it
> > happen there is no port after a successful attach?  
> 
> Yes, even searching through code path I couldn't find a positive case where 
> control would reach here without nb_ports>0.
> Though, i am not sure if some rough application attempts to mix-up calls - 
> and that, in my opinion, is not worth checking.
> Should I remove it?

At least a loud log might be helpful. If there is really no path to reach this 
point, I'd put RTE_VERIFY here.

> 
> >   
> >> +  if (!nb_ports) {
> >> +  ret = -1;
> >> +  goto err;
> >> +  }
> >> +  /* if nothing happened, there is a bug here, since some driver told us
> >> +   * it did attach a device, but did not create a port */
> >> +  if (current == eth_dev_last_created_port) {
> >> +  ret = -1;
> >> +  goto err;  
> > 
> > Should we log this? Or call some kind panic?  
> 
> I will place a log because applications should have a chance to ignore a 
> device which cannot be attached for whatever reason.

OK, we just should shout loudly if it means a bug...

[...]

-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] dpdk daily build error on dpdk16.07-rc2

2016-07-15 Thread Jan Mędala

Hello Yu Liu,

I've sent second version of patch to test on icc16 compiler, Yongjie is
helping me to test issues on icc.

  Jan

[dpdk-dev] [PATCH v2] vfio: fix pci_vfio_map_resource

2016-07-15 Thread Burakov, Anatoly

> The offset of the 2nd mmap() when mapping the region after msix_bar
> needs to take region address into consideration as mmap() takes address
> that is resource-relative instead of bar-relative.  This is exposed when
> binding vmxnet3 to vfio-pci.
> 
> Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
> 
> Signed-off-by: Yong Wang 
> Signed-off-by: Ronghua Zhang 
> ---
> v2:
> * Addressed review comment from Dan
> 
>  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> index 46cd683..bf7cf61 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> @@ -431,7 +431,7 @@ pci_vfio_map_resource(struct rte_pci_device *dev)
>   } else {
>   memreg[0].offset = reg.offset;
>   memreg[0].size = table_start;
> - memreg[1].offset = table_end;
> + memreg[1].offset = reg.offset + table_end;
>   memreg[1].size = reg.size - table_end;
> 
>   RTE_LOG(DEBUG, EAL,
> @@ -474,7 +474,8 @@ pci_vfio_map_resource(struct rte_pci_device *dev)
>   /* if there's a second part, try to map it */
>   if (map_addr != MAP_FAILED
>   && memreg[1].offset && memreg[1].size) {
> - void *second_addr =
> RTE_PTR_ADD(bar_addr, memreg[1].offset);
> + void *second_addr =
> RTE_PTR_ADD(bar_addr,
> +
>   memreg[1].offset - reg.offset);
>   map_addr =
> pci_map_resource(second_addr,
>   vfio_dev_fd,
> memreg[1].offset,
>   memreg[1].size,
> --
> 1.9.1

Thomas, I guess we can put it in? It doesn't affect anything outside the stated 
use case, and code wise there's nothing preventing this from being merged 
either. I've done cursory testing with an ixgbe device just in case, nothing 
exploded. So,

Acked-by: Anatoly  Burakov

[dpdk-dev] virtio PMD issue

2016-07-15 Thread Vincent Li

it is fixed in 16.07-rc2 that I checked

Vincent

On Fri, Jul 15, 2016 at 11:48 AM, Harish Patil  
wrote:
> Hi Huawie/Yuanhan,
> I encounter segfault issue in virtio PMD driver when I run any DPDK
> application in the VM. The virtio devices should not have been probed in
> the first place since they are not attached to igb_uio driver (and managed
> by kernel driver).
>
>
> Network devices using DPDK-compatible driver
> 
> :00:09.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio unused=
> :00:0a.0 'FastLinQ QL45000 Series 25GbE Controller' drv=igb_uio unused=
>
> Network devices using kernel driver
> ===
> :00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio
>
>
>
> #0 0x004e5392 in vtpci_init (dev=0xb61970, hw=0x7fff76146600)
> at /home2/hpatil/dpdk/drivers/net/virtio/virtio_pci.c:653
> #1 0x004effc4 in eth_virtio_dev_init (eth_dev=0xacb7a0
> )
> at /home2/hpatil/dpdk/drivers/net/virtio/virtio_ethdev.c:1060
> #2 0x004b730f in rte_eth_dev_init ()
> #3 0x004cb940 in pci_probe_all_drivers ()
> #4 0x004cbe88 in rte_eal_pci_probe ()
> #5 0x004c0336 in rte_eal_init ()
> #6 0x004404e2 in main ()
> (gdb) p dev->devargs
> $1 = (struct rte_devargs *) 0x0
>
>
> Looks like its a known issue based on a old thread:
> http://dpdk.org/dev/patchwork/patch/12462/
>
> Do we know a fix now?
>
> Thanks,
> harish
>
>
>
>

[dpdk-dev] [PATCH] net/enic: remove print of internal driver version

2016-07-15 Thread John Daley

The enic PMD code has diverged from code that was once
shared with the enic kernel mode driver for performance
reasons. It is confusing and misleading to print the
internal version number. Remove it.

Fixes: fefed3d1e62c ("enic: new driver")
Signed-off-by: John Daley 
---
 drivers/net/enic/enic.h  | 1 -
 drivers/net/enic/enic_main.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index a5e2e38..4c16ef1 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -51,7 +51,6 @@

 #define DRV_NAME   "enic_pmd"
 #define DRV_DESCRIPTION"Cisco VIC Ethernet NIC Poll-mode 
Driver"
-#define DRV_VERSION"1.0.0.6"
 #define DRV_COPYRIGHT  "Copyright 2008-2015 Cisco Systems, Inc"

 #define ENIC_WQ_MAX8
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index d8669cc..89afb9f 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1128,7 +1128,7 @@ int enic_probe(struct enic *enic)
struct rte_pci_device *pdev = enic->pdev;
int err = -1;

-   dev_debug(enic, " Initializing ENIC PMD version %s\n", DRV_VERSION);
+   dev_debug(enic, " Initializing ENIC PMD\n");

enic->bar0.vaddr = (void *)pdev->mem_resource[0].addr;
enic->bar0.len = pdev->mem_resource[0].len;
-- 
2.7.0

[dpdk-dev] [PATCH] lib: change rte_ring dequeue to guarantee ordering before tail update

2016-07-15 Thread Ananyev, Konstantin

> 
> Consumer queue dequeuing must be guaranteed to be done fully before the tail 
> is updated. This is not guaranteed with a read barrier,
> changed to a write barrier just before tail update which in practice 
> guarantees correct order of reads and writes.
> 
> Signed-off-by: Juhamatti Kuusisaari 
> ---
>  lib/librte_ring/rte_ring.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h index 
> eb45e41..14920af 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -748,7 +748,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void 
> **obj_table,
> 
> /* copy in table */
> DEQUEUE_PTRS();
> -   rte_smp_rmb();
> +   rte_smp_wmb();
> 
> __RING_STAT_ADD(r, deq_success, n);
> r->cons.tail = cons_next;
> --

Acked-by: Konstantin Ananyev 

> 2.9.0
>

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Thomas Monjalon

2016-07-15 09:54, Damjan Marion:
> So we don?t have much pending beside 2 patches for i40e which 
> Jeff submitted yesterday and they will i guess need to wait for 16.11.

Yes these i40e patches will probably have to wait 16.11.

> Only one which I have on my mind is:
> 
> https://git.fd.io/cgit/vpp/tree/dpdk/dpdk-16.04_patches/0005-Allow-applications-to-override-rte_delay_us.patch
> 
> This is big issue for us when running single-core, as some
> drivers tend to call rte_delay_us for a long time, and that is 
> causing packet drops. I.e. if you do stop/start on one interface
> and you are running BFD on another one, BFD will fail?
> 
> Current patch is hack, it basically allows us to override 
> delay function so we can de-schedule it,
> do some other useful work while waiting for delay to finish
> and then give control back to original function?
> 
> Maybe we can fix this by providing a delay callback functionality...

Yes it could be an idea.
Please send a RFC patch.

[dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location

2016-07-15 Thread Jerin Jacob

On Thu, Jul 14, 2016 at 12:56:11PM +, Ananyev, Konstantin wrote:
> 
> 
> > The CPU also
> > knows already the value that will be written to cons.tail and that value 
> > does not depend on the previous read either. The CPU does not
> > know we are planning to do a spinlock there, so it might do things 
> > out-of-order without proper dependencies.
> > 
> > > For  __rte_ring_sc_do_dequeue(), I think you right, we might need
> > > something stronger.
> > > I don't want to put rte_smp_mb() here as it would cause full HW
> > > barrier even on machines with strong memory order (IA).
> > > I think that rte_smp_wmb() might be enough here:
> > > it would force cpu to wait till writes in DEQUEUE_PTRS() are become
> > > visible, which means reads have to be completed too.
> > 
> > In practice I think that rte_smp_wmb() would work fine, even though it is 
> > not strictly according to the book. Below solution would be my
> > proposal as a fix to the issue of sc dequeueing (and also to mc dequeueing, 
> > if we have the problem of CPU completely ignoring the spinlock
> > in reality there):
> > 
> > DEQUEUE_PTRS();
> > ..
> > rte_smp_wmb();
> > r->cons.tail = cons_next;
> 
> As I said in previous email - it looks good for me for 
> _rte_ring_sc_do_dequeue(),
> but I am interested to hear what  ARM and PPC maintainers think about it.
> Jan, Jerin do you have any comments on it?

Actually it is NOT performance effective and difficult to capture the
ORDER dependency with plane store and load barriers on WEAK ordered machines.
Beyond plane store and load barriers, We need to express  #LoadLoad,
#LoadStore,#StoreStore barrier dependency with Acquire and Release
Semantics in Arch neutral code(Looks like this is compiler barrier on IA)
http://preshing.com/20120913/acquire-and-release-semantics/

For instance, Full barrier CAS(__sync_bool_compare_and_swap) will not be
required for weak ordered machine in MP case.
I can send out a RFC version of ring implementation changes required
with acquire-and-release-semantics.
If it has performance degradation on IA then we can separate it out
through conditional compilation flag.

GCC Built-in Functions for Memory Model Aware Atomic Operations
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Thoughts ?

Jerin

> Chao, sorry but I still not sure why PPC is considered as architecture with 
> strong memory ordering?
> Might be I am missing something obvious here.
> Thank
> Konstantin
>

[dpdk-dev] [PATCH v6 12/17] pci: add a helper for device name

2016-07-15 Thread Jan Viktorin

On Fri, 15 Jul 2016 15:09:58 +0530
Shreyansh jain  wrote:

> On Thursday 14 July 2016 10:25 PM, Jan Viktorin wrote:
> > On Tue, 12 Jul 2016 11:31:17 +0530
> > Shreyansh Jain  wrote:
> >   
> >> eal is a better place than crypto / ethdev for naming resources.  
> > 
> > s/for naming/to name/  
> 
> OK.
> 
> > 
> > What is meant by "resources" here?  
> 
> This has historic context (from earlier version of this patch). 
> But I could relate the word 'resources' to EAL representation of devices - 
> whether PCI or Crypto.
> Or, Resource == Device.

If it is possible I'd like more "device". But I think it's not that critical 
thing...

> 
> >   
> >> Add a helper in eal and make use of it in crypto / ethdev.

[...]

> >>  TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked 
> >> Q. */
> >> @@ -95,6 +96,7 @@ const char *pci_get_sysfs_path(void);
> >>  
> >>  /** Formatting string for PCI device identifier: Ex: :00:01.0 */
> >>  #define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
> >> +#define PCI_PRI_STR_SIZE sizeof(":XX:XX.X")
> >>  
> >>  /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 
> >> */
> >>  #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
> >> @@ -308,6 +310,29 @@ eal_parse_pci_DomBDF(const char *input, struct 
> >> rte_pci_addr *dev_addr)
> >>  }
> >>  #undef GET_PCIADDR_FIELD
> >>  
> >> +/**
> >> + * Utility function to write a pci device name, this device name can 
> >> later be
> >> + * used to retrieve the corresponding rte_pci_addr using above functions. 
> >>  
> > 
> > What about saying "using functions eal_parse_pci_*BDF"? The
> > specification "above" is quite uncertain...  
> 
> Agree that 'above' is positional word and should be avoided.
> I will change that to "... using eal_parse_pci_* BDF helpers". OK?

OK.

[...]


-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

[dpdk-dev] [PATCH v6 12/17] pci: add a helper for device name

2016-07-15 Thread Thomas Monjalon

2016-07-15 15:09, Shreyansh jain:
> On Thursday 14 July 2016 10:25 PM, Jan Viktorin wrote:
> > What is meant by "resources" here?
> 
> This has historic context (from earlier version of this patch). 
> But I could relate the word 'resources' to EAL representation of devices - 
> whether PCI or Crypto.
> Or, Resource == Device.

We need to define a precise wording, especially for the words
- interface
- resource
- device
- driver
- module
Following are my thoughts:

An interface is a view of a resource through an API (e.g. an ethdev port).
A resource is a well identified hardware (e.g. a PCI device id).
A device is a hardware function (e.g. a networking port).
Note that some PCI NICs have only one PCI device id for several ports
(Chelsio and Mellanox cases). That's why we need to distinguish device
and resource.

A driver is the code handling a device.
I have read the word module in some patch proposals but I don't really know
what it refers to. Is it a group of drivers in the same file/directory?

[dpdk-dev] [PATCH v3 1/2] doc: live migration of VM with Virtio and VF

2016-07-15 Thread Iremonger, Bernard

Hi John,



> > Subject: [PATCH v3 1/2] doc: live migration of VM with Virtio and VF
> >
> > This patch describes the procedure to be be followed to perform Live
> > Migration of a VM with Virtio and VF PMD's using the bonding PMD.
> >
> > It includes sample host and VM scripts used in the procedure, and a
> > sample switch configuration.
> 
> Hi Bernard,
> 
> Thanks for the doc. It is a complicated process so it is good to have this
> detailed how-to. Some comments below.
> 
> 
> > +
> > +Live Migration of VM with SR-IOV VF:
> > +
> 
> 
> I wouldn't include a colon in the heading. Something like this may be better:

Ok, I will remove colons from headings.

>  Live Migration of a VM with SR-IOV VF
>  =
> 
> 
> > +
> > +Live Migration overview for VM with Virtio and VF PMD's:
> > +
> 
> 
> Also, I wouldn't include the "Live Migration ... for VM with Virtio ..."
> in each heading. The context should be clear from the main sections of the
> doc. I this case I would just call the heading "Overview"

Ok, will change.

> 
> > +
> > +It is not possible to migrate a Virtual Machine which has an SR-IOV
> > Virtual Function.
> 
> Add (VF) in brackets at the first mention of Virtual Function here. Also for 
> PF
> below.

Ok, I will add (VF) and (PF)

> 
> > +To get around this problem the bonding PMD is used.
> 
> I would add to the end of this paragraph:
> 
> The following sections show an example of how to do this.

Ok, will add.
> 
> 
> Also I think the "Test Setup" header from below should be moved up to this
> point.
> 
> > +
> > +A bonded device is created in the VM.
> > +The virtio and VF PMD's are added as slaves to the bonded device.
> > +The VF is set as the primary slave of the bonded device.
> > +
> > +A bridge must be set up on the Host connecting the tap device, which
> > +is the backend of the Virtio device and the Physical Function device.
> > +
> > +To test the Live Migration two servers with identical operating
> > +systems
> > installed are used.
> > +KVM and Qemu 2.3 is also required on the servers.
> > +
> > +The servers have Niantic and or Fortville NIC's installed.
> > +The NIC's on both servers are connected to a switch which is also
> > +connected to the traffic generator.
> 
> I would prefix this paragraph with: "In this example, the servers have Niantic
> and .."

Ok, will do.
> 
> 
> > +The switch is configured to broadcast traffic on all the NIC ports.
> > +
> > +Live Migration with SR-IOV VF test setup:
> > +-
> 
> Just "Test Setup" would be better. And move up, I think.

Ok, will do.
> 
> 
> > +
> > +Live Migration steps for VM with Virtio and VF PMD's:
> > +-
> 
> Again "Live Migration steps" would be fine.

Ok, will change.

> 
> 
> Also I would change this paragraph to link forward to the scripts like:
> 
> 
> The sample scripts mentioned in the steps below can be found in the
> :ref:`lm_bond_virtio_sriov_host_scripts`
> and :ref:`lm_bond_virtio_sriov_host_scripts` sections.
> 
> 
> And then put the targets in the relevant sections of the docs.
> 
> .. _lm_bond_virtio_sriov_host_scripts:
> 
> .. _lm_bond_virtio_sriov_vm_scripts:
> 
> 
Ok will do.

> > +
> > +The host is running the Kernel Physical Function driver (ixgbe or i40e).
> > +
> > +The ip address of host_server_1 is 10.237.212.46
> > +
> > +The ip address of host_server_2 is 10.237.212.131
> > +
> > +The sample scripts mentioned in the steps below can be found in the
> > +host_scripts and vm_scripts sections.
> > +
> > +On host_server_1: Terminal 1
> > +
> 
> The documentation guidelines say to use ~~ as the third level heading
> underline. It doesn't really matter since the RST parsers figure it out.
> Up to you if you want to change it.

I will leave as is.

> 
> http://dpdk.org/doc/guides/contributing/documentation.html#section-
> headers
> 
> > +The following command only works with kernel PF for Niantic
> > +
> > +.. code-block:: console
> > +
> > +testpmd> mac_addr add port 1 vf 0 AA:BB:CC:DD:EE:FF
> > +
> > +create bonded device(mode) (socket)
> 
> Should be a space after device.

Ok, will add a space.

> Also these lines describing the state or state
> change should be written have capital letters and full stops so they read like
> sentences. Also, if describing a state change you could add "now" so the
> reader is aware of the change. For example:
> 
>   - primary is port 1, 2 active slaves
>   + Primary is now port1. There are 2 active slaves.

Ok, will change.

> 
> > +
> > +Setup Virtual Machine on host_server_1
> > +
> > +.. code-block:: sh
> > +
> > +  #!/bin/sh
> > +
> 
> 
> I thought that the code block had to be indented 3 spaces but the RST
> parsers didn't complain. Might be worth changing them anyway to a
> consistent 3

[dpdk-dev] capture packets on VM

2016-07-15 Thread Raja Jayapal

Hi All,

I have installed dpdk on VM and would like to know how to capture the packets 
on dpdk ports.
I am sending traffic from host? and want to know how to confirm whether the 
packets are flowing via dpdk ports.
I tried with tcpdump and wireshark but could not capture the packets inside VM.
setup : bridge1(Host)--- VM(Guest with DPDK) - bridge2(Host)

Please suggest.

Thanks,
Raja

=-=-=
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you

[dpdk-dev] [PATCH v2 0/6] enable lpm, acl and other missing libraries in ppc64le

2016-07-15 Thread Chao Zhu

Gowrishankar,

When I tried the patches, I got some compilation error:

In file included from
/root/test/sub/dpdk/lib/librte_acl/acl_run_altivec.c:34:0:
/root/test/sub/dpdk/lib/librte_acl/acl_run_altivec.h: In function
'transition4':
/root/test/sub/dpdk/lib/librte_acl/acl_run_altivec.h:198:2: error:
dereferencing type-punned pointer will break strict-aliasing rules
[-Werror=strict-aliasing]
  *indices1 = (xmm_t){((uint32_t *))[0], ((uint32_t *))[1],
  ^
/root/test/sub/dpdk/lib/librte_acl/acl_run_altivec.h:202:2: error:
dereferencing type-punned pointer will break strict-aliasing rules
[-Werror=strict-aliasing]
  *indices2 = (xmm_t){((uint32_t *))[0], ((uint32_t *))[1],

Can you help to take a look?


-Original Message-
From: Gowrishankar [mailto:gowrishanka...@linux.vnet.ibm.com] 
Sent: 2016?7?10? 15:51
To: dev at dpdk.org
Cc: Chao Zhu ; Bruce Richardson
; Konstantin Ananyev
; Thomas Monjalon ;
Cristian Dumitrescu ; Pradeep
; gowrishankar 
Subject: [PATCH v2 0/6] enable lpm, acl and other missing libraries in
ppc64le

From: gowrishankar 

This patchset enables LPM, ACL and other few missing libs in ppc64le and
also address few patches in related examples (ip_pipeline and l3fwd).

Test report:
LPM and ACL unit tests verified as in patch set v1.
Same results as before observed.

v2 changes:
- enabling libs in config included as part of lib changes itself.

gowrishankar (6):
  lpm: add altivec intrinsics for dpdk lpm on ppc_64
  acl: add altivec intrinsics for dpdk acl on ppc_64
  ip_pipeline: fix lcore mapping for varying SMT threads as in ppc64
  table: cache align rte_bucket_4_8
  sched: enable sched library for ppc64le
  l3fwd: add altivec support for em_hash_key

 app/test-acl/main.c|   4 +
 app/test/test_xmmt_ops.h   |  16 +
 config/defconfig_ppc_64-power8-linuxapp-gcc|   7 -
 examples/ip_pipeline/cpu_core_map.c|  12 +-
 examples/ip_pipeline/init.c|   4 +
 examples/l3fwd/l3fwd_em.c  |   8 +
 lib/librte_acl/Makefile|   2 +
 lib/librte_acl/acl.h   |   4 +
 lib/librte_acl/acl_run.h   |   2 +
 lib/librte_acl/acl_run_altivec.c   |  47 +++
 lib/librte_acl/acl_run_altivec.h   | 328
+
 lib/librte_acl/rte_acl.c   |  13 +
 lib/librte_acl/rte_acl.h   |   1 +
 .../common/include/arch/ppc_64/rte_vect.h  |  60 
 lib/librte_lpm/Makefile|   2 +
 lib/librte_lpm/rte_lpm.h   |   2 +
 lib/librte_lpm/rte_lpm_altivec.h   | 154 ++
 lib/librte_table/rte_table_hash_key8.c |   2 +-
 18 files changed, 649 insertions(+), 19 deletions(-)  create mode 100644
lib/librte_acl/acl_run_altivec.c  create mode 100644
lib/librte_acl/acl_run_altivec.h  create mode 100644
lib/librte_eal/common/include/arch/ppc_64/rte_vect.h
 create mode 100644 lib/librte_lpm/rte_lpm_altivec.h

--
1.9.1

[dpdk-dev] [RFC] remove vhost-cuse

2016-07-15 Thread Thomas Monjalon

2016-07-15 16:42, Yuanhan Liu:
> You see that no body cares it :)
> 
> So I will make a patch to mark vhost-cuse as deprecated shortly.
> Thomas, works to you?

Perfect. Thanks Yuanhan


> On Mon, Jul 11, 2016 at 11:59:55AM +0800, Yuanhan Liu wrote:
> > It's something echoed around in my mind for a long while, and here I'm
> > gonna make it public: a proposal to remove vhost-cuse.
> > 
> > Vhost-cuse was invented before vhost-user exist. The both are actually
> > doing the same thing: a vhost-net implementation in user space. But they
> > are not exactly the same thing.
> > 
> > Firstly, vhost-cuse is harder for use; no one seems to care it, either.
> > Furthermore, since v2.1, a large majority of development effort has gone
> > to vhost-user. For example, we extended the vhost-user spec to add the
> > multiple queue support. We also added the vhost-user live migration at
> > v16.04 and the latest one, vhost-user reconnect that allows vhost app
> > restart without restarting the guest. Both of them are very important
> > features for product usage and none of them works for vhost-cuse.
> > 
> > You now see that the difference between vhost-user and vhost-cuse is
> > big (and will be bigger and bigger as time moves forward), that you
> > should never use vhost-cuse, that we should drop it completely.
> > 
> > The remove would also result to a much cleaner code base, allowing us
> > to do all kinds of extending easier.
> > 
> > A talk with Huawei offline showed that he backs this proposal. I was
> > also told by Ciara that she actually had the same idea: she has already
> > cooked a patch to remove vhost-cuse support from OVS:
> > 
> > http://openvswitch.org/pipermail/dev/2016-July/074696.html
> > 
> > So I'm proposing to mark vhost-cuse as deprecated in this release and
> > remove it completely at the next release (v16.11).
> > 
> > Comments/thoughts, or objections?
> > 
> > --yliu

[dpdk-dev] dpdk daily build error on dpdk16.07-rc2

2016-07-15 Thread Liu, Yu Y

Hi Jan,

Any update on ena related error?

Thanks,
Yu Liu

From: Gu, YongjieX
Sent: Thursday, July 14, 2016 5:23 PM
To: thomas.monjalon at 6wind.com; Richardson, Bruce ; Yigit, Ferruh ; Gonzalez Monroy, Sergio 
; Jan Medala ; dev at 
dpdk.org
Cc: Cao, Waterman ; Liu, Yu Y ; Chen, WeichunX ; Xu, HuilongX 
; Gu, YongjieX 
Subject: dpdk daily build error on dpdk16.07-rc2

Hi,All,
  There are 4 build errors reported by our dpdk daily report,and can you take a 
look on these issues.
  Jan is working on ena issues,and I will help to test it.

OS, Kernel , GCC , ICC

Affected Targets

Build errors

comments

FC18_64,3.6.10-4,4.7.2,14.0.0

x86_64-ivshmem-linuxapp-gcc

CC test_kvargs.o

UBT124_64,3.8.0-29,4.6.3,14.0.0

  LD test

SUSE11SP2_64,3.0.13-0,4.5.1,14.0.0

/jenkins/workspace/DPDK_AUTO_IDT_VM_FC18_64_BUILD2/DPDK/x86_64-ivshmem-linuxapp-gcc/lib/librte_eal.a(eal_alarm.o):
 In function `eal_alarm_callback':


eal_alarm.c:(.text+0xd7): undefined reference to `clock_gettime'


/jenkins/workspace/DPDK_AUTO_IDT_VM_FC18_64_BUILD2/DPDK/x86_64-ivshmem-linuxapp-gcc/lib/librte_eal.a(eal_alarm.o):
 In function `rte_eal_alarm_set':


eal_alarm.c:(.text+0x20f): undefined reference to `clock_gettime'


/jenkins/workspace/DPDK_AUTO_IDT_VM_FC18_64_BUILD2/DPDK/x86_64-ivshmem-linuxapp-gcc/lib/librte_eal.a(eal_timer.o):
 In function `get_tsc_freq':


eal_timer.c:(.text+0x108): undefined reference to `clock_gettime'


eal_timer.c:(.text+0x146): undefined reference to `clock_gettime'


collect2: error: ld returned 1 exit status


make[5]: *** [test] Error 1


make[4]: *** [test] Error 2


make[3]: *** [app] Error 2


make[2]: *** [all] Error 2


make[1]: *** [pre_install] Error 2


make: *** [install] Error 2



SUSE11SP2_64,3.0.13-0,4.5.1,14.0.0

x86_64-native-linuxapp-gcc-examples

== ipsec-secgw


  CC ipsec.o


  CC esp.o


  CC sp4.o


  CC sp6.o


  CC sa.o


/jenkins/workspace/DPDK_AUTO_IDT_VM_SUSE11SP2_64_BUILD/DPDK/examples/ipsec-secgw/sa.c:56:2:
 error: unknown field 'ip4' specified in initializer


compilation terminated due to -Wfatal-errors.


make[4]: *** [sa.o] Error 1


make[3]: *** [all] Error 2


make[2]: *** [ipsec-secgw] Error 2


make[1]: *** [x86_64-native-linuxapp-gcc_examples] Error 2


make: *** [examples] Error 2



x86_64-native-linuxapp-icc-examples

== ipsec-secgw




x86_64-ivshmem-linuxapp-icc-examples

  CC ipsec.o




  CC esp.o




  CC sp4.o




  CC sp6.o




  CC sa.o




icc: command line warning #10158: ignoring option '-diag-disable'; argument 
must be separate




/jenkins/workspace/DPDK_AUTO_IDT_VM_SUSE11SP2_64_BUILD/DPDK/examples/ipsec-secgw/sa.c(56):
 error: a designator for an anonymous union member can only appear within 
braces corresponding to that anonymous union




.src.ip4 = IPv4(172, 16, 1, 5),




 ^







compilation aborted for 
/jenkins/workspace/DPDK_AUTO_IDT_VM_SUSE11SP2_64_BUILD/DPDK/examples/ipsec-secgw/sa.c
 (code 4)




make[4]: *** [sa.o] Error 4




make[3]: *** [all] Error 2




make[2]: *** [ipsec-secgw] Error 2




make[1]: *** [x86_64-native-linuxapp-icc_examples] Error 2




make: *** [examples] Error 2





UBT144_32,3.13.0-30,4.8.2,14.0.0

i686-native-linuxapp-icc

== Build drivers/net/ena






  CC ena_ethdev.o






  PMDINFO ena_ethdev.o.pmd.c






  CC ena_ethdev.o.pmd.o






  LD ena_ethdev.o






  CC ena_com.o






/jenkins/workspace/DPDK_AUTO_IDT_VM_UBT144_32_BUILD/DPDK/drivers/net/ena/base/ena_com.c(346):
 error #3656: variable "dev_node" may be used before its value is set






ENA_MEM_ALLOC_COHERENT_NODE(ena_dev->dmadev,






^









compilation aborted for 
/jenkins/workspace/DPDK_AUTO_IDT_VM_UBT144_32_BUILD/DPDK/drivers/net/ena/base/ena_com.c
 (code 4)






make[6]: *** [ena_com.o] Error 4






make[5]: *** [ena] Error 2






make[4]: *** [net] Error 2






make[3]: *** [drivers] Error 2






make[2]: *** [all] Error 2






make[1]: *** [pre_install] Error 2






make: *** [install] Error 2






freebsd10.3,10.3-RELEASE,4.8.5

x86_64-native-linuxapp-gcc

sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command


sed: 1: "/usr/home/xugang/dpdk-1 ...": extra characters at the end of h command

[dpdk-dev] [PATCH 2/2] doc: add vhost_user live migration image

2016-07-15 Thread Mcnamara, John

> -Original Message-
> From: Iremonger, Bernard
> Sent: Monday, July 11, 2016 4:05 PM
> To: Mcnamara, John ; dev at dpdk.org
> Cc: Liu, Yong ; Xu, Qian Q ;
> yuanhan.liu at linux.intel.com; Iremonger, Bernard
> 
> Subject: [PATCH 2/2] doc: add vhost_user live migration image
> 
> This patch adds an image of the Live Migration of a VM using vhost_user on
> the host, test configuration.
> 
> Signed-off-by: Bernard Iremonger 

Not my choice of colours. :-) Nevertheless:

Acked-by: John McNamara

[dpdk-dev] [PATCH v3 1/2] doc: live migration of VM with Virtio and VF

2016-07-15 Thread Mcnamara, John

> -Original Message-
> From: Iremonger, Bernard
> Sent: Thursday, July 7, 2016 11:43 AM
> To: Mcnamara, John ; dev at dpdk.org
> Cc: Liu, Yong ; Xu, Qian Q ;
> Iremonger, Bernard 
> Subject: [PATCH v3 1/2] doc: live migration of VM with Virtio and VF
> 
> This patch describes the procedure to be be followed to perform Live
> Migration of a VM with Virtio and VF PMD's using the bonding PMD.
> 
> It includes sample host and VM scripts used in the procedure, and a sample
> switch configuration.

Hi Bernard,

Thanks for the doc. It is a complicated process so it is good to have this
detailed how-to. Some comments below.


> +
> +Live Migration of VM with SR-IOV VF:
> +


I wouldn't include a colon in the heading. Something like this may be better:

 Live Migration of a VM with SR-IOV VF
 =


> +
> +Live Migration overview for VM with Virtio and VF PMD's:
> +


Also, I wouldn't include the "Live Migration ... for VM with Virtio ..."
in each heading. The context should be clear from the main sections of
the doc. I this case I would just call the heading "Overview"

> +
> +It is not possible to migrate a Virtual Machine which has an SR-IOV
> Virtual Function.

Add (VF) in brackets at the first mention of Virtual Function here. Also
for PF below.

> +To get around this problem the bonding PMD is used.

I would add to the end of this paragraph:

The following sections show an example of how to do this.


Also I think the "Test Setup" header from below should be moved up to this 
point.

> +
> +A bonded device is created in the VM.
> +The virtio and VF PMD's are added as slaves to the bonded device.
> +The VF is set as the primary slave of the bonded device.
> +
> +A bridge must be set up on the Host connecting the tap device, which is
> +the backend of the Virtio device and the Physical Function device.
> +
> +To test the Live Migration two servers with identical operating systems
> installed are used.
> +KVM and Qemu 2.3 is also required on the servers.
> +
> +The servers have Niantic and or Fortville NIC's installed.
> +The NIC's on both servers are connected to a switch which is also
> +connected to the traffic generator.

I would prefix this paragraph with: "In this example, the servers have
Niantic and .."


> +The switch is configured to broadcast traffic on all the NIC ports.
> +
> +Live Migration with SR-IOV VF test setup:
> +-

Just "Test Setup" would be better. And move up, I think.


> +
> +Live Migration steps for VM with Virtio and VF PMD's:
> +-

Again "Live Migration steps" would be fine.


Also I would change this paragraph to link forward to the scripts like:


The sample scripts mentioned in the steps below can be found in the
:ref:`lm_bond_virtio_sriov_host_scripts`
and :ref:`lm_bond_virtio_sriov_host_scripts` sections.


And then put the targets in the relevant sections of the docs.

.. _lm_bond_virtio_sriov_host_scripts:

.. _lm_bond_virtio_sriov_vm_scripts:


> +
> +The host is running the Kernel Physical Function driver (ixgbe or i40e).
> +
> +The ip address of host_server_1 is 10.237.212.46
> +
> +The ip address of host_server_2 is 10.237.212.131
> +
> +The sample scripts mentioned in the steps below can be found in the
> +host_scripts and vm_scripts sections.
> +
> +On host_server_1: Terminal 1
> +

The documentation guidelines say to use ~~ as the third level heading
underline. It doesn't really matter since the RST parsers figure it out.
Up to you if you want to change it.

http://dpdk.org/doc/guides/contributing/documentation.html#section-headers

> +The following command only works with kernel PF for Niantic
> +
> +.. code-block:: console
> +
> +testpmd> mac_addr add port 1 vf 0 AA:BB:CC:DD:EE:FF
> +
> +create bonded device(mode) (socket)

Should be a space after device. Also these lines describing the state
or state change should be written have capital letters and full stops
so they read like sentences. Also, if describing a state change you
could add "now" so the reader is aware of the change. For example:

  - primary is port 1, 2 active slaves
  + Primary is now port1. There are 2 active slaves.

> +
> +Setup Virtual Machine on host_server_1
> +
> +.. code-block:: sh
> +
> +  #!/bin/sh
> +


I thought that the code block had to be indented 3 spaces but the RST
parsers didn't complain. Might be worth changing them anyway to a
consistent 3 or 4 space indent.

John

[dpdk-dev] [PATCH] mk: fix default rule of test subdirectory

2016-07-15 Thread Pattan, Reshma



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, July 12, 2016 11:22 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] mk: fix default rule of test subdirectory
> 
> When using "make -C app/test" (with RTE_SDK/RTE_TARGET adjusted) without
> specifying the rule (all), the build is not done.
> Indeed the default rule is not "all" anymore since there are some rules added 
> for
> external resources link.
> 
> It is fixed by adding a reference to "all" at the top of the file which makes 
> it the
> default rule.
> 
> Note that make app/test_sub (without environment variable) is preffered.
> 
> Fixes: ab64f5df8004 ("app/test: support resources externally linked")
> 
> Reported-by: Reshma Pattan 
> Signed-off-by: Thomas Monjalon 

Acked-by: Reshma Pattan

[dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location

2016-07-15 Thread Ananyev, Konstantin

Hi Jerin,

> >
> >
> > > The CPU also
> > > knows already the value that will be written to cons.tail and that
> > > value does not depend on the previous read either. The CPU does not know 
> > > we are planning to do a spinlock there, so it might do things
> out-of-order without proper dependencies.
> > >
> > > > For  __rte_ring_sc_do_dequeue(), I think you right, we might need
> > > > something stronger.
> > > > I don't want to put rte_smp_mb() here as it would cause full HW
> > > > barrier even on machines with strong memory order (IA).
> > > > I think that rte_smp_wmb() might be enough here:
> > > > it would force cpu to wait till writes in DEQUEUE_PTRS() are
> > > > become visible, which means reads have to be completed too.
> > >
> > > In practice I think that rte_smp_wmb() would work fine, even though
> > > it is not strictly according to the book. Below solution would be my
> > > proposal as a fix to the issue of sc dequeueing (and also to mc 
> > > dequeueing, if we have the problem of CPU completely ignoring the
> spinlock in reality there):
> > >
> > > DEQUEUE_PTRS();
> > > ..
> > > rte_smp_wmb();
> > > r->cons.tail = cons_next;
> >
> > As I said in previous email - it looks good for me for
> > _rte_ring_sc_do_dequeue(), but I am interested to hear what  ARM and PPC 
> > maintainers think about it.
> > Jan, Jerin do you have any comments on it?
> 
> Actually it is NOT performance effective and difficult to capture the ORDER 
> dependency with plane store and load barriers on WEAK
> ordered machines.
> Beyond plane store and load barriers, We need to express  #LoadLoad, 
> #LoadStore,#StoreStore barrier dependency with Acquire and
> Release Semantics in Arch neutral code(Looks like this is compiler barrier on 
> IA) http://preshing.com/20120913/acquire-and-release-
> semantics/
> 
> For instance, Full barrier CAS(__sync_bool_compare_and_swap) will not be 
> required for weak ordered machine in MP case.
> I can send out a RFC version of ring implementation changes required with 
> acquire-and-release-semantics.
> If it has performance degradation on IA then we can separate it out through 
> conditional compilation flag.
> 
> GCC Built-in Functions for Memory Model Aware Atomic Operations 
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

I am not sure what exactly changes you are planning,
but I suppose I'd just wait for your RFC here.
Though my question was: what do you think about current 
_rte_ring_sc_do_dequeue()? 
Do you agree that rmb() is not sufficient here and does Juhamatti patch:
http://dpdk.org/dev/patchwork/patch/14846/
looks good to you?
It looks good to me ,and I am going to ACK it, but thought you'd better
have a look too. 
Thanks
Konstantin


> 
> Thoughts ?
> 
> Jerin
> 
> > Chao, sorry but I still not sure why PPC is considered as architecture with 
> > strong memory ordering?
> > Might be I am missing something obvious here.
> > Thank
> > Konstantin
> >

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Thomas Monjalon

2016-07-14 22:20, Damjan Marion:
> 
> > On 15 Jul 2016, at 00:06, Thomas Monjalon  
> > wrote:
> > 
> > 2016-07-14 18:10, Damjan Marion:
> >> Dear Jan,
> >> 
> >> Thank you for your comments. A bit too much overhead to submit simple patch
> >> so let?s forget about it. I will just add it as it is to our private
> >> collection of patches.
> > 
> > These are changes trivial to fix when applying.
> > I strongly prefer that you upstream patches instead of keeping patches
> > in the VPP repository. I will help you in this task.
> > Thanks for the effort.
> 
> Yeah, I agree with you, unfortunately it is all about time,
> for me it is still cheaper to rebase them.
> 
> I respect your rules, but I just don?t have enough free cycles
> to spend learning all of them, for my occasional patch submission.

We know there is a learning curve for the submission process.
That's why we do not expect it to be fully satisfied, especially
from occasional contributors.
I am used to do some not significant changes before applying to
help newcomers patches being accepted. That's what I will do in
your case.
As I said previously I will help you to drop your local patches.

Please continue sending other patches with a detailed explanation
and we will take care of them.

[dpdk-dev] [PATCH] EAL:fix memory barrier implementation on IBM POWER

2016-07-15 Thread Chao Zhu

On weak memory order architecture like POWER, rte_smp_wmb/rte_smp_rmb
need to use CPU instructions, not compiler barrier. This patch fixes
this. Also, to improve performance on PPC64, use light weight sync
instruction instead of sync instruction.

Signed-off-by: Chao Zhu 
---
 .../common/include/arch/ppc_64/rte_atomic.h|   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h 
b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
index feae486..924e894 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -62,7 +62,11 @@ extern "C" {
  * Guarantees that the STORE operations generated before the barrier
  * occur before the STORE operations generated after.
  */
+#ifdef RTE_ARCH_64
+#definerte_wmb() {asm volatile("lwsync" : : : "memory"); }
+#else
 #definerte_wmb() {asm volatile("sync" : : : "memory"); }
+#endif

 /**
  * Read memory barrier.
@@ -70,13 +74,17 @@ extern "C" {
  * Guarantees that the LOAD operations generated before the barrier
  * occur before the LOAD operations generated after.
  */
+#ifdef RTE_ARCH_64
+#definerte_rmb() {asm volatile("lwsync" : : : "memory"); }
+#else
 #definerte_rmb() {asm volatile("sync" : : : "memory"); }
+#endif

 #define rte_smp_mb() rte_mb()

-#define rte_smp_wmb() rte_compiler_barrier()
+#define rte_smp_wmb() rte_wmb()

-#define rte_smp_rmb() rte_compiler_barrier()
+#define rte_smp_rmb() rte_rmb()

 /*- 16 bit atomic operations 
-*/
 /* To be compatible with Power7, use GCC built-in functions for 16 bit
-- 
1.7.1

[dpdk-dev] [PATCH v2] vhost: fix segfault on bad descriptor address

2016-07-15 Thread Ilya Maximets

On 15.07.2016 09:17, Yuanhan Liu wrote:
> On Thu, Jul 14, 2016 at 11:18:39AM +0300, Ilya Maximets wrote:
>> In current implementation vhost will crash with segmentation fault
>> if malicious or buggy virtio application breaks addresses of descriptors.
>>
>> Before commit 0823c1cb0a73 this crash was reproducible even with
>> normal DPDK application that tries to change number of virtqueues
>> dynamically inside VM.
>>
>> Fix that by checking addresses of descriptors before using.
>>
>> Also fixed return value on error for 'copy_mbuf_to_desc_mergeable()'
>> from '-1' to '0' because it returns unsigned value and it means
>> number of used descriptors.
> 
> Yeah, that's a good fix. Thanks.
> 
> Maybe you'd better make it a standalone patch.

Ok. Maybe I should split this patch in two:
1. Fix return value + using of this value (vq->last_used_idx += nr_used;)
2. Check addresses of descriptors.
What do you think?

>> Signed-off-by: Ilya Maximets 
>> ---
>> Version 2:
>>  * Rebased on top of current master.
>>  * host's address now checked in meargeable case,
>>because needed refactoring already done.
>>  * Commit-message changed because old issue with
>>virtio reload accidentially fixed by commit
>>0823c1cb0a73.
>>
>>  lib/librte_vhost/vhost_rxtx.c | 28 +---
>>  1 file changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>> index 15ca956..31e8b58 100644
>> --- a/lib/librte_vhost/vhost_rxtx.c
>> +++ b/lib/librte_vhost/vhost_rxtx.c
>> @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
>> vhost_virtqueue *vq,
>>  struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>>  
>>  desc = >desc[desc_idx];
>> -if (unlikely(desc->len < dev->vhost_hlen))
>> +desc_addr = gpa_to_vva(dev, desc->addr);
>> +if (unlikely(desc->len < dev->vhost_hlen || !desc_addr))
>>  return -1;
> 
> So, you discards the workaround from Rich?

I can apply it, if you wish. Should I?

>>  
>> -desc_addr = gpa_to_vva(dev, desc->addr);
>>  rte_prefetch0((void *)(uintptr_t)desc_addr);
>>  
>>  virtio_enqueue_offload(m, _hdr.hdr);
>> @@ -182,7 +182,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct 
>> vhost_virtqueue *vq,
>>  return -1;
>>  
>>  desc = >desc[desc->next];
>> -desc_addr   = gpa_to_vva(dev, desc->addr);
>> +desc_addr = gpa_to_vva(dev, desc->addr);
>> +if (unlikely(!desc_addr))
>> +return -1;
>> +
>>  desc_offset = 0;
>>  desc_avail  = desc->len;
>>  }
>> @@ -387,10 +390,10 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
>> struct vhost_virtqueue *vq,
>>  LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
>>  dev->vid, cur_idx, end_idx);
>>  
>> -if (buf_vec[vec_idx].buf_len < dev->vhost_hlen)
>> -return -1;
>> -
>>  desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
>> +if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
>> +return 0;
>> +
>>  rte_prefetch0((void *)(uintptr_t)desc_addr);
>>  
>>  virtio_hdr.num_buffers = end_idx - start_idx;
>> @@ -425,6 +428,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, 
>> struct vhost_virtqueue *vq,
>>  
>>  vec_idx++;
>>  desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
>> +if (unlikely(!desc_addr))
>> +return 0;
>>  
>>  /* Prefetch buffer address. */
>>  rte_prefetch0((void *)(uintptr_t)desc_addr);
>> @@ -507,7 +512,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
>> queue_id,
>>  *(volatile uint16_t *)>used->idx += nr_used;
>>  vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>  sizeof(vq->used->idx));
>> -vq->last_used_idx = end;
>> +vq->last_used_idx += nr_used;
> 
> Ditto, this may deserve another patch, too.
> 
>   --yliu
> 
>

[dpdk-dev] [PATCH] app/testpmd: fix redundant time update

2016-07-15 Thread Reshma Pattan

Inside flush_fwd_rx_queues removed redundant prev_tsc update with cur_tsc,
as prev_tsc value is always updated with rte_rdtsc() in for loop.

Coverity issue: 127797

Fixes: f487715f36f5 ("app/testpmd: add timeout in Rx queue flushing")

Signed-off-by: Reshma Pattan 
---
 app/test-pmd/testpmd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index b7f28e9..1428974 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -907,7 +907,6 @@ flush_fwd_rx_queues(void)
timer_tsc += diff_tsc;
} while ((nb_rx > 0) &&
(timer_tsc < timer_period));
-   prev_tsc = cur_tsc;
timer_tsc = 0;
}
}
-- 
2.5.0

[dpdk-dev] [PATCH] spinlock:move constructor function out of header

2016-07-15 Thread Thomas Monjalon

2016-07-14 22:45, Wiles, Keith:
> > On Jul 14, 2016, at 2:59 PM, Thomas Monjalon  
> > wrote:
> > 
> > Thanks Keith for continuing work.
> > 
> > 2016-07-14 14:31, Keith Wiles:
> >> lib/librte_eal/bsdapp/eal/Makefile |  1 +
> >> lib/librte_eal/common/arch/arm/rte_spinlock.c  | 46 
> >> ++
> >> lib/librte_eal/common/arch/ppc_64/rte_spinlock.c   | 46 
> >> ++
> >> lib/librte_eal/common/arch/tile/rte_spinlock.c | 46 
> >> ++
> >> lib/librte_eal/common/arch/x86/rte_spinlock.c  | 46 
> >> ++
> >> .../common/include/arch/x86/rte_spinlock.h | 14 ++-
> >> lib/librte_eal/linuxapp/eal/Makefile   |  1 +
> > 
> > I am not sure we should add a .c file for each arch, given it is called only
> > from arch/x86/rte_spinlock.h.
> 
> I did not like having the .c for everyone, but the previous comment seemed
> to suggest it. I am willing to change it any better method, just let me
> know what you think. I would like just one.

I will make sure it is not needed. In this case, we can keep the original
patch from Damjan and just do some trivial changes. I can make them quickly
before RC3.

> On a side note I have combined the bsdapp and linuxapp into a single
> directory before. It is doable and it eliminates a number of duplicate
> files or code.

Yes patches to remove duplicated code are welcome. But please do not
introduce more #ifdefs. I think it is better to keep separate directories
bsdapp/ and linuxapp/ while increasing the shared code in common/ as
much as possible.
Some functions are really different and are better handled separately.

> Plus a also added support for OS X for DPDK, but I do not have access
> to any NICs with that version yet other then virtual ones.
> I could submit it and may be someone will write the kext to make it work. :-)

Maybe that OS X would deserve a third separate directory.
I guess you wanted it only for dev testing?
Why not just use a Linux or FreeBSD VM?

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-15 Thread Chilikin, Andrey

Hi Sugesh,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chandran, Sugesh
> Sent: Friday, July 15, 2016 10:23 AM
> To: 'Adrien Mazarguil' 



> > > > To PMD maintainers: please comment if you know devices that
> > > > support tagging matching packets with more than 32 bits of
> > > > user-provided data!
> > > [Sugesh] I guess the flow director ID is 64 bit , The XL710 datasheet 
> > > says so.
> > > And in the 'rte_mbuf' structure the 64 bit FDIR-ID is shared with
> > > rss hash. This can be a software driver limitation that expose only
> > > 32 bit. Possibly because of cache alignment issues? Since the
> > > hardware can support 64 bit, I feel it make sense to support 64 bit as 
> > > well.

XL710 supports 32bit FDIR ID only, I believe you mix it up with flexible 
payload data which can take 0, 4 or 8 bytes of the RX descriptor.

Regards,
Andrey

[dpdk-dev] [PATCH] examples/distributor: fix Rx thread logic for zero packets

2016-07-15 Thread De Lara Guarch, Pablo



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Thursday, July 14, 2016 4:49 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma
> Subject: [dpdk-dev] [PATCH] examples/distributor: fix Rx thread logic for zero
> packets
> 
> From: Reshma Pattan 
> 
> Zero packets can be returned by rte_eth_rx_burst() and
> rte_distributor_returned_pkts() inside lcore_rx(), so
> for zero packet scenario instead of proceeding to
> next operations we should continue to the next iteration of the
> loop to avoid unnecessary processing overhead which is causing
> rx packets to be dropped and hence distributor failing to forward the
> packets.
> 
> Fixes: 07db4a97 ("examples/distributor: new sample app")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: Pablo de Lara

[dpdk-dev] [PATCH] app/testpmd: fix redundant time update

2016-07-15 Thread De Lara Guarch, Pablo



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, July 15, 2016 10:19 AM
> To: dev at dpdk.org
> Cc: Pattan, Reshma
> Subject: [dpdk-dev] [PATCH] app/testpmd: fix redundant time update
> 
> Inside flush_fwd_rx_queues removed redundant prev_tsc update with
> cur_tsc,
> as prev_tsc value is always updated with rte_rdtsc() in for loop.
> 
> Coverity issue: 127797
> 
> Fixes: f487715f36f5 ("app/testpmd: add timeout in Rx queue flushing")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: Pablo de Lara

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Damjan Marion (damarion)

> On 15 Jul 2016, at 10:31, Thomas Monjalon  
> wrote:
> 
> 2016-07-14 22:20, Damjan Marion:
>> 
>>> On 15 Jul 2016, at 00:06, Thomas Monjalon  
>>> wrote:
>>> 
>>> 2016-07-14 18:10, Damjan Marion:
 Dear Jan,

 Thank you for your comments. A bit too much overhead to submit simple patch
 so let?s forget about it. I will just add it as it is to our private
 collection of patches.
>>> 
>>> These are changes trivial to fix when applying.
>>> I strongly prefer that you upstream patches instead of keeping patches
>>> in the VPP repository. I will help you in this task.
>>> Thanks for the effort.
>> 
>> Yeah, I agree with you, unfortunately it is all about time,
>> for me it is still cheaper to rebase them.
>> 
>> I respect your rules, but I just don?t have enough free cycles
>> to spend learning all of them, for my occasional patch submission.
> 
> We know there is a learning curve for the submission process.
> That's why we do not expect it to be fully satisfied, especially
> from occasional contributors.
> I am used to do some not significant changes before applying to
> help newcomers patches being accepted. That's what I will do in
> your case.
> As I said previously I will help you to drop your local patches.
> 
> Please continue sending other patches with a detailed explanation
> and we will take care of them.

Ok, Thanks!

So we don?t have much pending beside 2 patches for i40e which 
Jeff submitted yesterday and they will i guess need to wait for 16.11.

Only one which I have on my mind is:

https://git.fd.io/cgit/vpp/tree/dpdk/dpdk-16.04_patches/0005-Allow-applications-to-override-rte_delay_us.patch

This is big issue for us when running single-core, as some
drivers tend to call rte_delay_us for a long time, and that is 
causing packet drops. I.e. if you do stop/start on one interface
and you are running BFD on another one, BFD will fail?

Current patch is hack, it basically allows us to override 
delay function so we can de-schedule it,
do some other useful work while waiting for delay to finish
and then give control back to original function?

Maybe we can fix this by providing a delay callback functionality...

Thanks,

Damjan

[dpdk-dev] [PATCH] doc: fix consumer/producer mixup in Ring lib doc

2016-07-15 Thread Mcnamara, John

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shreyansh Jain
> Sent: Tuesday, July 12, 2016 10:45 AM
> To: olivier.matz at 6wind.com
> Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> Subject: [dpdk-dev] [PATCH] doc: fix consumer/producer mixup in Ring lib
> doc
> 
> Signed-off-by: Shreyansh Jain 

Thanks. It is strange that it has gone unnoticed for so long.

Acked-by: John McNamara

[dpdk-dev] net/virtio-user: Fix missing brackets in if condition

2016-07-15 Thread Yuanhan Liu

On Thu, Jul 14, 2016 at 06:31:11PM +0200, viktorin at rehivetech.com wrote:
> On Tue, 12 Jul 2016 11:30:25 +0200
> Maxime Coquelin  wrote:
> 
> > The error is reported using test build script:
> 
> I recommend to note that the error is reported only by GCC 6+.

Agreed!

--yliu
> 
> > 
> > $ scripts/test-build.sh x86_64-native-linuxapp-gcc
> > ...
> > drivers/net/virtio/virtio_user_ethdev.c: In function 
> > ?virtio_user_pmd_devinit?:
> > drivers/net/virtio/virtio_user_ethdev.c:345:2: error: this ?if? clause does 
> > not guard... [-Werror=misleading-indentation]
> >   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PATH) == 1)
> > ^~
> > 
> > Fixes: 404bd6bfe360 ("net/virtio-user: fix return value not checked")
> > 
> > Cc: Jianfeng Tan 
> > Signed-off-by: Maxime Coquelin 
> > Acked-by: Yuanhan Liu 
> > 
> Reviewed-by: Jan Viktorin

[dpdk-dev] [RFC] Generic flow director/filtering/classification API

2016-07-15 Thread Chandran, Sugesh

Thank you Adrien,
Please find below for some more comments/inputs

Let me know your thoughts on this.


Regards
_Sugesh


> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Wednesday, July 13, 2016 9:03 PM
> To: Chandran, Sugesh 
> Cc: dev at dpdk.org; Thomas Monjalon ;
> Zhang, Helin ; Wu, Jingjing
> ; Rasesh Mody ; Ajit
> Khaparde ; Rahul Lakkireddy
> ; Lu, Wenzhuo ;
> Jan Medala ; John Daley ; Chen,
> Jing D ; Ananyev, Konstantin
> ; Matej Vido ;
> Alejandro Lucero ; Sony Chacko
> ; Jerin Jacob
> ; De Lara Guarch, Pablo
> ; Olga Shern 
> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
> API
> 
> On Mon, Jul 11, 2016 at 10:42:36AM +, Chandran, Sugesh wrote:
> > Hi Adrien,
> >
> > Thank you for your response,
> > Please see my comments inline.
> 
> Hi Sugesh,
> 
> Sorry for the delay, please see my answers inline as well.
> 
> [...]
> > > > > Flow director
> > > > > -
> > > > >
> > > > > Flow director (FDIR) is the name of the most capable filter
> > > > > type, which covers most features offered by others. As such, it
> > > > > is the most
> > > widespread
> > > > > in PMDs that support filtering (i.e. all of them besides **e1000**).
> > > > >
> > > > > It is also the only type that allows an arbitrary 32 bits value
> > > > > provided by applications to be attached to a filter and returned
> > > > > with matching packets instead of relying on the destination queue to
> recognize flows.
> > > > >
> > > > > Unfortunately, even FDIR requires applications to be aware of
> > > > > low-level capabilities and limitations (most of which come
> > > > > directly from **ixgbe**
> > > and
> > > > > **i40e**):
> > > > >
> > > > > - Bitmasks are set globally per device (port?), not per filter.
> > > > [Sugesh] This means application cannot define filters that matches
> > > > on
> > > arbitrary different offsets?
> > > > If that?s the case, I assume the application has to program
> > > > bitmask in
> > > advance. Otherwise how
> > > > the API framework deduce this bitmask information from the rules??
> > > > Its
> > > not very clear to me
> > > > that how application pass down the bitmask information for
> > > > multiple filters
> > > on same port?
> > >
> > > This is my understanding of how flow director currently works,
> > > perhaps someome more familiar with it can answer this question better
> than I could.
> > >
> > > Let me take an example, if particular device can only handle a
> > > single IPv4 mask common to all flow rules (say only to match
> > > destination addresses), updating that mask to also match the source
> > > address affects all defined and future flow rules simultaneously.
> > >
> > > That is how FDIR currently works and I think it is wrong, as it
> > > penalizes devices that do support individual bit-masks per rule, and
> > > is a little awkward from an application point of view.
> > >
> > > What I suggest for the new API instead is the ability to specify one
> > > bit-mask per rule, and let the PMD deal with HW limitations by
> > > automatically configuring global bitmasks from the first added rule,
> > > then refusing to add subsequent rules if they specify a conflicting
> > > bit-mask. Existing rules remain unaffected that way, and
> > > applications do not have to be extra cautious.
> > >
> > [Sugesh] The issue with that approach is, the hardware simply discards
> > the rule when it is a super set of first one eventhough the hardware
> > is capable of handling it. How its guaranteed the first rule will set
> > the bitmask for all the subsequent rules.
> 
> Just to clarify, the API only says that new rules cannot affect existing ones
> (which I think makes sense from a user's perspective), so as long as the PMD
> does whatever is needed to make all rules work together, there should not
> be any problem with this approach.
> 
> Even if the PMD has to temporarily remove an existing rule and reconfigure
> global masks in order to add subsequent rules, it is fine as long as packets
> aren't misdirected in the meantime (they may be dropped if there is no
> other choice).
[Sugesh] I feel this is fine. Thank you for confirming.
> 
> > How about having a CLASSIFER_TYPE for the classifier. Every port can
> > have set of supported flow types(for eg: L3_TYPE, L4_TYPE,
> > L4_TYPE_8BYTE_FLEX,
> > L4_TYPE_16BYTE_FLEX) based on the underlying FDIR support. Application
> > can query this and set the type accordingly while initializing the
> > port. This way the first rule need not set all the bits that may needed in 
> > the
> future rules.
> 
> Again from a user's POV, I think doing so would add unwanted HW-specific
> complexity.
> 
> However this concern can be handled through a different approach. Let's say
> user creates a pattern that only specifies a IP header with a given bit-mask.
> 
> In FDIR language this translates to:
> 
> - Set global mask for IPv4 accordingly, remaining global masks all zeroed
>

[dpdk-dev] capture packets on VM

2016-07-15 Thread Matt Laswell

Hey Raja,

When you bind the ports to the DPDK poll mode drivers, the kernel no longer
has visibility into them.  This makes some sense intuitively - it would be
very bad for both the kernel and a user mode application to both attempt to
control the ports.  This is why tools like tcpdump and wireshark don't work
(and why the ports don't show up in ifconfig generally).

If you just want to know that packets are flowing, an easy way to do it is
simply to emit messages (via printf or the logging subsystem of your
choice) or increment counters when you receive packets.  If you want to
verify a little bit of information about the packets but don't need full
capture, you can either add some parsing information to your messages, or
build out more stats.

However, if you want to actually capture the packet contents, it's a little
trickier.  You can write your own packet-capture application, of course,
but that might be a bigger task than you're looking for.  You can also
instantiate a KNI interface and either copy or forward the packets to it
(and, from there, you can do tcpdump on the kernel side of the interface).
  I seem to recall that there's been some work done on tcpdump like
applications within DPDK, but don't remember what state those efforts are
in presently.

--
Matt Laswell
laswell at infinite.io
infinite io, inc.

On Fri, Jul 15, 2016 at 12:54 AM, Raja Jayapal  wrote:

> Hi All,
>
> I have installed dpdk on VM and would like to know how to capture the
> packets on dpdk ports.
> I am sending traffic from host  and want to know how to confirm whether
> the packets are flowing via dpdk ports.
> I tried with tcpdump and wireshark but could not capture the packets
> inside VM.
> setup : bridge1(Host)--- VM(Guest with DPDK) - bridge2(Host)
>
> Please suggest.
>
> Thanks,
> Raja
>
> =-=-=
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information. If you are
> not the intended recipient, any dissemination, use,
> review, distribution, printing or copying of the
> information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If
> you have received this communication in error,
> please notify us by reply e-mail or telephone and
> immediately and permanently delete the message
> and any attachments. Thank you
>
>
>

[dpdk-dev] [PATCH] lib: change rte_ring dequeue to guarantee ordering before tail update

2016-07-15 Thread Juhamatti Kuusisaari

Consumer queue dequeuing must be guaranteed to be done fully before
the tail is updated. This is not guaranteed with a read barrier,
changed to a write barrier just before tail update which in practice
guarantees correct order of reads and writes.

Signed-off-by: Juhamatti Kuusisaari 
---
 lib/librte_ring/rte_ring.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index eb45e41..14920af 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -748,7 +748,7 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void 
**obj_table,

/* copy in table */
DEQUEUE_PTRS();
-   rte_smp_rmb();
+   rte_smp_wmb();

__RING_STAT_ADD(r, deq_success, n);
r->cons.tail = cons_next;
--
2.9.0



The information contained in this message may be privileged
and confidential and protected from disclosure. If the reader
of this message is not the intended recipient, or an employee
or agent responsible for delivering this message to the
intended recipient, you are hereby notified that any reproduction,
dissemination or distribution of this communication is strictly
prohibited. If you have received this communication in error,
please notify us immediately by replying to the message and
deleting it from your computer. Thank you. Coriant-Tellabs

[dpdk-dev] [PATCH 1/2] bnx2x: disable fast path interrupts

2016-07-15 Thread Chas Williams

On Thu, 2016-07-14 at 10:03 -0700, Rasesh Mody wrote:
> Disable fastpath interrupts and remove unneeded delay in
> bnx2x_interrupt_action(). This patch fixes and prevents performance
> degradation for BNX2X PMD.
> 
> Fixes: 540a2110 ("bnx2x: driver core")
> 
> Signed-off-by: Rasesh Mody 
>?
> diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c 
> b/drivers/net/bnx2x/bnx2x_ethdev.c
> index c8d2bf2..f97dd5b 100644
> --- a/drivers/net/bnx2x/bnx2x_ethdev.c
> +++ b/drivers/net/bnx2x/bnx2x_ethdev.c
> @@ -107,8 +107,8 @@ bnx2x_interrupt_action(struct rte_eth_dev *dev)
> ?
> ? PMD_DEBUG_PERIODIC_LOG(INFO, "Interrupt handled");
> ?
> - if (bnx2x_intr_legacy(sc, 0))
> - DELAY_MS(250);
> + bnx2x_intr_legacy(sc, 0)
> +

Trailing semicolon?

[dpdk-dev] [PATCH] lib: move rte_ring read barrier to correct location

2016-07-15 Thread Kuusisaari, Juhamatti


Hi Konstantin,

> Hi Juhamatti,
> 
> >
> > Hi Konstantin,
> >
> > > > > > > It is quite safe to move the barrier before DEQUEUE because
> > > > > > > after the DEQUEUE there is nothing really that we would want
> > > > > > > to protect
> > > with a
> > > > > read barrier.
> > > > > >
> > > > > > I don't think so.
> > > > > > If you remove barrier after DEQUEUE(), that means on systems
> > > > > > with relaxed memory ordering cons.tail could be updated before
> > > > > > DEQUEUE() will be finished and producer can overwrite queue
> > > > > > entries that were
> > > not
> > > > > yet dequeued.
> > > > > > So if cpu can really do such speculative out of order loads,
> > > > > > then we do need for  __rte_ring_sc_do_dequeue() something like:
> > > > > >
> > > > > > rte_smp_rmb();
> > > > > > DEQUEUE_PTRS();
> > > > > > rte_smp_rmb();
> > > >
> > > > You have a valid point here, there needs to be a guarantee that
> > > > cons_tail
> > > cannot
> > > > be updated before DEQUEUE is completed. Nevertheless, my point was
> > > that it is
> > > > not guaranteed with a read barrier anyway. The implementation has
> > > > the
> > > following
> > > > sequence
> > > >
> > > > DEQUEUE_PTRS(); (i.e. READ/LOAD)
> > > > rte_smp_rmb();
> > > > ..
> > > > r->cons.tail = cons_next; (i.e WRITE/STORE)
> > > >
> > > > Above read barrier does not guarantee any ordering for the
> > > > following
> > > writes/stores.
> > > > As a guarantee is needed, I think we in fact need to change the
> > > > read barrier
> > > on the
> > > > dequeue to a full barrier, which guarantees the read+write order,
> > > > as
> > > follows
> > > >
> > > > DEQUEUE_PTRS();
> > > > rte_smp_mb();
> > > > ..
> > > > r->cons.tail = cons_next;
> > > >
> > > > If you agree, I can for sure prepare another patch for this issue.
> > >
> > > Hmm, I think for __rte_ring_mc_do_dequeue() we are ok with
> > > smp_rmb(), as we have to read cons.tail anyway.
> >
> > Are you certain that this read creates strong enough dependency between
> read of cons.tail and the write of it on the mc_do_dequeue()?
> 
> Yes, I believe so.
> 
> > I think it does not really create any control dependency there as the next
> write is not dependent of the result of the read.
> 
> I think it is dependent: cons.tail can be updated only if it's current value 
> is
> eual to precomputed before cons_head.
> So cpu has to read cons.tail value first.

I was thinking that it might match to this processing pattern:

S1. if (a == b)
S2. a = a + b
S3. b = a + b

Above S3 has no control dependency to S1 and can be in theory executed
in parallel. 

In the (simplified) implementation we have:

X1. while (a != b)
X2. { }
X3. a = c

If we would consider S3 and X3 equal, there might be a problem with coherence 
without a write barrier after the while(). It may of course be purely 
theoretical, 
depending on the HW. This is anyway the reason for my suggestion to have a 
write-barrier also on the mc_do_dequeue() just before tail update.

--
 Juhamatti 

> > The CPU also
> > knows already the value that will be written to cons.tail and that
> > value does not depend on the previous read either. The CPU does not
> know we are planning to do a spinlock there, so it might do things out-of-
> order without proper dependencies.
> >
> > > For  __rte_ring_sc_do_dequeue(), I think you right, we might need
> > > something stronger.
> > > I don't want to put rte_smp_mb() here as it would cause full HW
> > > barrier even on machines with strong memory order (IA).
> > > I think that rte_smp_wmb() might be enough here:
> > > it would force cpu to wait till writes in DEQUEUE_PTRS() are become
> > > visible, which means reads have to be completed too.
> >
> > In practice I think that rte_smp_wmb() would work fine, even though it
> > is not strictly according to the book. Below solution would be my
> > proposal as a fix to the issue of sc dequeueing (and also to mc dequeueing,
> if we have the problem of CPU completely ignoring the spinlock in reality
> there):
> >
> > DEQUEUE_PTRS();
> > ..
> > rte_smp_wmb();
> > r->cons.tail = cons_next;
> 
> As I said in previous email - it looks good for me for
> _rte_ring_sc_do_dequeue(), but I am interested to hear what  ARM and PPC
> maintainers think about it.
> Jan, Jerin do you have any comments on it?
> Chao, sorry but I still not sure why PPC is considered as architecture with
> strong memory ordering?
> Might be I am missing something obvious here.
> Thank
> Konstantin
> 
> >
> > --
> >  Juhamatti
> >
> > > Another option would be to define a new macro: rte_weak_mb() or so,
> > > that would be expanded into CB on boxes with strong memory model,
> > > and to full MB on machines with relaxed ones.
> > > Interested to hear what ARM and PPC guys think.
> > > Konstantin
> > >
> > > P.S. Another thing a bit off-topic - for PPC guys:
> > > As I can see smp_rmb/smp_wmb are just a complier barriers:
> > > find

[dpdk-dev] spinlock: Move constructor function out of header file

2016-07-15 Thread Thomas Monjalon

2016-07-14 18:10, Damjan Marion:
> Dear Jan,
> 
> Thank you for your comments. A bit too much overhead to submit simple patch
> so let?s forget about it. I will just add it as it is to our private
> collection of patches.

These are changes trivial to fix when applying.
I strongly prefer that you upstream patches instead of keeping patches
in the VPP repository. I will help you in this task.
Thanks for the effort.

> If anybody wants to pick it from here, please do...

I can update the bsdapp Makefile and do the trivial changes for you,
if we agree that we do not need to touch to other archs (see below).

> > On 14 Jul 2016, at 20:03, Jan Viktorin  wrote:
> >> --- a/lib/librte_eal/linuxapp/eal/Makefile
> >> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> >> @@ -106,6 +106,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
> >> 
> >> # from arch dir
> >> SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_cpuflags.c
> >> +SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_spinlock.c
> > 
> > This is not good, you provide rte_spinlock.c only for x86. Building
> > for any other arch would fail to find this file.

It is not used in other archs. It is really x86 specific.

> > Moreover, the bsdapp/eal/Makefile should reflect this situation as
> > well.

Good catch.

86 matches

Mail list logo