[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Take Ceara
On Thu, Jun 16, 2016 at 10:19 PM, Wiles, Keith  wrote:
>
> On 6/16/16, 3:16 PM, "dev on behalf of Wiles, Keith"  on behalf of keith.wiles at intel.com> wrote:
>
>>
>>On 6/16/16, 3:00 PM, "Take Ceara"  wrote:
>>
>>>On Thu, Jun 16, 2016 at 9:33 PM, Wiles, Keith  
>>>wrote:
 On 6/16/16, 1:20 PM, "Take Ceara"  wrote:

>On Thu, Jun 16, 2016 at 6:59 PM, Wiles, Keith  
>wrote:
>>
>> On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith" > dpdk.org on behalf of keith.wiles at intel.com> wrote:
>>
>>>
>>>On 6/16/16, 11:20 AM, "Take Ceara"  wrote:
>>>
On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith >>>intel.com> wrote:

>
> Right now I do not know what the issue is with the system. It could be 
> too many Rx/Tx ring pairs per port limiting the memory in the 
> NICs, which is why you get better performance when you have 8 cores 
> per port. I am not really seeing the whole picture of how DPDK is 
> configured, so I cannot help more. Sorry.

I doubt that there is a limitation wrt running 16 cores per port vs 8
cores per port as I've tried with two different machines connected
back to back each with one X710 port and 16 cores on each of them
running on that port. In that case our performance doubled as
expected.

>
> Maybe seeing the DPDK command line would help.

The command line I use with ports 01:00.3 and 81:00.3 is:
./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00

Our own qmap args allow the user to control exactly how cores are
split between ports. In this case we end up with:

warp17> show port map
Port 0[socket: 0]:
   Core 4[socket:0] (Tx: 0, Rx: 0)
   Core 5[socket:0] (Tx: 1, Rx: 1)
   Core 6[socket:0] (Tx: 2, Rx: 2)
   Core 7[socket:0] (Tx: 3, Rx: 3)
   Core 8[socket:0] (Tx: 4, Rx: 4)
   Core 9[socket:0] (Tx: 5, Rx: 5)
   Core 20[socket:0] (Tx: 6, Rx: 6)
   Core 21[socket:0] (Tx: 7, Rx: 7)
   Core 22[socket:0] (Tx: 8, Rx: 8)
   Core 23[socket:0] (Tx: 9, Rx: 9)
   Core 24[socket:0] (Tx: 10, Rx: 10)
   Core 25[socket:0] (Tx: 11, Rx: 11)
   Core 26[socket:0] (Tx: 12, Rx: 12)
   Core 27[socket:0] (Tx: 13, Rx: 13)
   Core 28[socket:0] (Tx: 14, Rx: 14)
   Core 29[socket:0] (Tx: 15, Rx: 15)

Port 1[socket: 1]:
   Core 10[socket:1] (Tx: 0, Rx: 0)
   Core 11[socket:1] (Tx: 1, Rx: 1)
   Core 12[socket:1] (Tx: 2, Rx: 2)
   Core 13[socket:1] (Tx: 3, Rx: 3)
   Core 14[socket:1] (Tx: 4, Rx: 4)
   Core 15[socket:1] (Tx: 5, Rx: 5)
   Core 16[socket:1] (Tx: 6, Rx: 6)
   Core 17[socket:1] (Tx: 7, Rx: 7)
   Core 18[socket:1] (Tx: 8, Rx: 8)
   Core 19[socket:1] (Tx: 9, Rx: 9)
   Core 30[socket:1] (Tx: 10, Rx: 10)
   Core 31[socket:1] (Tx: 11, Rx: 11)
   Core 32[socket:1] (Tx: 12, Rx: 12)
   Core 33[socket:1] (Tx: 13, Rx: 13)
   Core 34[socket:1] (Tx: 14, Rx: 14)
   Core 35[socket:1] (Tx: 15, Rx: 15)
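The --qmap masks above (0x003FF003F0 for port 0, 0x0FC00FFC00 for port 1) are plain lcore bitmasks. A standalone sketch of how such a mask decodes; the helper names are hypothetical, not part of warp17 or DPDK:

```c
#include <assert.h>
#include <stdint.h>

/* Count how many lcores a hex core mask selects. */
static int qmap_lcore_count(uint64_t mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1);
        mask >>= 1;
    }
    return n;
}

/* Return the lowest lcore id in the mask, or -1 if the mask is empty. */
static int qmap_first_lcore(uint64_t mask)
{
    int bit = 0;
    if (mask == 0)
        return -1;
    while (!(mask & 1)) {
        mask >>= 1;
        bit++;
    }
    return bit;
}
```

For example, 0x003FF003F0 decodes to the 16 lcores 4-9 and 20-29 listed in the port 0 map above.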
>>>
>>>On each socket you have 10 physical cores or 20 lcores per socket for 40 
>>>lcores total.
>>>
>>>The above lists LCORES (i.e. hyper-threads), not COREs, though I know 
>>>some like to treat the two as interchangeable. Hyper-threads are 
>>>logically interchangeable, but not performance-wise. If you have two 
>>>run-to-completion threads on a single physical core, each on a 
>>>different hyper-thread of that core [0,1], then the second lcore or 
>>>thread (1) on that physical core will get at most about 20-30% of the 
>>>CPU cycles, and normally much less, unless you tune the code so the 
>>>threads do not contend for the internal execution units; even then, 
>>>some internal execution units are always shared.
>>>
>>>To get the best performance when hyper-threading is enabled, do not 
>>>run both threads on a single physical core; run only hyper-thread 0.
>>>
>>>The table below lists the physical core id and each of its lcore ids 
>>>per socket. Use the first lcore of each physical core for the best 
>>>performance:
>>>Core 1 [1, 21][11, 31]
>>>Use lcore 1 or 11 depending on the socket you are on.
>>>
>>>The configuration below most likely gives the best performance and 
>>>utilization of your system, if I got the values right:
>>>
>>>./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
>>>--qmap 0.0x0003FE --qmap 1.0x0FFE00
>>>
>>>Port 0[socket: 0]:
>>>   Core 2[socket:0] (Tx: 0, Rx: 0)
>>>   Core 3[socket:0] (Tx: 1, Rx: 1)
>>>   

[dpdk-dev] [PATCH v10 4/7] ethdev: make get port by name and get name by port public

2016-06-16 Thread Thomas Monjalon
2016-06-15 15:06, Reshma Pattan:
> Converted rte_eth_dev_get_port_by_name to a public API.
> Converted rte_eth_dev_get_name_by_port to a public API.
> Updated the release notes with the changes.

It is not an API change, just a new API, so no need to reference
it in the release notes.


[dpdk-dev] [PATCH 4/4] doc: add MTU update to feature matrix for enic

2016-06-16 Thread John Daley
Signed-off-by: John Daley 
---
 doc/guides/nics/overview.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 29a6163..6b30085 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -92,7 +92,7 @@ Most of these differences are summarized below.
Queue status event  
   Y
Rx interrupt Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Queue start/stop   Y   Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y  
 Y   Y Y
-   MTU update Y Y Y   Y   Y Y Y Y Y Y
+   MTU update Y Y Y Y Y   Y Y Y Y Y Y
Jumbo frameY Y Y Y Y Y Y Y Y   Y Y Y Y Y Y Y Y Y Y  
 Y Y Y
Scattered Rx   Y Y Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y  
 Y   Y
LROY Y Y Y
-- 
2.7.0



[dpdk-dev] [PATCH 3/4] enic: add an update MTU function for non-Rx scatter mode

2016-06-16 Thread John Daley
Provide an MTU update callback. The function returns -ENOTSUP
if Rx scatter is enabled. Updating the MTU to a value greater than
the one configured via the Cisco CIMC/UCSM management interface
is allowed, provided it is still less than the maximum egress packet
size allowed by the NIC.

Signed-off-by: John Daley 
---
 drivers/net/enic/enic.h|  1 +
 drivers/net/enic/enic_ethdev.c | 10 +-
 drivers/net/enic/enic_main.c   | 44 ++
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 78f7bd7..8122358 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -245,4 +245,5 @@ uint16_t enic_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
 uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
   uint16_t nb_pkts);
+int enic_set_mtu(struct enic *enic, uint16_t new_mtu);
 #endif /* _ENIC_H_ */
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 31d9600..9a738c2 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -520,6 +520,14 @@ static void enicpmd_remove_mac_addr(struct rte_eth_dev *eth_dev, __rte_unused ui
enic_del_mac_address(enic);
 }

+static int enicpmd_mtu_set(struct rte_eth_dev *eth_dev, uint16_t mtu)
+{
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   return enic_set_mtu(enic, mtu);
+}
+
 static const struct eth_dev_ops enicpmd_eth_dev_ops = {
.dev_configure= enicpmd_dev_configure,
.dev_start= enicpmd_dev_start,
@@ -537,7 +545,7 @@ static const struct eth_dev_ops enicpmd_eth_dev_ops = {
.queue_stats_mapping_set = NULL,
.dev_infos_get= enicpmd_dev_info_get,
.dev_supported_ptypes_get = enicpmd_dev_supported_ptypes_get,
-   .mtu_set  = NULL,
+   .mtu_set  = enicpmd_mtu_set,
.vlan_filter_set  = enicpmd_vlan_filter_set,
.vlan_tpid_set= NULL,
.vlan_offload_set = enicpmd_vlan_offload_set,
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 32ecdae..c23938a 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -854,6 +854,50 @@ int enic_set_vnic_res(struct enic *enic)
return rc;
 }

+/* The Cisco NIC can send and receive packets up to a max packet size
+ * determined by the NIC type and firmware. There is also an MTU
+ * configured into the NIC via the CIMC/UCSM management interface
+ * which can be overridden by this function (up to the max packet size).
+ * Depending on the network setup, doing so may cause packet drops
+ * and unexpected behavior.
+ */
+int enic_set_mtu(struct enic *enic, uint16_t new_mtu)
+{
+   uint16_t old_mtu;   /* previous setting */
+   uint16_t config_mtu;/* Value configured into NIC via CIMC/UCSM */
+   struct rte_eth_dev *eth_dev = enic->rte_dev;
+
+   old_mtu = eth_dev->data->mtu;
+   config_mtu = enic->config.mtu;
+
+   /* only works with Rx scatter disabled */
+   if (enic->rte_dev->data->dev_conf.rxmode.enable_scatter)
+   return -ENOTSUP;
+
+   if (new_mtu > enic->max_mtu) {
+   dev_err(enic,
+   "MTU not updated: requested (%u) greater than max (%u)\n",
+   new_mtu, enic->max_mtu);
+   return -EINVAL;
+   }
+   if (new_mtu < ENIC_MIN_MTU) {
+   dev_info(enic,
+   "MTU not updated: requested (%u) less than min (%u)\n",
+   new_mtu, ENIC_MIN_MTU);
+   return -EINVAL;
+   }
+   if (new_mtu > config_mtu)
+   dev_warning(enic,
+   "MTU (%u) is greater than value configured in NIC (%u)\n",
+   new_mtu, config_mtu);
+
+   /* update the mtu */
+   eth_dev->data->mtu = new_mtu;
+
+   dev_info(enic, "MTU changed from %u to %u\n",  old_mtu, new_mtu);
+   return 0;
+}
+
 static int enic_dev_init(struct enic *enic)
 {
int err;
-- 
2.7.0
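The validation order in enic_set_mtu() above can be exercised without a VIC. Below is a standalone replica of just the checks; check_set_mtu and its scatter_enabled/max_mtu/config_mtu parameters are stand-ins for the driver state, not enic API:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define ENIC_MIN_MTU 68

/* Mirrors the check order in the patch's enic_set_mtu():
 * Rx scatter first, then the max bound, then the min bound.
 * Exceeding the CIMC/UCSM-configured MTU is allowed (only warned). */
static int check_set_mtu(uint16_t new_mtu, int scatter_enabled,
                         uint16_t max_mtu, uint16_t config_mtu)
{
    if (scatter_enabled)
        return -ENOTSUP;    /* only works with Rx scatter disabled */
    if (new_mtu > max_mtu)
        return -EINVAL;     /* above NIC max packet size */
    if (new_mtu < ENIC_MIN_MTU)
        return -EINVAL;     /* below minimum */
    (void)config_mtu;       /* new_mtu > config_mtu: warn only, allowed */
    return 0;
}
```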



[dpdk-dev] [PATCH 2/4] enic: set the max allowed MTU for the NIC

2016-06-16 Thread John Daley
The max MTU is set to the max egress packet size allowed by the VIC
minus the size of an IPv4 L2 header with .1Q (18 bytes).

Signed-off-by: John Daley 
---
 drivers/net/enic/enic.h|  1 +
 drivers/net/enic/enic_ethdev.c |  3 ++-
 drivers/net/enic/enic_res.c| 25 +
 drivers/net/enic/enic_res.h|  4 +++-
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 1e6914e..78f7bd7 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -118,6 +118,7 @@ struct enic {
u8 ig_vlan_strip_en;
int link_status;
u8 hw_ip_checksum;
+   u16 max_mtu;

unsigned int flags;
unsigned int priv_flags;
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 697ff82..31d9600 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -435,7 +435,8 @@ static void enicpmd_dev_info_get(struct rte_eth_dev *eth_dev,
device_info->max_rx_queues = enic->rq_count;
device_info->max_tx_queues = enic->wq_count;
device_info->min_rx_bufsize = ENIC_MIN_MTU;
-   device_info->max_rx_pktlen = enic->config.mtu;
+   device_info->max_rx_pktlen = enic->rte_dev->data->mtu
+  + ETHER_HDR_LEN + 4;
device_info->max_mac_addrs = 1;
device_info->rx_offload_capa =
DEV_RX_OFFLOAD_VLAN_STRIP |
diff --git a/drivers/net/enic/enic_res.c b/drivers/net/enic/enic_res.c
index ebe379d..e82181f 100644
--- a/drivers/net/enic/enic_res.c
+++ b/drivers/net/enic/enic_res.c
@@ -83,6 +83,20 @@ int enic_get_vnic_config(struct enic *enic)
GET_CONFIG(intr_timer_usec);
GET_CONFIG(loop_tag);
GET_CONFIG(num_arfs);
+   GET_CONFIG(max_pkt_size);
+
+   /* max packet size is only defined in newer VIC firmware
+    * and will be 0 for legacy firmware and VICs
+    */
+   if (c->max_pkt_size > ENIC_DEFAULT_MAX_PKT_SIZE)
+   enic->max_mtu = c->max_pkt_size - (ETHER_HDR_LEN + 4);
+   else
+   enic->max_mtu = ENIC_DEFAULT_MAX_PKT_SIZE - (ETHER_HDR_LEN + 4);
+   if (c->mtu == 0)
+   c->mtu = 1500;
+
+   enic->rte_dev->data->mtu = min_t(u16, enic->max_mtu,
+max_t(u16, ENIC_MIN_MTU, c->mtu));

c->wq_desc_count =
min_t(u32, ENIC_MAX_WQ_DESCS,
@@ -96,21 +110,16 @@ int enic_get_vnic_config(struct enic *enic)
c->rq_desc_count));
c->rq_desc_count &= 0xffe0; /* must be aligned to groups of 32 */

-   if (c->mtu == 0)
-   c->mtu = 1500;
-   c->mtu = min_t(u16, ENIC_MAX_MTU,
-   max_t(u16, ENIC_MIN_MTU,
-   c->mtu));
-
c->intr_timer_usec = min_t(u32, c->intr_timer_usec,
vnic_dev_get_intr_coal_timer_max(enic->vdev));

dev_info(enic_get_dev(enic),
"vNIC MAC addr %02x:%02x:%02x:%02x:%02x:%02x "
-   "wq/rq %d/%d mtu %d\n",
+   "wq/rq %d/%d mtu %d, max mtu:%d\n",
enic->mac_addr[0], enic->mac_addr[1], enic->mac_addr[2],
enic->mac_addr[3], enic->mac_addr[4], enic->mac_addr[5],
-   c->wq_desc_count, c->rq_desc_count, c->mtu);
+   c->wq_desc_count, c->rq_desc_count,
+   enic->rte_dev->data->mtu, enic->max_mtu);
dev_info(enic_get_dev(enic), "vNIC csum tx/rx %s/%s "
"rss %s intr mode %s type %s timer %d usec "
"loopback tag 0x%04x\n",
diff --git a/drivers/net/enic/enic_res.h b/drivers/net/enic/enic_res.h
index 3c8e303..303530e 100644
--- a/drivers/net/enic/enic_res.h
+++ b/drivers/net/enic/enic_res.h
@@ -46,7 +46,9 @@
 #define ENIC_MAX_RQ_DESCS  4096

 #define ENIC_MIN_MTU   68
-#define ENIC_MAX_MTU   9000
+
+/* Does not include (possible) inserted VLAN tag and FCS */
+#define ENIC_DEFAULT_MAX_PKT_SIZE  9022

 #define ENIC_MULTICAST_PERFECT_FILTERS 32
 #define ENIC_UNICAST_PERFECT_FILTERS   32
-- 
2.7.0
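The arithmetic in enic_get_vnic_config() above reduces to two small helpers. A standalone sketch using the constants from the patch, with ETHER_HDR_LEN taken as 14 and the extra 4 bytes being the 802.1Q tag:

```c
#include <assert.h>
#include <stdint.h>

#define ENIC_MIN_MTU              68
#define ENIC_DEFAULT_MAX_PKT_SIZE 9022
#define ETHER_HDR_LEN             14

/* Derive max_mtu from the firmware-reported max packet size
 * (0 or small on legacy VICs, which fall back to the default). */
static uint16_t enic_calc_max_mtu(uint32_t max_pkt_size)
{
    if (max_pkt_size <= ENIC_DEFAULT_MAX_PKT_SIZE)
        max_pkt_size = ENIC_DEFAULT_MAX_PKT_SIZE;
    /* subtract L2 header plus 4-byte 802.1Q tag */
    return (uint16_t)(max_pkt_size - (ETHER_HDR_LEN + 4));
}

/* Clamp the initial MTU: min_t(max_mtu, max_t(ENIC_MIN_MTU, config)),
 * with a 1500 fallback when the firmware reports 0. */
static uint16_t enic_initial_mtu(uint16_t max_mtu, uint16_t config_mtu)
{
    if (config_mtu == 0)
        config_mtu = 1500;
    if (config_mtu < ENIC_MIN_MTU)
        config_mtu = ENIC_MIN_MTU;
    return config_mtu < max_mtu ? config_mtu : max_mtu;
}
```

So a legacy VIC reporting max_pkt_size 0 ends up with max_mtu 9004 (9022 - 18).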



[dpdk-dev] [PATCH 1/4] enic: enable NIC max packet size discovery

2016-06-16 Thread John Daley
Pull in common VNIC code which enables querying for max egress
packet size.

Signed-off-by: John Daley 
---
There are some unrelated fields and defines in this file because
it is shared with other drivers and interfaces to the VIC.

 drivers/net/enic/base/vnic_enet.h | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/enic/base/vnic_enet.h b/drivers/net/enic/base/vnic_enet.h
index cc34998..5062247 100644
--- a/drivers/net/enic/base/vnic_enet.h
+++ b/drivers/net/enic/base/vnic_enet.h
@@ -35,6 +35,10 @@
 #ifndef _VNIC_ENIC_H_
 #define _VNIC_ENIC_H_

+/* Hardware intr coalesce timer is in units of 1.5us */
+#define INTR_COALESCE_USEC_TO_HW(usec) ((usec) * 2 / 3)
+#define INTR_COALESCE_HW_TO_USEC(usec) ((usec) * 3 / 2)
+
 /* Device-specific region: enet configuration */
 struct vnic_enet_config {
u32 flags;
@@ -50,6 +54,12 @@ struct vnic_enet_config {
u16 vf_rq_count;
u16 num_arfs;
u64 mem_paddr;
+   u16 rdma_qp_id;
+   u16 rdma_qp_count;
+   u16 rdma_resgrp;
+   u32 rdma_mr_id;
+   u32 rdma_mr_count;
+   u32 max_pkt_size;
 };

 #define VENETF_TSO 0x1 /* TSO enabled */
@@ -64,9 +74,14 @@ struct vnic_enet_config {
 #define VENETF_RSSHASH_IPV6_EX 0x200   /* Hash on IPv6 extended fields */
 #define VENETF_RSSHASH_TCPIPV6_EX 0x400 /* Hash on TCP + IPv6 ext. fields */
 #define VENETF_LOOP0x800   /* Loopback enabled */
-#define VENETF_VMQ 0x4000  /* using VMQ flag for VMware NETQ */
+#define VENETF_FAILOVER0x1000  /* Fabric failover enabled */
+#define VENETF_USPACE_NIC   0x2000 /* vHPC enabled */
+#define VENETF_VMQ  0x4000 /* VMQ enabled */
+#define VENETF_ARFS0x8000  /* ARFS enabled */
 #define VENETF_VXLAN0x1 /* VxLAN offload */
 #define VENETF_NVGRE0x2 /* NVGRE offload */
+#define VENETF_GRPINTR  0x4 /* group interrupt */
+
 #define VENET_INTR_TYPE_MIN0   /* Timer specs min interrupt spacing */
 #define VENET_INTR_TYPE_IDLE   1   /* Timer specs idle time before irq */

-- 
2.7.0
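The two conversion macros added by this patch encode the fact that the hardware coalescing timer ticks in 1.5 us units, hence the *2/3 and *3/2. They can be sanity-checked standalone; note that integer division truncates, so a usec -> hw -> usec round trip may lose precision:

```c
#include <assert.h>

/* Hardware intr coalesce timer is in units of 1.5us (from the patch). */
#define INTR_COALESCE_USEC_TO_HW(usec) ((usec) * 2 / 3)
#define INTR_COALESCE_HW_TO_USEC(hw)   ((hw) * 3 / 2)
```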



[dpdk-dev] [PATCH 0/4] enic: enable MTU update callback

2016-06-16 Thread John Daley
This patchset determines the max egress packet size allowed on the
NIC and uses it to set an upper limit for MTU. An MTU update function
is added, but only works if Rx scatter is disabled. If Rx scatter is
enabled, -ENOTSUP is returned. Another patch with Rx scatter support will
come later.

These patches should apply cleanly to dpdk-net-next rel_16_07 or on the
enic Rx scatter patch http://www.dpdk.org/dev/patchwork/patch/13933/

John Daley (4):
  enic: enable NIC max packet size discovery
  enic: set the max allowed MTU for the NIC
  enic: add an update MTU function for non-Rx scatter mode
  doc: add MTU update to feature matrix for enic

 doc/guides/nics/overview.rst  |  2 +-
 drivers/net/enic/base/vnic_enet.h | 17 ++-
 drivers/net/enic/enic.h   |  2 ++
 drivers/net/enic/enic_ethdev.c| 13 ++--
 drivers/net/enic/enic_main.c  | 44 +++
 drivers/net/enic/enic_res.c   | 25 +++---
 drivers/net/enic/enic_res.h   |  4 +++-
 7 files changed, 94 insertions(+), 13 deletions(-)

-- 
2.7.0



[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Take Ceara
On Thu, Jun 16, 2016 at 9:33 PM, Wiles, Keith  wrote:
> On 6/16/16, 1:20 PM, "Take Ceara"  wrote:
>
>>On Thu, Jun 16, 2016 at 6:59 PM, Wiles, Keith  
>>wrote:
>>>
>>> On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith" >> dpdk.org on behalf of keith.wiles at intel.com> wrote:
>>>

On 6/16/16, 11:20 AM, "Take Ceara"  wrote:

>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  
>wrote:
>
>>
>> Right now I do not know what the issue is with the system. It could be too 
>> many Rx/Tx ring pairs per port limiting the memory in the NICs, 
>> which is why you get better performance when you have 8 cores per port. I 
>> am not really seeing the whole picture of how DPDK is configured, so I 
>> cannot help more. Sorry.
>
>I doubt that there is a limitation wrt running 16 cores per port vs 8
>cores per port as I've tried with two different machines connected
>back to back each with one X710 port and 16 cores on each of them
>running on that port. In that case our performance doubled as
>expected.
>
>>
>> Maybe seeing the DPDK command line would help.
>
>The command line I use with ports 01:00.3 and 81:00.3 is:
>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>
>Our own qmap args allow the user to control exactly how cores are
>split between ports. In this case we end up with:
>
>warp17> show port map
>Port 0[socket: 0]:
>   Core 4[socket:0] (Tx: 0, Rx: 0)
>   Core 5[socket:0] (Tx: 1, Rx: 1)
>   Core 6[socket:0] (Tx: 2, Rx: 2)
>   Core 7[socket:0] (Tx: 3, Rx: 3)
>   Core 8[socket:0] (Tx: 4, Rx: 4)
>   Core 9[socket:0] (Tx: 5, Rx: 5)
>   Core 20[socket:0] (Tx: 6, Rx: 6)
>   Core 21[socket:0] (Tx: 7, Rx: 7)
>   Core 22[socket:0] (Tx: 8, Rx: 8)
>   Core 23[socket:0] (Tx: 9, Rx: 9)
>   Core 24[socket:0] (Tx: 10, Rx: 10)
>   Core 25[socket:0] (Tx: 11, Rx: 11)
>   Core 26[socket:0] (Tx: 12, Rx: 12)
>   Core 27[socket:0] (Tx: 13, Rx: 13)
>   Core 28[socket:0] (Tx: 14, Rx: 14)
>   Core 29[socket:0] (Tx: 15, Rx: 15)
>
>Port 1[socket: 1]:
>   Core 10[socket:1] (Tx: 0, Rx: 0)
>   Core 11[socket:1] (Tx: 1, Rx: 1)
>   Core 12[socket:1] (Tx: 2, Rx: 2)
>   Core 13[socket:1] (Tx: 3, Rx: 3)
>   Core 14[socket:1] (Tx: 4, Rx: 4)
>   Core 15[socket:1] (Tx: 5, Rx: 5)
>   Core 16[socket:1] (Tx: 6, Rx: 6)
>   Core 17[socket:1] (Tx: 7, Rx: 7)
>   Core 18[socket:1] (Tx: 8, Rx: 8)
>   Core 19[socket:1] (Tx: 9, Rx: 9)
>   Core 30[socket:1] (Tx: 10, Rx: 10)
>   Core 31[socket:1] (Tx: 11, Rx: 11)
>   Core 32[socket:1] (Tx: 12, Rx: 12)
>   Core 33[socket:1] (Tx: 13, Rx: 13)
>   Core 34[socket:1] (Tx: 14, Rx: 14)
>   Core 35[socket:1] (Tx: 15, Rx: 15)

On each socket you have 10 physical cores or 20 lcores per socket for 40 
lcores total.

The above lists LCORES (i.e. hyper-threads), not COREs, though I know some 
like to treat the two as interchangeable. Hyper-threads are logically 
interchangeable, but not performance-wise. If you have two 
run-to-completion threads on a single physical core, each on a different 
hyper-thread of that core [0,1], then the second lcore or thread (1) on 
that physical core will get at most about 20-30% of the CPU cycles, and 
normally much less, unless you tune the code so the threads do not contend 
for the internal execution units; even then, some internal execution units 
are always shared.

To get the best performance when hyper-threading is enabled, do not run 
both threads on a single physical core; run only hyper-thread 0.

The table below lists the physical core id and each of its lcore ids per 
socket. Use the first lcore of each physical core for the best 
performance:
Core 1 [1, 21][11, 31]
Use lcore 1 or 11 depending on the socket you are on.
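Given the N/N+20 sibling numbering shown above (lcore N and lcore N+20 are the two hyper-threads of one physical core, e.g. [1,21] and [11,31]), a mask that keeps only hyper-thread 0 of every physical core is just the low 20 bits. A minimal sketch under that assumption, not warp17 or DPDK code:

```c
#include <assert.h>
#include <stdint.h>

/* 10 physical cores per socket x 2 sockets, per the thread above. */
#define NUM_PHYS_CORES 20

/* Drop the second hyper-thread of every physical core from an lcore
 * mask, keeping only hyper-thread 0 (lcore bits 0..19). */
static uint64_t keep_ht0_only(uint64_t lcore_mask)
{
    const uint64_t ht0_bits = (1ULL << NUM_PHYS_CORES) - 1;
    return lcore_mask & ht0_bits;
}
```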

The configuration below most likely gives the best performance and 
utilization of your system, if I got the values right:

./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
--qmap 0.0x0003FE --qmap 1.0x0FFE00

Port 0[socket: 0]:
   Core 2[socket:0] (Tx: 0, Rx: 0)
   Core 3[socket:0] (Tx: 1, Rx: 1)
   Core 4[socket:0] (Tx: 2, Rx: 2)
   Core 5[socket:0] (Tx: 3, Rx: 3)
   Core 6[socket:0] (Tx: 4, Rx: 4)
   Core 7[socket:0] (Tx: 5, Rx: 5)
   Core 8[socket:0] (Tx: 6, Rx: 6)
   Core 9[socket:0] (Tx: 7, Rx: 7)

8 cores on first socket leaving 0-1 lcores for Linux.
>>>
>>> 9 cores and leaving the first core or two lcores for Linux

Port 1[socket: 1]:
   Core 10[socket:1] (Tx: 0, Rx: 0)
   Core 11[socket:1] (Tx: 1, Rx: 1)
   Core 12[socket:1] (Tx: 2, Rx: 2)
   Core 13[socket:1] (Tx: 3, Rx: 3)
   

[dpdk-dev] [PATCH v4] e1000: configure VLAN TPID

2016-06-16 Thread Beilei Xing
This patch enables configuring the outer TPID for double VLAN.
Note that all other TPID values are read-only.

Signed-off-by: Beilei Xing 
---
v4 changes:
 Optimize the code to be more readable.
v3 changes:
 Update commit log and comments.
v2 changes:
 Modify return value. Because the inner TPID is not supported by single
 VLAN, return -ENOTSUP.
 Add return value. If want to set inner TPID of double VLAN or set
 outer TPID of single VLAN, return -ENOTSUP.

 drivers/net/e1000/igb_ethdev.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index f0921ee..0ed95c8 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -86,6 +86,13 @@
 #define E1000_INCVALUE_82576 (16 << IGB_82576_TSYNC_SHIFT)
 #define E1000_TSAUXC_DISABLE_SYSTIME 0x8000

+/* External VLAN Enable bit mask */
+#define E1000_CTRL_EXT_EXT_VLAN  (1 << 26)
+
+/* External VLAN Ether Type bit mask and shift */
+#define E1000_VET_VET_EXT  0xFFFF0000
+#define E1000_VET_VET_EXT_SHIFT  16
+
 static int  eth_igb_configure(struct rte_eth_dev *dev);
 static int  eth_igb_start(struct rte_eth_dev *dev);
 static void eth_igb_stop(struct rte_eth_dev *dev);
@@ -2237,21 +2244,25 @@ eth_igb_vlan_tpid_set(struct rte_eth_dev *dev,
 {
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   uint32_t reg = ETHER_TYPE_VLAN;
-   int ret = 0;
+   uint32_t reg, qinq;
+
+   qinq = E1000_READ_REG(hw, E1000_CTRL_EXT);
+   qinq &= E1000_CTRL_EXT_EXT_VLAN;

-   switch (vlan_type) {
-   case ETH_VLAN_TYPE_INNER:
-   reg |= (tpid << 16);
+   /* only outer TPID of double VLAN can be configured */
+   if (qinq && vlan_type == ETH_VLAN_TYPE_OUTER) {
+   reg = E1000_READ_REG(hw, E1000_VET);
+   reg = (reg & (~E1000_VET_VET_EXT)) |
+   ((uint32_t)tpid << E1000_VET_VET_EXT_SHIFT);
E1000_WRITE_REG(hw, E1000_VET, reg);
-   break;
-   default:
-   ret = -EINVAL;
-   PMD_DRV_LOG(ERR, "Unsupported vlan type %d\n", vlan_type);
-   break;
+
+   return 0;
}

-   return ret;
+   /* all other TPID values are read-only */
+   PMD_DRV_LOG(ERR, "Not supported");
+
+   return -ENOTSUP;
 }

 static void
-- 
2.5.0
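The read-modify-write the patch performs on the E1000_VET register can be checked in isolation. In this standalone sketch the mask value 0xFFFF0000 is an assumption derived from the 16-bit shift in the patch (the outer TPID occupying VET[31:16]):

```c
#include <assert.h>
#include <stdint.h>

#define VET_EXT_MASK  0xFFFF0000u /* assumed: outer TPID in VET[31:16] */
#define VET_EXT_SHIFT 16

/* Replace only the upper 16 bits (outer/extended TPID) of the register
 * value, preserving the inner TPID in the lower half. */
static uint32_t vet_set_outer_tpid(uint32_t vet, uint16_t tpid)
{
    return (vet & ~VET_EXT_MASK) | ((uint32_t)tpid << VET_EXT_SHIFT);
}
```

For instance, writing the QinQ ethertype 0x88A8 over a register holding 0x00008100 leaves the inner 0x8100 intact.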



[dpdk-dev] [PATCH v10 3/7] ethdev: add new fields to ethdev info struct

2016-06-16 Thread Thomas Monjalon
2016-06-15 15:06, Reshma Pattan:
> The new fields nb_rx_queues and nb_tx_queues are added to the
> rte_eth_dev_info structure.
> Changes to API rte_eth_dev_info_get() are done to update these new fields
> to the rte_eth_dev_info object.

The ABI is changed, not the API.

> Release notes is updated with the changes.
[...]
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -137,4 +137,5 @@ DPDK_16.07 {
>   global:
>  
>   rte_eth_add_first_rx_callback;
> + rte_eth_dev_info_get;
>  } DPDK_16.04;

Why duplicating this symbol in 16.07?
The ABI is broken anyway.
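For context, a symbol that was already exported in an earlier release normally stays only under its original version node in the linker map; only genuinely new symbols go under the new node. An illustrative fragment (not the actual rte_ether_version.map contents):

```
DPDK_16.04 {
	global:

	rte_eth_dev_info_get;
	...
};

DPDK_16.07 {
	global:

	rte_eth_add_first_rx_callback;
} DPDK_16.04;
```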


[dpdk-dev] [PATCH 4/4] app/test: typo fixing

2016-06-16 Thread Jain, Deepak K
Fix typos in the performance tests, for example preftest to perftest.

Signed-off-by: Jain, Deepak K 
---
 app/test/test_cryptodev_perf.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/app/test/test_cryptodev_perf.c b/app/test/test_cryptodev_perf.c
index 6c43a93..903529f 100644
--- a/app/test/test_cryptodev_perf.c
+++ b/app/test/test_cryptodev_perf.c
@@ -208,7 +208,7 @@ setup_test_string(struct rte_mempool *mpool,

 static struct crypto_testsuite_params testsuite_params = { NULL };
 static struct crypto_unittest_params unittest_params;
-static enum rte_cryptodev_type gbl_cryptodev_preftest_devtype;
+static enum rte_cryptodev_type gbl_cryptodev_perftest_devtype;

 static int
 testsuite_setup(void)
@@ -245,7 +245,7 @@ testsuite_setup(void)
}

/* Create 2 AESNI MB devices if required */
-   if (gbl_cryptodev_preftest_devtype == RTE_CRYPTODEV_AESNI_MB_PMD) {
+   if (gbl_cryptodev_perftest_devtype == RTE_CRYPTODEV_AESNI_MB_PMD) {
nb_devs = rte_cryptodev_count_devtype(RTE_CRYPTODEV_AESNI_MB_PMD);
if (nb_devs < 2) {
for (i = nb_devs; i < 2; i++) {
@@ -260,7 +260,7 @@ testsuite_setup(void)
}

/* Create 2 SNOW3G devices if required */
-   if (gbl_cryptodev_preftest_devtype == RTE_CRYPTODEV_SNOW3G_PMD) {
+   if (gbl_cryptodev_perftest_devtype == RTE_CRYPTODEV_SNOW3G_PMD) {
nb_devs = rte_cryptodev_count_devtype(RTE_CRYPTODEV_SNOW3G_PMD);
if (nb_devs < 2) {
for (i = nb_devs; i < 2; i++) {
@@ -283,7 +283,7 @@ testsuite_setup(void)
/* Search for the first valid */
for (i = 0; i < nb_devs; i++) {
rte_cryptodev_info_get(i, );
-   if (info.dev_type == gbl_cryptodev_preftest_devtype) {
+   if (info.dev_type == gbl_cryptodev_perftest_devtype) {
ts_params->dev_id = i;
valid_dev_id = 1;
break;
@@ -1956,7 +1956,7 @@ test_perf_crypto_qp_vary_burst_size(uint16_t dev_num)
}

while (num_received != num_to_submit) {
-   if (gbl_cryptodev_preftest_devtype ==
+   if (gbl_cryptodev_perftest_devtype ==
RTE_CRYPTODEV_AESNI_MB_PMD)
rte_cryptodev_enqueue_burst(dev_num, 0,
NULL, 0);
@@ -2028,7 +2028,7 @@ test_perf_snow3G_optimise_cyclecount(struct perf_test_params *pparams)

printf("\nOn %s dev%u qp%u, %s, cipher algo:%s, auth_algo:%s, "
"Packet Size %u bytes",
-   pmd_name(gbl_cryptodev_preftest_devtype),
+   pmd_name(gbl_cryptodev_perftest_devtype),
ts_params->dev_id, 0,
chain_mode_name(pparams->chain),
cipher_algo_name(pparams->cipher_algo),
@@ -2072,7 +2072,7 @@ test_perf_snow3G_optimise_cyclecount(struct perf_test_params *pparams)
}

while (num_ops_received != num_to_submit) {
-   if (gbl_cryptodev_preftest_devtype ==
+   if (gbl_cryptodev_perftest_devtype ==
RTE_CRYPTODEV_AESNI_MB_PMD)
rte_cryptodev_enqueue_burst(ts_params->dev_id, 0,
NULL, 0);
@@ -2680,7 +2680,7 @@ test_perf_snow3g(uint8_t dev_id, uint16_t queue_id,
double cycles_B = cycles_buff / pparams->buf_size;
double throughput = (ops_s * pparams->buf_size * 8) / 100;

-   if (gbl_cryptodev_preftest_devtype == RTE_CRYPTODEV_QAT_SYM_PMD) {
+   if (gbl_cryptodev_perftest_devtype == RTE_CRYPTODEV_QAT_SYM_PMD) {
/* Cycle count misleading on HW devices for this test, so don't print */
printf("%4u\t%6.2f\t%10.2f\t n/a \t\t n/a "
"\t\t n/a \t\t%8"PRIu64"\t%8"PRIu64,
@@ -2824,7 +2824,7 @@ test_perf_snow3G_vary_pkt_size(void)
for (k = 0; k < RTE_DIM(burst_sizes); k++) {
printf("\nOn %s dev%u qp%u, %s, "
"cipher algo:%s, auth algo:%s, burst_size: %d ops",
-   pmd_name(gbl_cryptodev_preftest_devtype),
+   pmd_name(gbl_cryptodev_perftest_devtype),
testsuite_params.dev_id, 0,
chain_mode_name(params_set[i].chain),
cipher_algo_name(params_set[i].cipher_algo),
@@ -2893,7 +2893,7 @@ static struct unit_test_suite cryptodev_snow3g_testsuite = {
 static int
 perftest_aesni_mb_cryptodev(void /*argv __rte_unused, int argc __rte_unused*/)
 {
-   gbl_cryptodev_preftest_devtype = RTE_CRYPTODEV_AESNI_MB_PMD;
+   

[dpdk-dev] [PATCH 3/4] app/test: updating AES SHA performance test

2016-06-16 Thread Jain, Deepak K
From: Fiona Trahe 

Update the AES performance test in line with the snow3g performance test.
The output format has been updated to give a better understanding of the numbers.

Signed-off-by: Fiona Trahe 
Signed-off-by: Jain, Deepak K 
---
 app/test/test_cryptodev.h  |   2 +
 app/test/test_cryptodev_perf.c | 551 +++--
 2 files changed, 370 insertions(+), 183 deletions(-)

diff --git a/app/test/test_cryptodev.h b/app/test/test_cryptodev.h
index d549eca..382802c 100644
--- a/app/test/test_cryptodev.h
+++ b/app/test/test_cryptodev.h
@@ -64,7 +64,9 @@
 #define AES_XCBC_MAC_KEY_SZ(16)

 #define TRUNCATED_DIGEST_BYTE_LENGTH_SHA1  (12)
+#define TRUNCATED_DIGEST_BYTE_LENGTH_SHA224(16)
 #define TRUNCATED_DIGEST_BYTE_LENGTH_SHA256(16)
+#define TRUNCATED_DIGEST_BYTE_LENGTH_SHA384(24)
 #define TRUNCATED_DIGEST_BYTE_LENGTH_SHA512(32)

 #endif /* TEST_CRYPTODEV_H_ */
diff --git a/app/test/test_cryptodev_perf.c b/app/test/test_cryptodev_perf.c
index 06148d0..6c43a93 100644
--- a/app/test/test_cryptodev_perf.c
+++ b/app/test/test_cryptodev_perf.c
@@ -492,12 +492,11 @@ const char plaintext_quote[] =
 #define CIPHER_KEY_LENGTH_AES_CBC  (16)
 #define CIPHER_IV_LENGTH_AES_CBC   (CIPHER_KEY_LENGTH_AES_CBC)

-
-static uint8_t aes_cbc_key[] = {
+static uint8_t aes_cbc_128_key[] = {
0xE4, 0x23, 0x33, 0x8A, 0x35, 0x64, 0x61, 0xE2,
0xF1, 0x35, 0x5C, 0x3B, 0xDD, 0x9A, 0x65, 0xBA };

-static uint8_t aes_cbc_iv[] = {
+static uint8_t aes_cbc_128_iv[] = {
0xf5, 0xd3, 0x89, 0x0f, 0x47, 0x00, 0xcb, 0x52,
0x42, 0x1a, 0x7d, 0x3d, 0xf5, 0x82, 0x80, 0xf1 };

@@ -1846,7 +1845,7 @@ test_perf_crypto_qp_vary_burst_size(uint16_t dev_num)

ut_params->cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_AES_CBC;
ut_params->cipher_xform.cipher.op = RTE_CRYPTO_CIPHER_OP_DECRYPT;
-   ut_params->cipher_xform.cipher.key.data = aes_cbc_key;
+   ut_params->cipher_xform.cipher.key.data = aes_cbc_128_key;
ut_params->cipher_xform.cipher.key.length = CIPHER_IV_LENGTH_AES_CBC;


@@ -1902,7 +1901,7 @@ test_perf_crypto_qp_vary_burst_size(uint16_t dev_num)
op->sym->cipher.iv.phys_addr = rte_pktmbuf_mtophys(m);
op->sym->cipher.iv.length = CIPHER_IV_LENGTH_AES_CBC;

-   rte_memcpy(op->sym->cipher.iv.data, aes_cbc_iv,
+   rte_memcpy(op->sym->cipher.iv.data, aes_cbc_128_iv,
CIPHER_IV_LENGTH_AES_CBC);

op->sym->cipher.data.offset = CIPHER_IV_LENGTH_AES_CBC;
@@ -1985,169 +1984,6 @@ test_perf_crypto_qp_vary_burst_size(uint16_t dev_num)
 }

 static int
-test_perf_AES_CBC_HMAC_SHA256_encrypt_digest_vary_req_size(uint16_t dev_num)
-{
-   uint16_t index;
-   uint32_t burst_sent, burst_received;
-   uint32_t b, num_sent, num_received;
-   uint64_t failed_polls, retries, start_cycles, end_cycles;
-   const uint64_t mhz = rte_get_tsc_hz()/100;
-   double throughput, mmps;
-
-   struct rte_crypto_op *c_ops[DEFAULT_BURST_SIZE];
-   struct rte_crypto_op *proc_ops[DEFAULT_BURST_SIZE];
-
-   struct crypto_testsuite_params *ts_params = _params;
-   struct crypto_unittest_params *ut_params = _params;
-   struct crypto_data_params *data_params = aes_cbc_hmac_sha256_output;
-
-   if (rte_cryptodev_count() == 0) {
-   printf("\nNo crypto devices available. Is kernel driver loaded?\n");
-   return TEST_FAILED;
-   }
-
-   /* Setup Cipher Parameters */
-   ut_params->cipher_xform.type = RTE_CRYPTO_SYM_XFORM_CIPHER;
-   ut_params->cipher_xform.next = _params->auth_xform;
-
-   ut_params->cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_AES_CBC;
-   ut_params->cipher_xform.cipher.op = RTE_CRYPTO_CIPHER_OP_ENCRYPT;
-   ut_params->cipher_xform.cipher.key.data = aes_cbc_key;
-   ut_params->cipher_xform.cipher.key.length = CIPHER_IV_LENGTH_AES_CBC;
-
-   /* Setup HMAC Parameters */
-   ut_params->auth_xform.type = RTE_CRYPTO_SYM_XFORM_AUTH;
-   ut_params->auth_xform.next = NULL;
-
-   ut_params->auth_xform.auth.op = RTE_CRYPTO_AUTH_OP_GENERATE;
-   ut_params->auth_xform.auth.algo = RTE_CRYPTO_AUTH_SHA256_HMAC;
-   ut_params->auth_xform.auth.key.data = hmac_sha256_key;
-   ut_params->auth_xform.auth.key.length = HMAC_KEY_LENGTH_SHA256;
-   ut_params->auth_xform.auth.digest_length = DIGEST_BYTE_LENGTH_SHA256;
-
-   /* Create Crypto session*/
-   ut_params->sess = rte_cryptodev_sym_session_create(ts_params->dev_id,
-   _params->cipher_xform);
-
-   TEST_ASSERT_NOT_NULL(ut_params->sess, "Session creation failed");
-
-   printf("\nThroughput test which will continually attempt to send "
-   "AES128_CBC_SHA256_HMAC requests with a constant burst "
-   "size of %u 

[dpdk-dev] [PATCH 2/4] app/test: adding Snow3g performance test

2016-06-16 Thread Jain, Deepak K
From: Fiona Trahe 

Adding a performance test for the Snow3G wireless algorithm.
The performance test can run on both software and hardware devices.

Signed-off-by: Fiona Trahe 
Signed-off-by: Jain, Deepak K 
Signed-off-by: Declan Doherty 
---
 app/test/test_cryptodev.h  |   2 +-
 app/test/test_cryptodev_perf.c | 688 -
 2 files changed, 688 insertions(+), 2 deletions(-)

diff --git a/app/test/test_cryptodev.h b/app/test/test_cryptodev.h
index 6059a01..d549eca 100644
--- a/app/test/test_cryptodev.h
+++ b/app/test/test_cryptodev.h
@@ -46,7 +46,7 @@
 #define DEFAULT_BURST_SIZE  (64)
 #define DEFAULT_NUM_XFORMS  (2)
 #define NUM_MBUFS   (8191)
-#define MBUF_CACHE_SIZE (250)
+#define MBUF_CACHE_SIZE (256)
 #define MBUF_DATAPAYLOAD_SIZE  (2048 + DIGEST_BYTE_LENGTH_SHA512)
 #define MBUF_SIZE  (sizeof(struct rte_mbuf) + \
RTE_PKTMBUF_HEADROOM + MBUF_DATAPAYLOAD_SIZE)
diff --git a/app/test/test_cryptodev_perf.c b/app/test/test_cryptodev_perf.c
index b3f4fd9..06148d0 100644
--- a/app/test/test_cryptodev_perf.c
+++ b/app/test/test_cryptodev_perf.c
@@ -58,6 +58,25 @@ struct crypto_testsuite_params {
uint8_t dev_id;
 };

+enum chain_mode {
+   CIPHER_HASH,
+   HASH_CIPHER,
+   CIPHER_ONLY,
+   HASH_ONLY
+};
+
+struct perf_test_params {
+
+   unsigned total_operations;
+   unsigned burst_size;
+   unsigned buf_size;
+
+   enum chain_mode chain;
+
+   enum rte_crypto_cipher_algorithm cipher_algo;
+   unsigned cipher_key_length;
+   enum rte_crypto_auth_algorithm auth_algo;
+};

 #define MAX_NUM_OF_OPS_PER_UT  (128)

@@ -75,6 +94,98 @@ struct crypto_unittest_params {
uint8_t *digest;
 };

+static struct rte_cryptodev_sym_session *
+test_perf_create_snow3g_session(uint8_t dev_id, enum chain_mode chain,
+   enum rte_crypto_cipher_algorithm cipher_algo, unsigned cipher_key_len,
+   enum rte_crypto_auth_algorithm auth_algo);
+static struct rte_mbuf *
+test_perf_create_pktmbuf(struct rte_mempool *mpool, unsigned buf_sz);
+static inline struct rte_crypto_op *
+test_perf_set_crypto_op_snow3g(struct rte_crypto_op *op, struct rte_mbuf *m,
+   struct rte_cryptodev_sym_session *sess, unsigned data_len,
+   unsigned digest_len);
+static uint32_t get_auth_digest_length(enum rte_crypto_auth_algorithm algo);
+
+
+static const char *chain_mode_name(enum chain_mode mode)
+{
+   switch (mode) {
+   case CIPHER_HASH: return "cipher_hash"; break;
+   case HASH_CIPHER: return "hash_cipher"; break;
+   case CIPHER_ONLY: return "cipher_only"; break;
+   case HASH_ONLY: return "hash_only"; break;
+   default: return ""; break;
+   }
+}
+
+static const char *pmd_name(enum rte_cryptodev_type pmd)
+{
+   switch (pmd) {
+   case RTE_CRYPTODEV_NULL_PMD: return CRYPTODEV_NAME_NULL_PMD; break;
+   case RTE_CRYPTODEV_AESNI_GCM_PMD:
+   return CRYPTODEV_NAME_AESNI_GCM_PMD;
+   case RTE_CRYPTODEV_AESNI_MB_PMD:
+   return CRYPTODEV_NAME_AESNI_MB_PMD;
+   case RTE_CRYPTODEV_QAT_SYM_PMD:
+   return CRYPTODEV_NAME_QAT_SYM_PMD;
+   case RTE_CRYPTODEV_SNOW3G_PMD:
+   return CRYPTODEV_NAME_SNOW3G_PMD;
+   default:
+   return "";
+   }
+}
+
+static const char *cipher_algo_name(enum rte_crypto_cipher_algorithm cipher_algo)
+{
+   switch (cipher_algo) {
+   case RTE_CRYPTO_CIPHER_NULL: return "NULL";
+   case RTE_CRYPTO_CIPHER_3DES_CBC: return "3DES_CBC";
+   case RTE_CRYPTO_CIPHER_3DES_CTR: return "3DES_CTR";
+   case RTE_CRYPTO_CIPHER_3DES_ECB: return "3DES_ECB";
+   case RTE_CRYPTO_CIPHER_AES_CBC: return "AES_CBC";
+   case RTE_CRYPTO_CIPHER_AES_CCM: return "AES_CCM";
+   case RTE_CRYPTO_CIPHER_AES_CTR: return "AES_CTR";
+   case RTE_CRYPTO_CIPHER_AES_ECB: return "AES_ECB";
+   case RTE_CRYPTO_CIPHER_AES_F8: return "AES_F8";
+   case RTE_CRYPTO_CIPHER_AES_GCM: return "AES_GCM";
+   case RTE_CRYPTO_CIPHER_AES_XTS: return "AES_XTS";
+   case RTE_CRYPTO_CIPHER_ARC4: return "ARC4";
+   case RTE_CRYPTO_CIPHER_KASUMI_F8: return "KASUMI_F8";
+   case RTE_CRYPTO_CIPHER_SNOW3G_UEA2: return "SNOW3G_UEA2";
+   case RTE_CRYPTO_CIPHER_ZUC_EEA3: return "ZUC_EEA3";
+   default: return "Another cipher algo";
+   }
+}
+
+static const char *auth_algo_name(enum rte_crypto_auth_algorithm auth_algo)
+{
+   switch (auth_algo) {
+   case RTE_CRYPTO_AUTH_NULL: return "NULL"; break;
+   case RTE_CRYPTO_AUTH_AES_CBC_MAC: return "AES_CBC_MAC"; break;
+   case RTE_CRYPTO_AUTH_AES_CCM: return "AES_CCM"; break;
+   case RTE_CRYPTO_AUTH_AES_CMAC: return "AES_CMAC"; break;
+   case RTE_CRYPTO_AUTH_AES_GCM: return "AES_GCM"; break;
+   case RTE_CRYPTO_AUTH_AES_GMAC: return "AES_GMAC"; break;
+   

[dpdk-dev] [PATCH 1/4] cryptodev: add rte_crypto_op_bulk_free function

2016-06-16 Thread Jain, Deepak K
From: Declan Doherty 

Adding rte_crypto_op_bulk_free to free the operations in bulk,
which is expected to improve performance.
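The bulk free amortizes the per-operation cost of returning objects to the pool: one function call and one NULL check instead of one per op. A minimal self-contained sketch of the pattern (a toy stack-based pool standing in for rte_mempool; names are illustrative, not DPDK's):

```c
#include <assert.h>
#include <stddef.h>

/* Toy fixed-size object pool standing in for rte_mempool: a stack of
 * free pointers. Illustrative only -- not the DPDK implementation. */
struct toy_pool {
	void *free_stack[128];
	size_t free_count;
};

/* Return one object to the pool (per-op free: one call per object). */
static void toy_pool_put(struct toy_pool *p, void *obj)
{
	p->free_stack[p->free_count++] = obj;
}

/* Return nb objects in one call, mirroring the shape of
 * rte_crypto_op_bulk_free() / rte_mempool_put_bulk(): one entry point
 * and one NULL check instead of nb separate per-op calls. */
static void toy_pool_put_bulk(struct toy_pool *p, void **objs, size_t nb)
{
	size_t i;

	if (objs == NULL)
		return;
	for (i = 0; i < nb; i++)
		p->free_stack[p->free_count++] = objs[i];
}
```

In the real API the win is larger, since rte_mempool_put_bulk can take the mempool's internal lock/cache path once for the whole batch.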

Signed-off-by: Declan Doherty 
---
 lib/librte_cryptodev/rte_crypto.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/lib/librte_cryptodev/rte_crypto.h b/lib/librte_cryptodev/rte_crypto.h
index 5bc3eaa..31abbdc 100644
--- a/lib/librte_cryptodev/rte_crypto.h
+++ b/lib/librte_cryptodev/rte_crypto.h
@@ -328,6 +328,21 @@ rte_crypto_op_free(struct rte_crypto_op *op)
 }

 /**
+ * Free crypto operation structures in bulk.
+ * If the operations were allocated from a rte_mempool, they will
+ * be returned to the mempool.
+ *
+ * @param mpool   mempool the operations were allocated from
+ * @param ops     array of crypto operations to free
+ * @param nb_ops  number of operations in the array
+ */
+static inline void
+rte_crypto_op_bulk_free(struct rte_mempool *mpool, struct rte_crypto_op **ops,
+   uint16_t nb_ops)
+{
+   if (ops != NULL)
+   rte_mempool_put_bulk(mpool, (void * const *)ops, nb_ops);
+}
+
+/**
  * Allocate a symmetric crypto operation in the private data of an mbuf.
  *
  * @param  m   mbuf which is associated with the crypto operation, the
-- 
2.5.5



[dpdk-dev] [PATCH 0/4] Extending cryptodev Performance tests

2016-06-16 Thread Jain, Deepak K
Performance tests have been extended in this patchset.

Patchset consists of 4 patches:
Patch 1 adds a new function, rte_crypto_op_bulk_free, used in patches 2 and 3.
Patch 2 adds Snow3G performance tests.
Patch 3 updates the existing AES performance test.
Patch 4 fixes typos in perftest names.


Declan Doherty (1):
  cryptodev: add rte_crypto_op_bulk_free function

Fiona Trahe (2):
  app/test: adding Snow3g performance test
  app/test: updating AES SHA performance test

Jain, Deepak K (1):
  app/test: typo fixing

 app/test/test_cryptodev.h |4 +-
 app/test/test_cryptodev_perf.c| 1153 -
 lib/librte_cryptodev/rte_crypto.h |   15 +
 3 files changed, 1030 insertions(+), 142 deletions(-)

-- 
2.5.5



[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

2016-06-16 Thread Thomas Monjalon
2016-06-16 16:41, Iremonger, Bernard:
> Hi Thomas,
> 
> > 2016-06-16 15:32, Bruce Richardson:
> > > On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > > > Why does this particular PMD need spinlocks when doing RX and TX,
> > > > > while other device types do not? How is adding/removing devices
> > > > > from a bonded device different to other control operations that
> > > > > can be done on physical PMDs? Is this not similar to say bringing
> > > > > down or hotplugging out a physical port just before an RX or TX
> > operation takes place?
> > > > > For all other PMDs we rely on the app to synchronise control and
> > > > > data plane operation - why not here?
> > > > >
> > > > > /Bruce
> > > >
> > > > This issue arose during VM live migration testing.
> > > > For VM live migration it is necessary (while traffic is running) to be 
> > > > able to
> > remove a bonded slave device, stop it, close it and detach it.
> > > > If a slave device is removed from a bonded device while traffic is
> > > > running, a segmentation fault may occur in the rx/tx burst function.
> > > > The spinlock has been added to prevent this occurring.
> > > >
> > > > The bonding device already uses a spinlock to synchronise between the
> > add and remove functionality and the slave_link_status_change_monitor
> > code.
> > > >
> > > > Previously testpmd did not allow, stop, close or detach of PMD while
> > > > traffic was running. Testpmd has been modified with the following
> > > > patchset
> > > >
> > > > http://dpdk.org/dev/patchwork/patch/13472/
> > > >
> > > > It now allows stop, close and detach of a PMD provided in it is not
> > forwarding and is not a slave of bonded PMD.
> > > >
> > > I will admit to not being fully convinced, but if nobody else has any
> > > serious objections, and since this patch has been reviewed and acked,
> > > I'm ok to merge it in. I'll do so shortly.
> > 
> > Please hold on.
> > Seeing locks introduced in the Rx/Tx path is an alert.
> > We clearly need a design document to explain where locks can be used and
> > what are the responsibility of the control plane.
> > If everybody agrees in this document that DPDK can have some locks in the
> > fast path, then OK to merge it.
> > 
> > So I would say NACK for 16.07 and maybe postpone to 16.11.
> 
> Looking at the documentation for the bonding PMD.
> 
> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
> 
> In section 10.2 it states the following:
> 
> Bonded devices support the dynamic addition and removal of slave devices
> using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.
> 
> If a slave device is added or removed while traffic is running, there is the 
> possibility of a segmentation fault in the rx/tx burst functions. This is 
> most likely to occur in the round robin bonding mode.
> 
> This patch set fixes what appears to be a bug in the bonding PMD.

It can be fixed by removing this statement in the doc.

One of the design principles of DPDK is to avoid locks in the fast path.

> Performance measurements have been made with this patch set applied and 
> without the patches applied using 64 byte packets. 
> 
> With the patches applied the following drop in performance was observed:
> 
> % drop for fwd+io:0.16%
> % drop for fwd+mac:   0.39%
> 
> This patch set has been reviewed and ack'ed, so I think it should be applied 
> in 16.07

I understand your point of view and I gave mine.
Now we need more opinions from others.


[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Take Ceara
On Thu, Jun 16, 2016 at 6:59 PM, Wiles, Keith  wrote:
>
> On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith"  dpdk.org on behalf of keith.wiles at intel.com> wrote:
>
>>
>>On 6/16/16, 11:20 AM, "Take Ceara"  wrote:
>>
>>>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  
>>>wrote:
>>>

 Right now I do not know what the issue is with the system. Could be too 
 many Rx/Tx ring pairs per port and limiting the memory in the NICs, which 
 is why you get better performance when you have 8 core per port. I am not 
 really seeing the whole picture and how DPDK is configured to help more. 
 Sorry.
>>>
>>>I doubt that there is a limitation wrt running 16 cores per port vs 8
>>>cores per port as I've tried with two different machines connected
>>>back to back each with one X710 port and 16 cores on each of them
>>>running on that port. In that case our performance doubled as
>>>expected.
>>>

 Maybe seeing the DPDK command line would help.
>>>
>>>The command line I use with ports 01:00.3 and 81:00.3 is:
>>>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>>>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>>>
>>>Our own qmap args allow the user to control exactly how cores are
>>>split between ports. In this case we end up with:
>>>
>>>warp17> show port map
>>>Port 0[socket: 0]:
>>>   Core 4[socket:0] (Tx: 0, Rx: 0)
>>>   Core 5[socket:0] (Tx: 1, Rx: 1)
>>>   Core 6[socket:0] (Tx: 2, Rx: 2)
>>>   Core 7[socket:0] (Tx: 3, Rx: 3)
>>>   Core 8[socket:0] (Tx: 4, Rx: 4)
>>>   Core 9[socket:0] (Tx: 5, Rx: 5)
>>>   Core 20[socket:0] (Tx: 6, Rx: 6)
>>>   Core 21[socket:0] (Tx: 7, Rx: 7)
>>>   Core 22[socket:0] (Tx: 8, Rx: 8)
>>>   Core 23[socket:0] (Tx: 9, Rx: 9)
>>>   Core 24[socket:0] (Tx: 10, Rx: 10)
>>>   Core 25[socket:0] (Tx: 11, Rx: 11)
>>>   Core 26[socket:0] (Tx: 12, Rx: 12)
>>>   Core 27[socket:0] (Tx: 13, Rx: 13)
>>>   Core 28[socket:0] (Tx: 14, Rx: 14)
>>>   Core 29[socket:0] (Tx: 15, Rx: 15)
>>>
>>>Port 1[socket: 1]:
>>>   Core 10[socket:1] (Tx: 0, Rx: 0)
>>>   Core 11[socket:1] (Tx: 1, Rx: 1)
>>>   Core 12[socket:1] (Tx: 2, Rx: 2)
>>>   Core 13[socket:1] (Tx: 3, Rx: 3)
>>>   Core 14[socket:1] (Tx: 4, Rx: 4)
>>>   Core 15[socket:1] (Tx: 5, Rx: 5)
>>>   Core 16[socket:1] (Tx: 6, Rx: 6)
>>>   Core 17[socket:1] (Tx: 7, Rx: 7)
>>>   Core 18[socket:1] (Tx: 8, Rx: 8)
>>>   Core 19[socket:1] (Tx: 9, Rx: 9)
>>>   Core 30[socket:1] (Tx: 10, Rx: 10)
>>>   Core 31[socket:1] (Tx: 11, Rx: 11)
>>>   Core 32[socket:1] (Tx: 12, Rx: 12)
>>>   Core 33[socket:1] (Tx: 13, Rx: 13)
>>>   Core 34[socket:1] (Tx: 14, Rx: 14)
>>>   Core 35[socket:1] (Tx: 15, Rx: 15)
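For reference, the qmap/coremask values above are plain bitmasks over lcore ids; e.g. 0x003FF003F0 selects lcores 4-9 and 20-29, matching the port-0 map shown. A small self-contained sketch of the decoding (illustrative only, not WARP17's actual qmap parser):

```c
#include <stdint.h>

/* Decode a qmap/coremask hex value into the list of lcore ids it
 * selects: bit N set means lcore N is assigned to the port. */
static int coremask_to_lcores(uint64_t mask, int *lcores, int max)
{
	int bit, n = 0;

	for (bit = 0; bit < 64 && n < max; bit++)
		if (mask & (UINT64_C(1) << bit))
			lcores[n++] = bit;
	return n;
}
```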
>>
>>On each socket you have 10 physical cores or 20 lcores per socket for 40 
>>lcores total.
>>
>>The above is listing the LCORES (or hyper-threads) and not COREs, which I 
>>understand some like to think they are interchangeable. The problem is the 
>>hyper-threads are logically interchangeable, but not performance wise. If you 
>>have two run-to-completion threads on a single physical core each on a 
>>different hyper-thread of that core [0,1], then the second lcore or thread 
>>(1) on that physical core will only get at most about 30-20% of the CPU 
>>cycles. Normally it is much less, unless you tune the code to make sure each 
>>thread is not trying to share the internal execution units, but some internal 
>>execution units are always shared.
>>
>>To get the best performance when hyper-threading is enabled, do not run both
>>threads on a single physical core; run only hyper-thread 0.
>>
>>In the table below the table lists the physical core id and each of the lcore 
>>ids per socket. Use the first lcore per socket for the best performance:
>>Core 1 [1, 21][11, 31]
>>Use lcore 1 or 11 depending on the socket you are on.
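The lcore-to-physical-core pairing described above can be modeled as follows, assuming the layout in the table (2 sockets x 10 physical cores, hyper-threading on, lcores 0-19 = hyper-thread 0, lcores 20-39 = hyper-thread 1). The actual numbering varies per machine -- check /proc/cpuinfo or DPDK's cpu_layout.py on the target:

```c
/* Assumed topology: 2 sockets x 10 physical cores, HT on (40 lcores).
 * Lcores 0-9 = socket 0 thread 0, 10-19 = socket 1 thread 0,
 * 20-29 = socket 0 thread 1, 30-39 = socket 1 thread 1. */
#define CORES_PER_SOCKET 10
#define NB_SOCKETS       2
#define NB_LCORES        (2 * CORES_PER_SOCKET * NB_SOCKETS)

static int lcore_socket(int lcore)
{
	return (lcore % (CORES_PER_SOCKET * NB_SOCKETS)) / CORES_PER_SOCKET;
}

static int lcore_phys_core(int lcore)
{
	return lcore % CORES_PER_SOCKET
		+ lcore_socket(lcore) * CORES_PER_SOCKET;
}

/* The other hyper-thread sharing the same physical core: per the text,
 * running work on it steals execution units from its sibling. */
static int lcore_sibling(int lcore)
{
	return (lcore + CORES_PER_SOCKET * NB_SOCKETS) % NB_LCORES;
}
```

Under these assumptions, core 1 is the lcore pair [1, 21] on socket 0 and [11, 31] on socket 1, matching the table above.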
>>
>>The info below is most likely the best performance and utilization of your 
>>system. If I got the values right ?
>>
>>./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
>>--qmap 0.0x0003FE --qmap 1.0x0FFE00
>>
>>Port 0[socket: 0]:
>>   Core 2[socket:0] (Tx: 0, Rx: 0)
>>   Core 3[socket:0] (Tx: 1, Rx: 1)
>>   Core 4[socket:0] (Tx: 2, Rx: 2)
>>   Core 5[socket:0] (Tx: 3, Rx: 3)
>>   Core 6[socket:0] (Tx: 4, Rx: 4)
>>   Core 7[socket:0] (Tx: 5, Rx: 5)
>>   Core 8[socket:0] (Tx: 6, Rx: 6)
>>   Core 9[socket:0] (Tx: 7, Rx: 7)
>>
>>8 cores on first socket leaving 0-1 lcores for Linux.
>
> 9 cores and leaving the first core or two lcores for Linux
>>
>>Port 1[socket: 1]:
>>   Core 10[socket:1] (Tx: 0, Rx: 0)
>>   Core 11[socket:1] (Tx: 1, Rx: 1)
>>   Core 12[socket:1] (Tx: 2, Rx: 2)
>>   Core 13[socket:1] (Tx: 3, Rx: 3)
>>   Core 14[socket:1] (Tx: 4, Rx: 4)
>>   Core 15[socket:1] (Tx: 5, Rx: 5)
>>   Core 16[socket:1] (Tx: 6, Rx: 6)
>>   Core 17[socket:1] (Tx: 7, Rx: 7)
>>   Core 18[socket:1] (Tx: 8, Rx: 8)
>>   Core 19[socket:1] (Tx: 9, Rx: 9)
>>
>>All 10 cores on the second socket.

The values were almost right :) But that's because we reserve the
first two lcores 

[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Wiles, Keith

On 6/16/16, 3:16 PM, "dev on behalf of Wiles, Keith"  wrote:

>
>On 6/16/16, 3:00 PM, "Take Ceara"  wrote:
>
>>On Thu, Jun 16, 2016 at 9:33 PM, Wiles, Keith  
>>wrote:
>>> On 6/16/16, 1:20 PM, "Take Ceara"  wrote:
>>>
On Thu, Jun 16, 2016 at 6:59 PM, Wiles, Keith  
wrote:
>
> On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith"  dpdk.org on behalf of keith.wiles at intel.com> wrote:
>
>>
>>On 6/16/16, 11:20 AM, "Take Ceara"  wrote:
>>
>>>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  
>>>wrote:
>>>

 Right now I do not know what the issue is with the system. Could be 
 too many Rx/Tx ring pairs per port and limiting the memory in the 
 NICs, which is why you get better performance when you have 8 core per 
 port. I am not really seeing the whole picture and how DPDK is 
 configured to help more. Sorry.
>>>
>>>I doubt that there is a limitation wrt running 16 cores per port vs 8
>>>cores per port as I've tried with two different machines connected
>>>back to back each with one X710 port and 16 cores on each of them
>>>running on that port. In that case our performance doubled as
>>>expected.
>>>

 Maybe seeing the DPDK command line would help.
>>>
>>>The command line I use with ports 01:00.3 and 81:00.3 is:
>>>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>>>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>>>
>>>Our own qmap args allow the user to control exactly how cores are
>>>split between ports. In this case we end up with:
>>>
>>>warp17> show port map
>>>Port 0[socket: 0]:
>>>   Core 4[socket:0] (Tx: 0, Rx: 0)
>>>   Core 5[socket:0] (Tx: 1, Rx: 1)
>>>   Core 6[socket:0] (Tx: 2, Rx: 2)
>>>   Core 7[socket:0] (Tx: 3, Rx: 3)
>>>   Core 8[socket:0] (Tx: 4, Rx: 4)
>>>   Core 9[socket:0] (Tx: 5, Rx: 5)
>>>   Core 20[socket:0] (Tx: 6, Rx: 6)
>>>   Core 21[socket:0] (Tx: 7, Rx: 7)
>>>   Core 22[socket:0] (Tx: 8, Rx: 8)
>>>   Core 23[socket:0] (Tx: 9, Rx: 9)
>>>   Core 24[socket:0] (Tx: 10, Rx: 10)
>>>   Core 25[socket:0] (Tx: 11, Rx: 11)
>>>   Core 26[socket:0] (Tx: 12, Rx: 12)
>>>   Core 27[socket:0] (Tx: 13, Rx: 13)
>>>   Core 28[socket:0] (Tx: 14, Rx: 14)
>>>   Core 29[socket:0] (Tx: 15, Rx: 15)
>>>
>>>Port 1[socket: 1]:
>>>   Core 10[socket:1] (Tx: 0, Rx: 0)
>>>   Core 11[socket:1] (Tx: 1, Rx: 1)
>>>   Core 12[socket:1] (Tx: 2, Rx: 2)
>>>   Core 13[socket:1] (Tx: 3, Rx: 3)
>>>   Core 14[socket:1] (Tx: 4, Rx: 4)
>>>   Core 15[socket:1] (Tx: 5, Rx: 5)
>>>   Core 16[socket:1] (Tx: 6, Rx: 6)
>>>   Core 17[socket:1] (Tx: 7, Rx: 7)
>>>   Core 18[socket:1] (Tx: 8, Rx: 8)
>>>   Core 19[socket:1] (Tx: 9, Rx: 9)
>>>   Core 30[socket:1] (Tx: 10, Rx: 10)
>>>   Core 31[socket:1] (Tx: 11, Rx: 11)
>>>   Core 32[socket:1] (Tx: 12, Rx: 12)
>>>   Core 33[socket:1] (Tx: 13, Rx: 13)
>>>   Core 34[socket:1] (Tx: 14, Rx: 14)
>>>   Core 35[socket:1] (Tx: 15, Rx: 15)
>>
>>On each socket you have 10 physical cores or 20 lcores per socket for 40 
>>lcores total.
>>
>>The above is listing the LCORES (or hyper-threads) and not COREs, which I 
>>understand some like to think they are interchangeable. The problem is 
>>the hyper-threads are logically interchangeable, but not performance 
>>wise. If you have two run-to-completion threads on a single physical core 
>>each on a different hyper-thread of that core [0,1], then the second 
>>lcore or thread (1) on that physical core will only get at most about 
>>30-20% of the CPU cycles. Normally it is much less, unless you tune the 
>>code to make sure each thread is not trying to share the internal 
>>execution units, but some internal execution units are always shared.
>>
>>To get the best performance when hyper-threading is enabled, do not run
>>both threads on a single physical core; run only hyper-thread 0.
>>
>>In the table below the table lists the physical core id and each of the 
>>lcore ids per socket. Use the first lcore per socket for the best 
>>performance:
>>Core 1 [1, 21][11, 31]
>>Use lcore 1 or 11 depending on the socket you are on.
>>
>>The info below is most likely the best performance and utilization of 
>>your system. If I got the values right ?
>>
>>./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
>>--qmap 0.0x0003FE --qmap 1.0x0FFE00
>>
>>Port 0[socket: 0]:
>>   Core 2[socket:0] (Tx: 0, Rx: 0)
>>   Core 3[socket:0] (Tx: 1, Rx: 1)
>>   Core 4[socket:0] (Tx: 2, Rx: 2)
>>   Core 5[socket:0] (Tx: 3, Rx: 3)
>>   Core 6[socket:0] (Tx: 4, Rx: 4)
>>   Core 7[socket:0] (Tx: 5, Rx: 5)
>>   Core 8[socket:0] (Tx: 6, Rx: 6)
>>   Core 9[socket:0] (Tx: 

[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Wiles, Keith

On 6/16/16, 3:00 PM, "Take Ceara"  wrote:

>On Thu, Jun 16, 2016 at 9:33 PM, Wiles, Keith  wrote:
>> On 6/16/16, 1:20 PM, "Take Ceara"  wrote:
>>
>>>On Thu, Jun 16, 2016 at 6:59 PM, Wiles, Keith  
>>>wrote:

 On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith" >>> dpdk.org on behalf of keith.wiles at intel.com> wrote:

>
>On 6/16/16, 11:20 AM, "Take Ceara"  wrote:
>
>>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  
>>wrote:
>>
>>>
>>> Right now I do not know what the issue is with the system. Could be too 
>>> many Rx/Tx ring pairs per port and limiting the memory in the NICs, 
>>> which is why you get better performance when you have 8 core per port. 
>>> I am not really seeing the whole picture and how DPDK is configured to 
>>> help more. Sorry.
>>
>>I doubt that there is a limitation wrt running 16 cores per port vs 8
>>cores per port as I've tried with two different machines connected
>>back to back each with one X710 port and 16 cores on each of them
>>running on that port. In that case our performance doubled as
>>expected.
>>
>>>
>>> Maybe seeing the DPDK command line would help.
>>
>>The command line I use with ports 01:00.3 and 81:00.3 is:
>>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>>
>>Our own qmap args allow the user to control exactly how cores are
>>split between ports. In this case we end up with:
>>
>>warp17> show port map
>>Port 0[socket: 0]:
>>   Core 4[socket:0] (Tx: 0, Rx: 0)
>>   Core 5[socket:0] (Tx: 1, Rx: 1)
>>   Core 6[socket:0] (Tx: 2, Rx: 2)
>>   Core 7[socket:0] (Tx: 3, Rx: 3)
>>   Core 8[socket:0] (Tx: 4, Rx: 4)
>>   Core 9[socket:0] (Tx: 5, Rx: 5)
>>   Core 20[socket:0] (Tx: 6, Rx: 6)
>>   Core 21[socket:0] (Tx: 7, Rx: 7)
>>   Core 22[socket:0] (Tx: 8, Rx: 8)
>>   Core 23[socket:0] (Tx: 9, Rx: 9)
>>   Core 24[socket:0] (Tx: 10, Rx: 10)
>>   Core 25[socket:0] (Tx: 11, Rx: 11)
>>   Core 26[socket:0] (Tx: 12, Rx: 12)
>>   Core 27[socket:0] (Tx: 13, Rx: 13)
>>   Core 28[socket:0] (Tx: 14, Rx: 14)
>>   Core 29[socket:0] (Tx: 15, Rx: 15)
>>
>>Port 1[socket: 1]:
>>   Core 10[socket:1] (Tx: 0, Rx: 0)
>>   Core 11[socket:1] (Tx: 1, Rx: 1)
>>   Core 12[socket:1] (Tx: 2, Rx: 2)
>>   Core 13[socket:1] (Tx: 3, Rx: 3)
>>   Core 14[socket:1] (Tx: 4, Rx: 4)
>>   Core 15[socket:1] (Tx: 5, Rx: 5)
>>   Core 16[socket:1] (Tx: 6, Rx: 6)
>>   Core 17[socket:1] (Tx: 7, Rx: 7)
>>   Core 18[socket:1] (Tx: 8, Rx: 8)
>>   Core 19[socket:1] (Tx: 9, Rx: 9)
>>   Core 30[socket:1] (Tx: 10, Rx: 10)
>>   Core 31[socket:1] (Tx: 11, Rx: 11)
>>   Core 32[socket:1] (Tx: 12, Rx: 12)
>>   Core 33[socket:1] (Tx: 13, Rx: 13)
>>   Core 34[socket:1] (Tx: 14, Rx: 14)
>>   Core 35[socket:1] (Tx: 15, Rx: 15)
>
>On each socket you have 10 physical cores or 20 lcores per socket for 40 
>lcores total.
>
>The above is listing the LCORES (or hyper-threads) and not COREs, which I 
>understand some like to think they are interchangeable. The problem is the 
>hyper-threads are logically interchangeable, but not performance wise. If 
>you have two run-to-completion threads on a single physical core each on a 
>different hyper-thread of that core [0,1], then the second lcore or thread 
>(1) on that physical core will only get at most about 30-20% of the CPU 
>cycles. Normally it is much less, unless you tune the code to make sure 
>each thread is not trying to share the internal execution units, but some 
>internal execution units are always shared.
>
>To get the best performance when hyper-threading is enabled, do not run
>both threads on a single physical core; run only hyper-thread 0.
>
>In the table below the table lists the physical core id and each of the 
>lcore ids per socket. Use the first lcore per socket for the best 
>performance:
>Core 1 [1, 21][11, 31]
>Use lcore 1 or 11 depending on the socket you are on.
>
>The info below is most likely the best performance and utilization of your 
>system. If I got the values right ?
>
>./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
>--qmap 0.0x0003FE --qmap 1.0x0FFE00
>
>Port 0[socket: 0]:
>   Core 2[socket:0] (Tx: 0, Rx: 0)
>   Core 3[socket:0] (Tx: 1, Rx: 1)
>   Core 4[socket:0] (Tx: 2, Rx: 2)
>   Core 5[socket:0] (Tx: 3, Rx: 3)
>   Core 6[socket:0] (Tx: 4, Rx: 4)
>   Core 7[socket:0] (Tx: 5, Rx: 5)
>   Core 8[socket:0] (Tx: 6, Rx: 6)
>   Core 9[socket:0] (Tx: 7, Rx: 7)
>
>8 cores on first socket leaving 0-1 lcores for Linux.

 9 cores and leaving the first core or two lcores for Linux
>
>Port 1[socket: 1]:
>   

[dpdk-dev] [PATCH] ena: Update PMD to cooperate with latest ENA firmware

2016-06-16 Thread Jan Medala
This patch includes:
* Update of ENA communication layer

* Fixed memory management issue:
  after allocating a memzone it must be zeroed, and it must be
  freed with the dedicated function.

* Added debug area and host information

* Disabling readless communication depending on HW revision

* Allocating coherent memory in a NUMA-node-aware way
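The completion-context change in this patch follows a common hardening pattern: replace asserts with runtime bounds/state checks that return NULL, so callers can fail gracefully instead of crashing in production builds. A minimal self-contained sketch of that pattern (names are illustrative, not ENA's):

```c
#include <stddef.h>

/* Toy completion-context table entry. */
struct ctx {
	int occupied;
};

/* Bounds- and state-checked lookup: returns NULL on an out-of-range id
 * or a context already in use, instead of asserting. Callers must
 * handle NULL (as the patch does in ena_com_init_comp_ctxt). */
static struct ctx *get_ctx(struct ctx *table, size_t depth, size_t id)
{
	if (id >= depth)
		return NULL;	/* id larger than queue depth */
	if (table[id].occupied)
		return NULL;	/* completion context is occupied */
	return &table[id];
}
```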

Signed-off-by: Alexander Matushevsky 
Signed-off-by: Jakub Palider 
Signed-off-by: Jan Medala 
---
 drivers/net/ena/base/ena_com.c  | 254 +++---
 drivers/net/ena/base/ena_com.h  |  82 +++--
 drivers/net/ena/base/ena_defs/ena_admin_defs.h  | 110 +-
 drivers/net/ena/base/ena_defs/ena_eth_io_defs.h | 436 ++--
 drivers/net/ena/base/ena_defs/ena_gen_info.h|   4 +-
 drivers/net/ena/base/ena_eth_com.c  |  42 +--
 drivers/net/ena/base/ena_eth_com.h  |  14 +
 drivers/net/ena/base/ena_plat_dpdk.h|  42 ++-
 drivers/net/ena/ena_ethdev.c| 268 ++-
 drivers/net/ena/ena_ethdev.h|  40 +++
 10 files changed, 674 insertions(+), 618 deletions(-)

diff --git a/drivers/net/ena/base/ena_com.c b/drivers/net/ena/base/ena_com.c
index a21a951..4431346 100644
--- a/drivers/net/ena/base/ena_com.c
+++ b/drivers/net/ena/base/ena_com.c
@@ -42,9 +42,6 @@
 #define ENA_ASYNC_QUEUE_DEPTH 4
 #define ENA_ADMIN_QUEUE_DEPTH 32

-#define ENA_EXTENDED_STAT_GET_FUNCT(_funct_queue) (_funct_queue & 0x)
-#define ENA_EXTENDED_STAT_GET_QUEUE(_funct_queue) (_funct_queue >> 16)
-
 #define MIN_ENA_VER (((ENA_COMMON_SPEC_VERSION_MAJOR) << \
ENA_REGS_VERSION_MAJOR_VERSION_SHIFT) \
| (ENA_COMMON_SPEC_VERSION_MINOR))
@@ -201,12 +198,16 @@ static inline void comp_ctxt_release(struct ena_com_admin_queue *queue,
 static struct ena_comp_ctx *get_comp_ctxt(struct ena_com_admin_queue *queue,
  u16 command_id, bool capture)
 {
-   ENA_ASSERT(command_id < queue->q_depth,
-  "command id is larger than the queue size. cmd_id: %u queue size %d\n",
-  command_id, queue->q_depth);
+   if (unlikely(command_id >= queue->q_depth)) {
+   ena_trc_err("command id is larger than the queue size. cmd_id: %u queue size %d\n",
+   command_id, queue->q_depth);
+   return NULL;
+   }

-   ENA_ASSERT(!(queue->comp_ctx[command_id].occupied && capture),
-  "Completion context is occupied");
+   if (unlikely(queue->comp_ctx[command_id].occupied && capture)) {
+   ena_trc_err("Completion context is occupied\n");
+   return NULL;
+   }

if (capture) {
ATOMIC32_INC(>outstanding_cmds);
@@ -290,7 +291,8 @@ static inline int ena_com_init_comp_ctxt(struct ena_com_admin_queue *queue)

for (i = 0; i < queue->q_depth; i++) {
comp_ctx = get_comp_ctxt(queue, i, false);
-   ENA_WAIT_EVENT_INIT(comp_ctx->wait_event);
+   if (comp_ctx)
+   ENA_WAIT_EVENT_INIT(comp_ctx->wait_event);
}

return 0;
@@ -315,15 +317,21 @@ ena_com_submit_admin_cmd(struct ena_com_admin_queue *admin_queue,
  cmd_size_in_bytes,
  comp,
  comp_size_in_bytes);
+   if (unlikely(IS_ERR(comp_ctx)))
+   admin_queue->running_state = false;
ENA_SPINLOCK_UNLOCK(admin_queue->q_lock, flags);

return comp_ctx;
 }

 static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
+ struct ena_com_create_io_ctx *ctx,
  struct ena_com_io_sq *io_sq)
 {
size_t size;
+   int dev_node;
+
+   ENA_TOUCH(ctx);

memset(_sq->desc_addr, 0x0, sizeof(struct ena_com_io_desc_addr));

@@ -334,15 +342,29 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,

size = io_sq->desc_entry_size * io_sq->q_depth;

-   if (io_sq->mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_HOST)
-   ENA_MEM_ALLOC_COHERENT(ena_dev->dmadev,
-  size,
-  io_sq->desc_addr.virt_addr,
-  io_sq->desc_addr.phys_addr,
-  io_sq->desc_addr.mem_handle);
-   else
-   io_sq->desc_addr.virt_addr =
-   ENA_MEM_ALLOC(ena_dev->dmadev, size);
+   if (io_sq->mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_HOST) {
+   ENA_MEM_ALLOC_COHERENT_NODE(ena_dev->dmadev,
+   size,
+   io_sq->desc_addr.virt_addr,
+   io_sq->desc_addr.phys_addr,
+   ctx->numa_node,
+  

[dpdk-dev] [PATCH v3 17/17] ethdev: get rid of device type

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Now that hotplug has been moved to eal, there is no reason to keep the device
type in this layer.

Signed-off-by: David Marchand 
---
 app/test/virtual_pmd.c|  2 +-
 drivers/net/af_packet/rte_eth_af_packet.c |  2 +-
 drivers/net/bonding/rte_eth_bond_api.c|  2 +-
 drivers/net/cxgbe/cxgbe_main.c|  2 +-
 drivers/net/mlx4/mlx4.c   |  2 +-
 drivers/net/mlx5/mlx5.c   |  2 +-
 drivers/net/mpipe/mpipe_tilegx.c  |  2 +-
 drivers/net/null/rte_eth_null.c   |  2 +-
 drivers/net/pcap/rte_eth_pcap.c   |  2 +-
 drivers/net/ring/rte_eth_ring.c   |  2 +-
 drivers/net/vhost/rte_eth_vhost.c |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c |  2 +-
 examples/ip_pipeline/init.c   | 22 --
 lib/librte_ether/rte_ethdev.c |  5 ++---
 lib/librte_ether/rte_ethdev.h | 15 +--
 15 files changed, 15 insertions(+), 51 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index b4bd2f2..8a1f0d0 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -581,7 +581,7 @@ virtual_ethdev_create(const char *name, struct ether_addr *mac_addr,
goto err;

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
+   eth_dev = rte_eth_dev_allocate(name);
if (eth_dev == NULL)
goto err;

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index f17bd7e..36ac102 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -648,7 +648,7 @@ rte_pmd_init_internals(const char *name,
}

/* reserve an ethdev entry */
-   *eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+   *eth_dev = rte_eth_dev_allocate(name);
if (*eth_dev == NULL)
goto error;

diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c
index 53df9fe..b858ee1 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -189,7 +189,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t socket_id)
}

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+   eth_dev = rte_eth_dev_allocate(name);
if (eth_dev == NULL) {
RTE_BOND_LOG(ERR, "Unable to allocate rte_eth_dev");
goto err;
diff --git a/drivers/net/cxgbe/cxgbe_main.c b/drivers/net/cxgbe/cxgbe_main.c
index ceaf5ab..922155b 100644
--- a/drivers/net/cxgbe/cxgbe_main.c
+++ b/drivers/net/cxgbe/cxgbe_main.c
@@ -1150,7 +1150,7 @@ int cxgbe_probe(struct adapter *adapter)
 */

/* reserve an ethdev entry */
-   pi->eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
+   pi->eth_dev = rte_eth_dev_allocate(name);
if (!pi->eth_dev)
goto out_free;

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index b594433..ba42c33 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -5715,7 +5715,7 @@ mlx4_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)

snprintf(name, sizeof(name), "%s port %u",
 ibv_get_device_name(ibv_dev), port);
-   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
+   eth_dev = rte_eth_dev_allocate(name);
}
if (eth_dev == NULL) {
ERROR("can not allocate rte ethdev");
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1989a37..f6399fc 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -519,7 +519,7 @@ mlx5_pci_devinit(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)

snprintf(name, sizeof(name), "%s port %u",
 ibv_get_device_name(ibv_dev), port);
-   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
+   eth_dev = rte_eth_dev_allocate(name);
}
if (eth_dev == NULL) {
ERROR("can not allocate rte ethdev");
diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c
index 26e1424..9de556e 100644
--- a/drivers/net/mpipe/mpipe_tilegx.c
+++ b/drivers/net/mpipe/mpipe_tilegx.c
@@ -1587,7 +1587,7 @@ rte_pmd_mpipe_devinit(const char *ifname,
return -ENODEV;
}

-   eth_dev = rte_eth_dev_allocate(ifname, RTE_ETH_DEV_VIRTUAL);
+   eth_dev = rte_eth_dev_allocate(ifname);
if (!eth_dev) {
RTE_LOG(ERR, PMD, "%s: Failed to allocate device.\n", ifname);
rte_free(priv);
diff --git 

[dpdk-dev] [PATCH v3 16/17] ethdev: convert to eal hotplug

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Remove bus logic from ethdev hotplug by using eal for this.

The current API is preserved:
- the last port that has been created is tracked to return it to the
  application when attaching,
- the internal device name is reused when detaching.

We cannot get rid of the ethdev hotplug API yet, since we still need some
mechanism to inform applications of port creation/removal as a substitute
for it.

The dev_type field in struct rte_eth_dev and the type argument to
rte_eth_dev_allocate are kept as is, but this information is no longer
needed and is removed in the following commit.

Signed-off-by: David Marchand 
---
 lib/librte_ether/rte_ethdev.c | 251 ++
 1 file changed, 33 insertions(+), 218 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a496521..12d24ff 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -72,6 +72,7 @@
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
 static struct rte_eth_dev_data *rte_eth_dev_data;
+static uint8_t eth_dev_last_created_port;
 static uint8_t nb_ports;

 /* spinlock for eth device callbacks */
@@ -210,6 +211,7 @@ rte_eth_dev_allocate(const char *name, enum 
rte_eth_dev_type type)
eth_dev->data->port_id = port_id;
eth_dev->attached = DEV_ATTACHED;
eth_dev->dev_type = type;
+   eth_dev_last_created_port = port_id;
nb_ports++;
return eth_dev;
 }
@@ -341,99 +343,6 @@ rte_eth_dev_count(void)
return nb_ports;
 }

-static enum rte_eth_dev_type
-rte_eth_dev_get_device_type(uint8_t port_id)
-{
-   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, RTE_ETH_DEV_UNKNOWN);
-   return rte_eth_devices[port_id].dev_type;
-}
-
-static int
-rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
-{
-   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
-
-   if (addr == NULL) {
-   RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
-   return -EINVAL;
-   }
-
-   *addr = rte_eth_devices[port_id].pci_dev->addr;
-   return 0;
-}
-
-static int
-rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
-{
-   char *tmp;
-
-   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
-
-   if (name == NULL) {
-   RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
-   return -EINVAL;
-   }
-
-   /* shouldn't check 'rte_eth_devices[i].data',
-* because it might be overwritten by VDEV PMD */
-   tmp = rte_eth_dev_data[port_id].name;
-   strcpy(name, tmp);
-   return 0;
-}
-
-static int
-rte_eth_dev_get_port_by_name(const char *name, uint8_t *port_id)
-{
-   int i;
-
-   if (name == NULL) {
-   RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
-   return -EINVAL;
-   }
-
-   *port_id = RTE_MAX_ETHPORTS;
-
-   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-
-   if (!strncmp(name,
-   rte_eth_dev_data[i].name, strlen(name))) {
-
-   *port_id = i;
-
-   return 0;
-   }
-   }
-   return -ENODEV;
-}
-
-static int
-rte_eth_dev_get_port_by_addr(const struct rte_pci_addr *addr, uint8_t *port_id)
-{
-   int i;
-   struct rte_pci_device *pci_dev = NULL;
-
-   if (addr == NULL) {
-   RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
-   return -EINVAL;
-   }
-
-   *port_id = RTE_MAX_ETHPORTS;
-
-   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-
-   pci_dev = rte_eth_devices[i].pci_dev;
-
-   if (pci_dev &&
-   !rte_eal_compare_pci_addr(&pci_dev->addr, addr)) {
-
-   *port_id = i;
-
-   return 0;
-   }
-   }
-   return -ENODEV;
-}
-
 static int
 rte_eth_dev_is_detachable(uint8_t port_id)
 {
@@ -459,124 +368,45 @@ rte_eth_dev_is_detachable(uint8_t port_id)
return 1;
 }

-/* attach the new physical device, then store port_id of the device */
-static int
-rte_eth_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
-{
-   /* Invoke probe func of the driver can handle the new device. */
-   if (rte_eal_pci_probe_one(addr))
-   goto err;
-
-   if (rte_eth_dev_get_port_by_addr(addr, port_id))
-   goto err;
-
-   return 0;
-err:
-   return -1;
-}
-
-/* detach the new physical device, then store pci_addr of the device */
-static int
-rte_eth_dev_detach_pdev(uint8_t port_id, struct rte_pci_addr *addr)
-{
-   struct rte_pci_addr freed_addr;
-   struct rte_pci_addr vp;
-
-   /* get pci address by port id */
-   if (rte_eth_dev_get_addr_by_port(port_id, &freed_addr))
-   goto err;
-
-   /* Zeroed pci addr means the port comes from virtual device */
-   vp.domain = vp.bus = 

[dpdk-dev] [PATCH v3 15/17] eal: add hotplug operations for pci and vdev

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Hotplug deals with resources, so it should be handled by the layer that
already manages them, i.e. the EAL.

For both attach and detach operations, 'name' is used to select the bus
that will handle the request.

Signed-off-by: David Marchand 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  2 ++
 lib/librte_eal/common/eal_common_dev.c  | 39 +
 lib/librte_eal/common/include/rte_dev.h | 25 
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  2 ++
 4 files changed, 68 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index f8c3dea..e776768 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -156,5 +156,7 @@ DPDK_16.07 {
global:

pci_get_sysfs_path;
+   rte_eal_dev_attach;
+   rte_eal_dev_detach;

 } DPDK_16.04;
diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index a8a4146..59ed3a0 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -150,3 +150,42 @@ rte_eal_vdev_uninit(const char *name)
RTE_LOG(ERR, EAL, "no driver found for %s\n", name);
return -EINVAL;
 }
+
+int rte_eal_dev_attach(const char *name, const char *devargs)
+{
+   struct rte_pci_addr addr;
+   int ret = -1;
+
+   if (eal_parse_pci_DomBDF(name, &addr) == 0) {
+   if (rte_eal_pci_probe_one(&addr) < 0)
+   goto err;
+
+   } else {
+   if (rte_eal_vdev_init(name, devargs))
+   goto err;
+   }
+
+   return 0;
+
+err:
+   RTE_LOG(ERR, EAL, "Driver, cannot attach the device\n");
+   return ret;
+}
+
+int rte_eal_dev_detach(const char *name)
+{
+   struct rte_pci_addr addr;
+
+   if (eal_parse_pci_DomBDF(name, &addr) == 0) {
+   if (rte_eal_pci_detach(&addr) < 0)
+   goto err;
+   } else {
+   if (rte_eal_vdev_uninit(name))
+   goto err;
+   }
+   return 0;
+
+err:
+   RTE_LOG(ERR, EAL, "Driver, cannot detach the device\n");
+   return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index 85e48f2..b1c0520 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -178,6 +178,31 @@ int rte_eal_vdev_init(const char *name, const char *args);
  */
 int rte_eal_vdev_uninit(const char *name);

+/**
+ * Attach a resource to a registered driver.
+ *
+ * @param name
+ *   The resource name, that refers to a pci resource or some private
+ *   way of designating a resource for vdev drivers. Based on this
+ *   resource name, eal will identify a driver capable of handling
+ *   this resource and pass this resource to the driver probing
+ *   function.
+ * @param devargs
+ *   Device arguments to be passed to the driver.
+ * @return
+ *   0 on success, negative on error.
+ */
+int rte_eal_dev_attach(const char *name, const char *devargs);
+
+/**
+ * Detach a resource from its driver.
+ *
+ * @param name
+ *   Same description as for rte_eal_dev_attach().
+ *   Here, eal will call the driver detaching function.
+ */
+int rte_eal_dev_detach(const char *name);
+
 #define PMD_REGISTER_DRIVER(d)\
 RTE_INIT(devinitfn_ ##d);\
 static void devinitfn_ ##d(void)\
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 3d0ff93..50b774b 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -159,5 +159,7 @@ DPDK_16.07 {
global:

pci_get_sysfs_path;
+   rte_eal_dev_attach;
+   rte_eal_dev_detach;

 } DPDK_16.04;
-- 
2.7.4



[dpdk-dev] [PATCH v3 14/17] ethdev: do not scan all pci devices on attach

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

No need to scan all devices; we only need to update the device being
attached.

Signed-off-by: David Marchand 
---
 lib/librte_eal/common/eal_common_pci.c | 11 ---
 lib/librte_ether/rte_ethdev.c  |  3 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index dfd0a8c..d05dda4 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -339,6 +339,11 @@ rte_eal_pci_probe_one(const struct rte_pci_addr *addr)
if (addr == NULL)
return -1;

+   /* update current pci device in global list, kernel bindings might have
+* changed since last time we looked at it */
+   if (pci_update_device(addr) < 0)
+   goto err_return;
+
TAILQ_FOREACH(dev, &pci_device_list, next) {
if (rte_eal_compare_pci_addr(&dev->addr, addr))
continue;
@@ -351,9 +356,9 @@ rte_eal_pci_probe_one(const struct rte_pci_addr *addr)
return -1;

 err_return:
-   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
-   " cannot be used\n", dev->addr.domain, dev->addr.bus,
-   dev->addr.devid, dev->addr.function);
+   RTE_LOG(WARNING, EAL,
+   "Requested device " PCI_PRI_FMT " cannot be used\n",
+   addr->domain, addr->bus, addr->devid, addr->function);
return -1;
 }

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5bcf610..a496521 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -463,9 +463,6 @@ rte_eth_dev_is_detachable(uint8_t port_id)
 static int
 rte_eth_dev_attach_pdev(struct rte_pci_addr *addr, uint8_t *port_id)
 {
-   /* re-construct pci_device_list */
-   if (rte_eal_pci_scan())
-   goto err;
/* Invoke probe func of the driver can handle the new device. */
if (rte_eal_pci_probe_one(addr))
goto err;
-- 
2.7.4



[dpdk-dev] [PATCH v3 12/17] pci: add a helper for device name

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

The EAL is a better place than crypto/ethdev for naming resources.
Add a helper in the EAL and make use of it in crypto/ethdev.

Signed-off-by: David Marchand 
---
 lib/librte_cryptodev/rte_cryptodev.c| 27 ---
 lib/librte_eal/common/include/rte_pci.h | 25 +
 lib/librte_ether/rte_ethdev.c   | 24 
 3 files changed, 33 insertions(+), 43 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index a7cb33a..3b587e4 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -276,23 +276,6 @@ rte_cryptodev_pmd_allocate(const char *name, int socket_id)
return cryptodev;
 }

-static inline int
-rte_cryptodev_create_unique_device_name(char *name, size_t size,
-   struct rte_pci_device *pci_dev)
-{
-   int ret;
-
-   if ((name == NULL) || (pci_dev == NULL))
-   return -EINVAL;
-
-   ret = snprintf(name, size, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid,
-   pci_dev->addr.function);
-   if (ret < 0)
-   return ret;
-   return 0;
-}
-
 int
 rte_cryptodev_pmd_release_device(struct rte_cryptodev *cryptodev)
 {
@@ -355,9 +338,8 @@ rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
if (cryptodrv == NULL)
return -ENODEV;

-   /* Create unique Crypto device name using PCI address */
-   rte_cryptodev_create_unique_device_name(cryptodev_name,
-   sizeof(cryptodev_name), pci_dev);
+   rte_eal_pci_device_name(&pci_dev->addr, cryptodev_name,
+   sizeof(cryptodev_name));

cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, rte_socket_id());
if (cryptodev == NULL)
@@ -412,9 +394,8 @@ rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
if (pci_dev == NULL)
return -EINVAL;

-   /* Create unique device name using PCI address */
-   rte_cryptodev_create_unique_device_name(cryptodev_name,
-   sizeof(cryptodev_name), pci_dev);
+   rte_eal_pci_device_name(&pci_dev->addr, cryptodev_name,
+   sizeof(cryptodev_name));

cryptodev = rte_cryptodev_pmd_get_named_dev(cryptodev_name);
if (cryptodev == NULL)
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index d7df1d9..5e8bd89 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -82,6 +82,7 @@ extern "C" {
 #include 
 #include 

+#include 
 #include 

 TAILQ_HEAD(pci_device_list, rte_pci_device); /**< PCI devices in D-linked Q. */
@@ -95,6 +96,7 @@ const char *pci_get_sysfs_path(void);

 /** Formatting string for PCI device identifier: Ex: :00:01.0 */
 #define PCI_PRI_FMT "%.4" PRIx16 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
+#define PCI_PRI_STR_SIZE sizeof(":XX:XX.X")

 /** Short formatting string, without domain, for PCI device: Ex: 00:01.0 */
 #define PCI_SHORT_PRI_FMT "%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8
@@ -308,6 +310,29 @@ eal_parse_pci_DomBDF(const char *input, struct 
rte_pci_addr *dev_addr)
 }
 #undef GET_PCIADDR_FIELD

+/**
+ * Utility function to write a PCI device name; this name can later be
+ * used to retrieve the corresponding rte_pci_addr using the functions above.
+ *
+ * @param addr
+ * The PCI Bus-Device-Function address
+ * @param output
+ * The output buffer string
+ * @param size
+ * The output buffer size
+ * @return
+ *  0 on success, negative on error.
+ */
+static inline void
+rte_eal_pci_device_name(const struct rte_pci_addr *addr,
+   char *output, size_t size)
+{
+   RTE_VERIFY(size >= PCI_PRI_STR_SIZE);
+   RTE_VERIFY(snprintf(output, size, PCI_PRI_FMT,
+   addr->domain, addr->bus,
+   addr->devid, addr->function) >= 0);
+}
+
 /* Compare two PCI device addresses. */
 /**
  * Utility function to compare two PCI device addresses.
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7258062..5bcf610 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -214,20 +214,6 @@ rte_eth_dev_allocate(const char *name, enum 
rte_eth_dev_type type)
return eth_dev;
 }

-static int
-rte_eth_dev_create_unique_device_name(char *name, size_t size,
-   struct rte_pci_device *pci_dev)
-{
-   int ret;
-
-   ret = snprintf(name, size, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid,
-   pci_dev->addr.function);
-   if (ret < 0)
-   return ret;
-   return 0;
-}
-
 int
 rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 {
@@ -251,9 +237,8 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,

eth_drv = (struct eth_driver *)pci_drv;

-  

[dpdk-dev] [PATCH v3 11/17] eal/linux: move back interrupt thread init before setting affinity

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Now that the virtio PCI driver is initialized in a constructor, the iopl()
setup happens early enough that the interrupt thread can be created right
after plugin loading.
This way, the chelsio driver should be happy again [1].

[1] http://dpdk.org/ml/archives/dev/2015-November/028289.html

Signed-off-by: David Marchand 
Tested-by: Rahul Lakkireddy 
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 5ec3d4e..6eca741 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -821,6 +821,9 @@ rte_eal_init(int argc, char **argv)
if (eal_plugins_init() < 0)
rte_panic("Cannot init plugins\n");

+   if (rte_eal_intr_init() < 0)
+   rte_panic("Cannot init interrupt-handling thread\n");
+
eal_thread_init_master(rte_config.master_lcore);

ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
@@ -832,9 +835,6 @@ rte_eal_init(int argc, char **argv)
if (rte_eal_dev_init() < 0)
rte_panic("Cannot init pmd devices\n");

-   if (rte_eal_intr_init() < 0)
-   rte_panic("Cannot init interrupt-handling thread\n");
-
RTE_LCORE_FOREACH_SLAVE(i) {

/*
-- 
2.7.4



[dpdk-dev] [PATCH v3 10/17] ethdev: get rid of eth driver register callback

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Now that all pdevs are PCI drivers, we no longer need to register ethdev
drivers through a dedicated channel.

Signed-off-by: David Marchand 
---
 lib/librte_ether/rte_ethdev.c  | 22 --
 lib/librte_ether/rte_ethdev.h  | 12 
 lib/librte_ether/rte_ether_version.map |  1 -
 3 files changed, 35 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index d05eada..7258062 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -334,28 +334,6 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
return 0;
 }

-/**
- * Register an Ethernet [Poll Mode] driver.
- *
- * Function invoked by the initialization function of an Ethernet driver
- * to simultaneously register itself as a PCI driver and as an Ethernet
- * Poll Mode Driver.
- * Invokes the rte_eal_pci_register() function to register the *pci_drv*
- * structure embedded in the *eth_drv* structure, after having stored the
- * address of the rte_eth_dev_init() function in the *devinit* field of
- * the *pci_drv* structure.
- * During the PCI probing phase, the rte_eth_dev_init() function is
- * invoked for each PCI [Ethernet device] matching the embedded PCI
- * identifiers provided by the driver.
- */
-void
-rte_eth_driver_register(struct eth_driver *eth_drv)
-{
-   eth_drv->pci_drv.devinit = rte_eth_dev_pci_probe;
-   eth_drv->pci_drv.devuninit = rte_eth_dev_pci_remove;
-   rte_eal_pci_register(&eth_drv->pci_drv);
-}
-
 int
 rte_eth_dev_is_valid_port(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 6deafa2..64d889e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1842,18 +1842,6 @@ struct eth_driver {
 };

 /**
- * @internal
- * A function invoked by the initialization function of an Ethernet driver
- * to simultaneously register itself as a PCI driver and as an Ethernet
- * Poll Mode Driver (PMD).
- *
- * @param eth_drv
- *   The pointer to the *eth_driver* structure associated with
- *   the Ethernet driver.
- */
-void rte_eth_driver_register(struct eth_driver *eth_drv);
-
-/**
  * Convert a numerical speed in Mbps to a bitmap flag that can be used in
  * the bitmap link_speeds of the struct rte_eth_conf
  *
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 31017d4..d457b21 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -80,7 +80,6 @@ DPDK_2.2 {
rte_eth_dev_vlan_filter;
rte_eth_dev_wd_timeout_store;
rte_eth_dma_zone_reserve;
-   rte_eth_driver_register;
rte_eth_led_off;
rte_eth_led_on;
rte_eth_link;
-- 
2.7.4



[dpdk-dev] [PATCH v3 09/17] crypto: get rid of crypto driver register callback

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Now that all pdevs are PCI drivers, we no longer need to register crypto
drivers through a dedicated channel.

Signed-off-by: David Marchand 
---
 lib/librte_cryptodev/rte_cryptodev.c   | 22 ---
 lib/librte_cryptodev/rte_cryptodev_pmd.h   | 30 --
 lib/librte_cryptodev/rte_cryptodev_version.map |  1 -
 3 files changed, 53 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 65a2e29..a7cb33a 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -444,28 +444,6 @@ rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
return 0;
 }

-int
-rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *cryptodrv,
-   enum pmd_type type)
-{
-   /* Call crypto device initialization directly if device is virtual */
-   if (type == PMD_VDEV)
-   return rte_cryptodev_pci_probe((struct rte_pci_driver 
*)cryptodrv,
-   NULL);
-
-   /*
-* Register PCI driver for physical device intialisation during
-* PCI probing
-*/
-   cryptodrv->pci_drv.devinit = rte_cryptodev_pci_probe;
-   cryptodrv->pci_drv.devuninit = rte_cryptodev_pci_remove;
-
-   rte_eal_pci_register(&cryptodrv->pci_drv);
-
-   return 0;
-}
-
-
 uint16_t
 rte_cryptodev_queue_pair_count(uint8_t dev_id)
 {
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 3fb7c7c..99fd69e 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -491,36 +491,6 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, 
size_t dev_private_size,
 extern int
 rte_cryptodev_pmd_release_device(struct rte_cryptodev *cryptodev);

-
-/**
- * Register a Crypto [Poll Mode] driver.
- *
- * Function invoked by the initialization function of a Crypto driver
- * to simultaneously register itself as Crypto Poll Mode Driver and to either:
- *
- * a - register itself as PCI driver if the crypto device is a physical
- * device, by invoking the rte_eal_pci_register() function to
- * register the *pci_drv* structure embedded in the *crypto_drv*
- * structure, after having stored the address of the
- * rte_cryptodev_init() function in the *devinit* field of the
- * *pci_drv* structure.
- *
- * During the PCI probing phase, the rte_cryptodev_init()
- * function is invoked for each PCI [device] matching the
- * embedded PCI identifiers provided by the driver.
- *
- * b, complete the initialization sequence if the device is a virtual
- * device by calling the rte_cryptodev_init() directly passing a
- * NULL parameter for the rte_pci_device structure.
- *
- *   @param crypto_drv crypto_driver structure associated with the crypto
- * driver.
- *   @param type   pmd type
- */
-extern int
-rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *crypto_drv,
-   enum pmd_type type);
-
 /**
  * Executes all the user application registered callbacks for the specific
  * device.
diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map 
b/lib/librte_cryptodev/rte_cryptodev_version.map
index 8d0edfb..e0a9620 100644
--- a/lib/librte_cryptodev/rte_cryptodev_version.map
+++ b/lib/librte_cryptodev/rte_cryptodev_version.map
@@ -14,7 +14,6 @@ DPDK_16.04 {
rte_cryptodev_info_get;
rte_cryptodev_pmd_allocate;
rte_cryptodev_pmd_callback_process;
-   rte_cryptodev_pmd_driver_register;
rte_cryptodev_pmd_release_device;
rte_cryptodev_pmd_virtual_dev_init;
rte_cryptodev_sym_session_create;
-- 
2.7.4



[dpdk-dev] [PATCH v3 08/17] drivers: convert all pdev drivers as pci drivers

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Simplify crypto and ethdev PCI driver init by using the newly introduced
init macros and helpers.
Those drivers then no longer need to register as "rte_driver"s.

The virtio and mlx* drivers use the general-purpose RTE_INIT macro, as they
both need some special handling before registering a PCI driver.

Signed-off-by: David Marchand 
---
 drivers/crypto/qat/rte_qat_cryptodev.c  | 16 +++
 drivers/net/bnx2x/bnx2x_ethdev.c| 35 +---
 drivers/net/cxgbe/cxgbe_ethdev.c| 24 +++--
 drivers/net/e1000/em_ethdev.c   | 16 +++
 drivers/net/e1000/igb_ethdev.c  | 40 +---
 drivers/net/ena/ena_ethdev.c| 18 +++--
 drivers/net/enic/enic_ethdev.c  | 23 +++-
 drivers/net/fm10k/fm10k_ethdev.c| 23 +++-
 drivers/net/i40e/i40e_ethdev.c  | 26 +++---
 drivers/net/i40e/i40e_ethdev_vf.c   | 25 +++---
 drivers/net/ixgbe/ixgbe_ethdev.c| 47 +
 drivers/net/mlx4/mlx4.c | 20 +++---
 drivers/net/mlx5/mlx5.c | 19 +++--
 drivers/net/nfp/nfp_net.c   | 21 +++
 drivers/net/szedata2/rte_eth_szedata2.c | 25 +++---
 drivers/net/virtio/virtio_ethdev.c  | 26 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c| 23 +++-
 17 files changed, 68 insertions(+), 359 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index 08496ab..54f0c95 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -120,21 +120,11 @@ static struct rte_cryptodev_driver rte_qat_pmd = {
.name = "rte_qat_pmd",
.id_table = pci_id_qat_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+   .devinit = rte_cryptodev_pci_probe,
+   .devuninit = rte_cryptodev_pci_remove,
},
.cryptodev_init = crypto_qat_dev_init,
.dev_private_size = sizeof(struct qat_pmd_private),
 };

-static int
-rte_qat_pmd_init(const char *name __rte_unused, const char *params 
__rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   return rte_cryptodev_pmd_driver_register(&rte_qat_pmd, PMD_PDEV);
-}
-
-static struct rte_driver pmd_qat_drv = {
-   .type = PMD_PDEV,
-   .init = rte_qat_pmd_init,
-};
-
-PMD_REGISTER_DRIVER(pmd_qat_drv);
+RTE_EAL_PCI_REGISTER(qat, rte_qat_pmd.pci_drv);
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 071b44f..ba194b5 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -506,11 +506,15 @@ static struct eth_driver rte_bnx2x_pmd = {
.name = "rte_bnx2x_pmd",
.id_table = pci_id_bnx2x_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .devinit = rte_eth_dev_pci_probe,
+   .devuninit = rte_eth_dev_pci_remove,
},
.eth_dev_init = eth_bnx2x_dev_init,
.dev_private_size = sizeof(struct bnx2x_softc),
 };

+RTE_EAL_PCI_REGISTER(bnx2x, rte_bnx2x_pmd.pci_drv);
+
 /*
  * virtual function driver struct
  */
@@ -519,36 +523,11 @@ static struct eth_driver rte_bnx2xvf_pmd = {
.name = "rte_bnx2xvf_pmd",
.id_table = pci_id_bnx2xvf_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
+   .devinit = rte_eth_dev_pci_probe,
+   .devuninit = rte_eth_dev_pci_remove,
},
.eth_dev_init = eth_bnx2xvf_dev_init,
.dev_private_size = sizeof(struct bnx2x_softc),
 };

-static int rte_bnx2x_pmd_init(const char *name __rte_unused, const char 
*params __rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   rte_eth_driver_register(&rte_bnx2x_pmd);
-
-   return 0;
-}
-
-static int rte_bnx2xvf_pmd_init(const char *name __rte_unused, const char 
*params __rte_unused)
-{
-   PMD_INIT_FUNC_TRACE();
-   rte_eth_driver_register(&rte_bnx2xvf_pmd);
-
-   return 0;
-}
-
-static struct rte_driver rte_bnx2x_driver = {
-   .type = PMD_PDEV,
-   .init = rte_bnx2x_pmd_init,
-};
-
-static struct rte_driver rte_bnx2xvf_driver = {
-   .type = PMD_PDEV,
-   .init = rte_bnx2xvf_pmd_init,
-};
-
-PMD_REGISTER_DRIVER(rte_bnx2x_driver);
-PMD_REGISTER_DRIVER(rte_bnx2xvf_driver);
+RTE_EAL_PCI_REGISTER(bnx2xvf, rte_bnx2xvf_pmd.pci_drv);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 04eddaf..358c240 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -869,29 +869,11 @@ static struct eth_driver rte_cxgbe_pmd = {
.name = "rte_cxgbe_pmd",
.id_table = cxgb4_pci_tbl,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .devinit = 

[dpdk-dev] [PATCH v3 07/17] ethdev: export init/uninit common wrappers for pci drivers

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

In preparation for getting rid of eth_drv, here are two wrappers that can be
used by PCI drivers that assume a 1-to-1 association between a PCI resource
and its upper interface.

Signed-off-by: David Marchand 
---
 lib/librte_ether/rte_ethdev.c  | 14 +++---
 lib/librte_ether/rte_ethdev.h  | 13 +
 lib/librte_ether/rte_ether_version.map |  8 
 3 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..d05eada 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -239,9 +239,9 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

-static int
-rte_eth_dev_init(struct rte_pci_driver *pci_drv,
-struct rte_pci_device *pci_dev)
+int
+rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
+ struct rte_pci_device *pci_dev)
 {
struct eth_driver*eth_drv;
struct rte_eth_dev *eth_dev;
@@ -293,8 +293,8 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

-static int
-rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+int
+rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 {
const struct eth_driver *eth_drv;
struct rte_eth_dev *eth_dev;
@@ -351,8 +351,8 @@ rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
 void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
-   eth_drv->pci_drv.devinit = rte_eth_dev_init;
-   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
+   eth_drv->pci_drv.devinit = rte_eth_dev_pci_probe;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_pci_remove;
rte_eal_pci_register(&eth_drv->pci_drv);
 }

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index e5e91e4..6deafa2 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4254,6 +4254,19 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
  uint32_t mask,
  uint8_t en);

+/**
+ * Wrapper for use by pci drivers as a .devinit function to attach to an ethdev
+ * interface.
+ */
+int rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
+ struct rte_pci_device *pci_dev);
+
+/**
+ * Wrapper for use by pci drivers as a .devuninit function to detach an ethdev
+ * interface.
+ */
+int rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 214ecc7..31017d4 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,11 @@ DPDK_16.04 {
rte_eth_tx_buffer_set_err_callback;

 } DPDK_2.2;
+
+DPDK_16.07 {
+   global:
+
+   rte_eth_dev_pci_probe;
+   rte_eth_dev_pci_remove;
+
+} DPDK_16.04;
-- 
2.7.4



[dpdk-dev] [PATCH v3 06/17] crypto: export init/uninit common wrappers for pci drivers

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

In preparation for getting rid of rte_cryptodev_driver, here are two
wrappers that can be used by PCI drivers that assume a 1-to-1 association
between a PCI resource and its upper interface.

Signed-off-by: David Marchand 
---
 lib/librte_cryptodev/rte_cryptodev.c   | 16 
 lib/librte_cryptodev/rte_cryptodev_pmd.h   | 12 
 lib/librte_cryptodev/rte_cryptodev_version.map |  8 
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index b0d806c..65a2e29 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -340,9 +340,9 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, size_t 
dev_private_size,
return cryptodev;
 }

-static int
-rte_cryptodev_init(struct rte_pci_driver *pci_drv,
-   struct rte_pci_device *pci_dev)
+int
+rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
+   struct rte_pci_device *pci_dev)
 {
struct rte_cryptodev_driver *cryptodrv;
struct rte_cryptodev *cryptodev;
@@ -401,8 +401,8 @@ rte_cryptodev_init(struct rte_pci_driver *pci_drv,
return -ENXIO;
 }

-static int
-rte_cryptodev_uninit(struct rte_pci_device *pci_dev)
+int
+rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev)
 {
const struct rte_cryptodev_driver *cryptodrv;
struct rte_cryptodev *cryptodev;
@@ -450,15 +450,15 @@ rte_cryptodev_pmd_driver_register(struct 
rte_cryptodev_driver *cryptodrv,
 {
/* Call crypto device initialization directly if device is virtual */
if (type == PMD_VDEV)
-   return rte_cryptodev_init((struct rte_pci_driver *)cryptodrv,
+   return rte_cryptodev_pci_probe((struct rte_pci_driver 
*)cryptodrv,
NULL);

/*
 * Register PCI driver for physical device intialisation during
 * PCI probing
 */
-   cryptodrv->pci_drv.devinit = rte_cryptodev_init;
-   cryptodrv->pci_drv.devuninit = rte_cryptodev_uninit;
+   cryptodrv->pci_drv.devinit = rte_cryptodev_pci_probe;
+   cryptodrv->pci_drv.devuninit = rte_cryptodev_pci_remove;

rte_eal_pci_register(&cryptodrv->pci_drv);

diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index c977c61..3fb7c7c 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -534,6 +534,18 @@ rte_cryptodev_pmd_driver_register(struct rte_cryptodev_driver *crypto_drv,
 void rte_cryptodev_pmd_callback_process(struct rte_cryptodev *dev,
enum rte_cryptodev_event_type event);

+/**
+ * Wrapper for use by pci drivers as a .devinit function to attach to a crypto
+ * interface.
+ */
+int rte_cryptodev_pci_probe(struct rte_pci_driver *pci_drv,
+   struct rte_pci_device *pci_dev);
+
+/**
+ * Wrapper for use by pci drivers as a .devuninit function to detach a crypto
+ * interface.
+ */
+int rte_cryptodev_pci_remove(struct rte_pci_device *pci_dev);

 #ifdef __cplusplus
 }
diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map 
b/lib/librte_cryptodev/rte_cryptodev_version.map
index 41004e1..8d0edfb 100644
--- a/lib/librte_cryptodev/rte_cryptodev_version.map
+++ b/lib/librte_cryptodev/rte_cryptodev_version.map
@@ -32,3 +32,11 @@ DPDK_16.04 {

local: *;
 };
+
+DPDK_16.07 {
+   global:
+
+   rte_cryptodev_pci_probe;
+   rte_cryptodev_pci_remove;
+
+} DPDK_16.04;
-- 
2.7.4



[dpdk-dev] [PATCH v3 05/17] eal: introduce init macros

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Introduce a RTE_INIT macro used to mark an init function as a constructor.
Current eal macros have been converted to use this (no functional impact).
RTE_EAL_PCI_REGISTER is added as a helper for pci drivers.

Suggested-by: Jan Viktorin 
Signed-off-by: David Marchand 
---
 lib/librte_eal/common/include/rte_dev.h   | 4 ++--
 lib/librte_eal/common/include/rte_eal.h   | 3 +++
 lib/librte_eal/common/include/rte_pci.h   | 7 +++
 lib/librte_eal/common/include/rte_tailq.h | 4 ++--
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index f1b5507..85e48f2 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -179,8 +179,8 @@ int rte_eal_vdev_init(const char *name, const char *args);
 int rte_eal_vdev_uninit(const char *name);

 #define PMD_REGISTER_DRIVER(d)\
-void devinitfn_ ##d(void);\
-void __attribute__((constructor, used)) devinitfn_ ##d(void)\
+RTE_INIT(devinitfn_ ##d);\
+static void devinitfn_ ##d(void)\
 {\
rte_eal_driver_register(&d);\
 }
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index a71d6f5..186f3c6 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -252,6 +252,9 @@ static inline int rte_gettid(void)
return RTE_PER_LCORE(_thread_id);
 }

+#define RTE_INIT(func) \
+static void __attribute__((constructor, used)) func(void)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index fa74962..d7df1d9 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -470,6 +470,13 @@ void rte_eal_pci_dump(FILE *f);
  */
 void rte_eal_pci_register(struct rte_pci_driver *driver);

+#define RTE_EAL_PCI_REGISTER(name, d) \
+RTE_INIT(pciinitfn_ ##name); \
+static void pciinitfn_ ##name(void) \
+{ \
rte_eal_pci_register(&d); \
+}
+
 /**
  * Unregister a PCI driver.
  *
diff --git a/lib/librte_eal/common/include/rte_tailq.h 
b/lib/librte_eal/common/include/rte_tailq.h
index 4a686e6..71ed3bb 100644
--- a/lib/librte_eal/common/include/rte_tailq.h
+++ b/lib/librte_eal/common/include/rte_tailq.h
@@ -148,8 +148,8 @@ struct rte_tailq_head *rte_eal_tailq_lookup(const char *name);
 int rte_eal_tailq_register(struct rte_tailq_elem *t);

 #define EAL_REGISTER_TAILQ(t) \
-void tailqinitfn_ ##t(void); \
-void __attribute__((constructor, used)) tailqinitfn_ ##t(void) \
+RTE_INIT(tailqinitfn_ ##t); \
+static void tailqinitfn_ ##t(void) \
 { \
if (rte_eal_tailq_register(&t) < 0) \
rte_panic("Cannot initialize tailq: %s\n", t.name); \
-- 
2.7.4



[dpdk-dev] [PATCH v3 04/17] eal: remove duplicate function declaration

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

rte_eal_dev_init is declared in both eal_private.h and rte_dev.h since its
introduction.
This function has been exported in ABI, so remove it from eal_private.h

Fixes: e57f20e05177 ("eal: make vdev init path generic for both virtual and pci devices")
Signed-off-by: David Marchand 
---
 lib/librte_eal/common/eal_private.h | 7 ---
 lib/librte_eal/linuxapp/eal/eal.c   | 1 +
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index 857dc3e..06a68f6 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -259,13 +259,6 @@ int rte_eal_intr_init(void);
 int rte_eal_alarm_init(void);

 /**
- * This function initialises any virtual devices
- *
- * This function is private to the EAL.
- */
-int rte_eal_dev_init(void);
-
-/**
  * Function is to check if the kernel module(like, vfio, vfio_iommu_type1,
  * etc.) loaded.
  *
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index bba8fea..5ec3d4e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -70,6 +70,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.7.4



[dpdk-dev] [PATCH v3 03/17] drivers: align pci driver definitions

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

Pure coding style, but it might make it easier later if we want to move
fields in rte_cryptodev_driver and eth_driver structures.

Signed-off-by: David Marchand 
---
 drivers/crypto/qat/rte_qat_cryptodev.c | 2 +-
 drivers/net/ena/ena_ethdev.c   | 2 +-
 drivers/net/nfp/nfp_net.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..08496ab 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -116,7 +116,7 @@ crypto_qat_dev_init(__attribute__((unused)) struct rte_cryptodev_driver *crypto_
 }

 static struct rte_cryptodev_driver rte_qat_pmd = {
-   {
+   .pci_drv = {
.name = "rte_qat_pmd",
.id_table = pci_id_qat_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e157587..8d01e9a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -1427,7 +1427,7 @@ static uint16_t eth_ena_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 }

 static struct eth_driver rte_ena_pmd = {
-   {
+   .pci_drv = {
.name = "rte_ena_pmd",
.id_table = pci_id_ena_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING,
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 5c9f350..ef7011e 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2463,7 +2463,7 @@ static struct rte_pci_id pci_id_nfp_net_map[] = {
 };

 static struct eth_driver rte_nfp_net_pmd = {
-   {
+   .pci_drv = {
.name = "rte_nfp_net_pmd",
.id_table = pci_id_nfp_net_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-- 
2.7.4



[dpdk-dev] [PATCH v3 02/17] crypto: no need for a crypto pmd type

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

This information is not used and just adds noise.

Signed-off-by: David Marchand 
---
 lib/librte_cryptodev/rte_cryptodev.c | 8 +++-
 lib/librte_cryptodev/rte_cryptodev.h | 2 --
 lib/librte_cryptodev/rte_cryptodev_pmd.h | 3 +--
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 960e2d5..b0d806c 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -230,7 +230,7 @@ rte_cryptodev_find_free_device_index(void)
 }

 struct rte_cryptodev *
-rte_cryptodev_pmd_allocate(const char *name, enum pmd_type type, int socket_id)
+rte_cryptodev_pmd_allocate(const char *name, int socket_id)
 {
struct rte_cryptodev *cryptodev;
uint8_t dev_id;
@@ -269,7 +269,6 @@ rte_cryptodev_pmd_allocate(const char *name, enum pmd_type type, int socket_id)
cryptodev->data->dev_started = 0;

cryptodev->attached = RTE_CRYPTODEV_ATTACHED;
-   cryptodev->pmd_type = type;

cryptodev_globals.nb_devs++;
}
@@ -318,7 +317,7 @@ rte_cryptodev_pmd_virtual_dev_init(const char *name, size_t dev_private_size,
struct rte_cryptodev *cryptodev;

/* allocate device structure */
-   cryptodev = rte_cryptodev_pmd_allocate(name, PMD_VDEV, socket_id);
+   cryptodev = rte_cryptodev_pmd_allocate(name, socket_id);
if (cryptodev == NULL)
return NULL;

@@ -360,8 +359,7 @@ rte_cryptodev_init(struct rte_pci_driver *pci_drv,
rte_cryptodev_create_unique_device_name(cryptodev_name,
sizeof(cryptodev_name), pci_dev);

-   cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, PMD_PDEV,
-   rte_socket_id());
+   cryptodev = rte_cryptodev_pmd_allocate(cryptodev_name, rte_socket_id());
if (cryptodev == NULL)
return -ENOMEM;

diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index d47f1e8..2d0b809 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -697,8 +697,6 @@ struct rte_cryptodev {

enum rte_cryptodev_type dev_type;
/**< Crypto device type */
-   enum pmd_type pmd_type;
-   /**< PMD type - PDEV / VDEV */

struct rte_cryptodev_cb_list link_intr_cbs;
/**< User application callback for interrupts if present */
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h 
b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index 7d049ea..c977c61 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -454,13 +454,12 @@ struct rte_cryptodev_ops {
  * to that slot for the driver to use.
  *
  * @param  nameUnique identifier name for each device
- * @param  typeDevice type of this Crypto device
  * @param  socket_id   Socket to allocate resources on.
  * @return
  *   - Slot in the rte_dev_devices array for a new device;
  */
 struct rte_cryptodev *
-rte_cryptodev_pmd_allocate(const char *name, enum pmd_type type, int socket_id);
+rte_cryptodev_pmd_allocate(const char *name, int socket_id);

 /**
  * Creates a new virtual crypto device and returns the pointer
-- 
2.7.4



[dpdk-dev] [PATCH v3 01/17] pci: no need for dynamic tailq init

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

These lists can be initialized once and for all at build time.
With this, those lists are only manipulated in a common place
(and we could even make them private).

A nice side effect is that pci drivers can now register in constructors.

Signed-off-by: David Marchand 
Reviewed-by: Jan Viktorin 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c| 3 ---
 lib/librte_eal/common/eal_common_pci.c | 6 --
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 3 ---
 3 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 7fdd6f1..880483d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -623,9 +623,6 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 int
 rte_eal_pci_init(void)
 {
-   TAILQ_INIT(&pci_driver_list);
-   TAILQ_INIT(&pci_device_list);
-
/* for debug purposes, PCI can be disabled */
if (internal_config.no_pci)
return 0;
diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index ba5283d..fee4aa5 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -82,8 +82,10 @@

 #include "eal_private.h"

-struct pci_driver_list pci_driver_list;
-struct pci_device_list pci_device_list;
+struct pci_driver_list pci_driver_list =
+   TAILQ_HEAD_INITIALIZER(pci_driver_list);
+struct pci_device_list pci_device_list =
+   TAILQ_HEAD_INITIALIZER(pci_device_list);

 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index f9c3efd..bfc410f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -743,9 +743,6 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 int
 rte_eal_pci_init(void)
 {
-   TAILQ_INIT(&pci_driver_list);
-   TAILQ_INIT(&pci_device_list);
-
/* for debug purposes, PCI can be disabled */
if (internal_config.no_pci)
return 0;
-- 
2.7.4



[dpdk-dev] [PATCH v3 00/17] prepare for rte_device / rte_driver

2016-06-16 Thread Shreyansh Jain
From: David Marchand 

* Original patch series is from David Marchand. This is just a rebase over
master (d76c19309) *

Following discussions with Jan [1] and some cleanup I started on pci code,
here is a patchset that reworks pdev drivers registration and hotplug api.

The structures changes mentioned in [1] are still to be done, but at least,
I think we are one step closer to it.

Before this patchset, rte_driver .init semantics differed whether it
concerned a pdev or a vdev driver:
- for vdev, it actually meant that a devargs is given to the driver so
  that it creates ethdev / crypto objects, so it was a probing action
- for pdev, it only registered the driver triggering no ethdev / crypto
  objects

From my pov, eal hotplug api introduced in this patchset still needs more
work so that it does not need to know about devargs. So a new devargs api
is needed.

Changes since v2:
- rebase over HEAD (d76c193)
- Move SYSFS_PCI_DRIVERS macro to rte_pci.h to avoid compilation issue

Changes since v1:
- rebased on HEAD, new drivers should be okay
- patches have been split into smaller pieces
- RTE_INIT macro has been added, but in the end, I am not sure it is useful
- device type has been removed from ethdev, as it was used only by hotplug
- getting rid of pmd type in eal patch (patch 5 of initial series) has been
  dropped for now, we can do this once vdev drivers have been converted

[1] http://dpdk.org/ml/archives/dev/2016-January/031390.html

David Marchand (17):
  pci: no need for dynamic tailq init
  crypto: no need for a crypto pmd type
  drivers: align pci driver definitions
  eal: remove duplicate function declaration
  eal: introduce init macros
  crypto: export init/uninit common wrappers for pci drivers
  ethdev: export init/uninit common wrappers for pci drivers
  drivers: convert all pdev drivers as pci drivers
  crypto: get rid of crypto driver register callback
  ethdev: get rid of eth driver register callback
  eal/linux: move back interrupt thread init before setting affinity
  pci: add a helper for device name
  pci: add a helper to update a device
  ethdev: do not scan all pci devices on attach
  eal: add hotplug operations for pci and vdev
  ethdev: convert to eal hotplug
  ethdev: get rid of device type

 app/test/virtual_pmd.c  |   2 +-
 drivers/crypto/qat/rte_qat_cryptodev.c  |  18 +-
 drivers/net/af_packet/rte_eth_af_packet.c   |   2 +-
 drivers/net/bnx2x/bnx2x_ethdev.c|  35 +--
 drivers/net/bonding/rte_eth_bond_api.c  |   2 +-
 drivers/net/cxgbe/cxgbe_ethdev.c|  24 +-
 drivers/net/cxgbe/cxgbe_main.c  |   2 +-
 drivers/net/e1000/em_ethdev.c   |  16 +-
 drivers/net/e1000/igb_ethdev.c  |  40 +--
 drivers/net/ena/ena_ethdev.c|  20 +-
 drivers/net/enic/enic_ethdev.c  |  23 +-
 drivers/net/fm10k/fm10k_ethdev.c|  23 +-
 drivers/net/i40e/i40e_ethdev.c  |  26 +-
 drivers/net/i40e/i40e_ethdev_vf.c   |  25 +-
 drivers/net/ixgbe/ixgbe_ethdev.c|  47 +---
 drivers/net/mlx4/mlx4.c |  22 +-
 drivers/net/mlx5/mlx5.c |  21 +-
 drivers/net/mpipe/mpipe_tilegx.c|   2 +-
 drivers/net/nfp/nfp_net.c   |  23 +-
 drivers/net/null/rte_eth_null.c |   2 +-
 drivers/net/pcap/rte_eth_pcap.c |   2 +-
 drivers/net/ring/rte_eth_ring.c |   2 +-
 drivers/net/szedata2/rte_eth_szedata2.c |  25 +-
 drivers/net/vhost/rte_eth_vhost.c   |   2 +-
 drivers/net/virtio/virtio_ethdev.c  |  26 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c|  23 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c   |   2 +-
 examples/ip_pipeline/init.c |  22 --
 lib/librte_cryptodev/rte_cryptodev.c|  67 +
 lib/librte_cryptodev/rte_cryptodev.h|   2 -
 lib/librte_cryptodev/rte_cryptodev_pmd.h|  45 +---
 lib/librte_cryptodev/rte_cryptodev_version.map  |   9 +-
 lib/librte_eal/bsdapp/eal/eal_pci.c |  52 +++-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   2 +
 lib/librte_eal/common/eal_common_dev.c  |  39 +++
 lib/librte_eal/common/eal_common_pci.c  |  19 +-
 lib/librte_eal/common/eal_private.h |  20 +-
 lib/librte_eal/common/include/rte_dev.h |  29 ++-
 lib/librte_eal/common/include/rte_eal.h |   3 +
 lib/librte_eal/common/include/rte_pci.h |  35 +++
 lib/librte_eal/common/include/rte_tailq.h   |   4 +-
 lib/librte_eal/linuxapp/eal/eal.c   |   7 +-
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  16 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 lib/librte_ether/rte_ethdev.c   | 315 
 lib/librte_ether/rte_ethdev.h   |  40 ++-
 

[dpdk-dev] [PATCH] port: add kni interface support

2016-06-16 Thread Ethan
Hi Cristian,

The new patch has been submitted just now. Please note that I ignored
some checkpatch errors this time.

B.R.
Ethan


2016-06-13 21:18 GMT+08:00 Dumitrescu, Cristian <
cristian.dumitrescu at intel.com>:

> Hi Ethan,
>
>
>
> Great, we'll wait for your patch later this week then. I recommend you add
> any other changes that you might have on top of the latest code that I just
> sent, as this will minimize your work, my code-review work, and the number
> of future iterations needed to merge this patch.
>
>
>
> Answers to your questions are inlined below.
>
>
>
> Regards,
>
> Cristian
>
>
>
> *From:* zhuangweijie at gmail.com [mailto:zhuangweijie at gmail.com] *On 
> Behalf
> Of *Ethan
> *Sent:* Monday, June 13, 2016 11:48 AM
> *To:* Dumitrescu, Cristian 
> *Cc:* dev at dpdk.org; Singh, Jasvinder ; Yigit,
> Ferruh 
> *Subject:* Re: [PATCH] port: add kni interface support
>
>
>
> Hi Cristian,
>
>
>
> I've got your comments. Thank you for reviewing the code from a DPDK
> newbie. :-)
>
> I plan to submit a new patch to fix all during this week hopefully.
>
>
>
> There are four places I'd like to discuss further:
>
>
>
> 1. Dedicated lcore for kni kernel thread
>
> First of all, it is a bug to add the kni kernel core to the user-space core
> mask. What I want is just to check whether the kni kernel thread has a
> dedicated core.
>
> The reason I prefer to allocate a dedicated core to the kni kernel thread
> is that my application is latency sensitive. I worry that context switches
> and cache misses will increase latency if the kni kernel thread and the
> application threads share one core.
>
> Anyway, I think I should remove the hard-coded check because that will be
> more generic. Users with a similar use case to mine can achieve the same
> through the configuration file.
>
>
>
> [Cristian] I agree with you that the user should be able to specify the
> core where the kernel thread should run, and this requirement is fully met
> by the latest code I sent, but implemented in a slightly different way,
> which I think it is a cleaner way.
>
>
>
> In your initial solution, the application redefines the meaning of the
> core mask as the reunion of cores used by the user space application (cores
> running the pipelines) and the cores used to run the kernel space KNI
> threads. This does not make sense to me. The application is in user space
> and it does not start or manage any kernel threads itself, why should the
> application worry about the cores running kernel threads? The application
> should just pick up the user instructions from the config file and send
> them to the KNI kernel module transparently.
>
>
>
> In the code that I just sent, the application preserves the current
> definition of the core mask, i.e. just the collection of cores running the
> pipelines. This leads to simpler code that meets all the requirements for
> kernel threads affinity:
>
> i) The user wants to affinitize the kernel thread to a CPU core that is
> not used to run any pipeline (this core will run just KNI kernel threads):
> Core entry in KNI section is set to be different than the core entry of any
> PIPELINE section in the config file;
>
> ii) The user affinitizes the kernel thread to a CPU core that also runs
> some of the pipelines (this core will run both user space and kernel space
> threads): Core entry in KNI section is equal to the core entry in one or
> several of the PIPELINE sections in the config file;
>
> iii) The user does not affinitize the kernel thread to any CPU core, so
> the kernel decides the scheduling policy for the KNI threads: Core entry of
> the KNI section is not present; this results in force_bind KNI parameter to
> be set to 0.
>
>
>
> Makes sense?
>
>
>
> 2. The compiler error of the Macro RTE_PORT_KNI_WRITER_STATS_PKTS_IN_ADD
>
> Actually I implements the macro similar
> to RTE_PORT_RING_READER_STATS_PKTS_IN_ADD first. But the
> scripts/checkpatches.sh fails: ERROR:COMPLEX_MACRO: Macros with complex
> values should be enclosed in parentheses
>
> I'm not sure whether I have done something wrong or the checkpatch
> script needs an update.
>
>
>
> [Cristian] Let's use the same consistent rule to create the stats macros
> for all the ports, i.e. follow the existing rule used for other ports. You
> can ignore this checkpatch issue.
>
>
>
> 3. KNI kernel operations callback
>
> To be honest, I used the KNI sample application as a reference.
>
> Since there is very little documentation on the difference between the
> link-up call and the device-start call, I am not sure which one is better
> here.
> Any help will be appreciated. :-)
>
>
>
> [Cristian] I suggest you use the ones from the code that I just sent.
>
>
>
> 4. Shall I use DPDK_16.07 in the  librte_port/rte_port_version.map file?
>
>
>
> [Cristian] Yes.
>
>
>
>
>
> 2016-06-10 7:42 GMT+08:00 Dumitrescu, Cristian <
> cristian.dumitrescu at intel.com>:
>
> Hi Ethan,
>
> Great work! There are still several comments below that need to be
> addressed, but I am confident 

[dpdk-dev] [PATCH v3 3/3] port: document update

2016-06-16 Thread WeiJie Zhuang
add kni configuration to the documentation of the ip pipeline sample application

Signed-off-by: WeiJie Zhuang 
---
 doc/guides/sample_app_ug/ip_pipeline.rst | 112 +++
 1 file changed, 83 insertions(+), 29 deletions(-)

diff --git a/doc/guides/sample_app_ug/ip_pipeline.rst 
b/doc/guides/sample_app_ug/ip_pipeline.rst
index 899fd4a..566106b 100644
--- a/doc/guides/sample_app_ug/ip_pipeline.rst
+++ b/doc/guides/sample_app_ug/ip_pipeline.rst
@@ -1,5 +1,5 @@
 ..  BSD LICENSE
-Copyright(c) 2015 Intel Corporation. All rights reserved.
+Copyright(c) 2016 Intel Corporation. All rights reserved.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
@@ -351,33 +351,35 @@ Application resources present in the configuration file

 .. table:: Application resource names in the configuration file

-   
+--+-+-+
-   | Resource type| Format  | Examples 
   |
-   
+==+=+=+
-   | Pipeline | ``PIPELINE``| ``PIPELINE0``, 
``PIPELINE1``|
-   
+--+-+-+
-   | Mempool  | ``MEMPOOL`` | ``MEMPOOL0``, 
``MEMPOOL1``  |
-   
+--+-+-+
-   | Link (network interface) | ``LINK``| ``LINK0``, 
``LINK1``|
-   
+--+-+-+
-   | Link RX queue| ``RXQ.`` | ``RXQ0.0``, 
``RXQ1.5``  |
-   
+--+-+-+
-   | Link TX queue| ``TXQ.`` | ``TXQ0.0``, 
``TXQ1.5``  |
-   
+--+-+-+
-   | Software queue   | ``SWQ`` | ``SWQ0``, 
``SWQ1``  |
-   
+--+-+-+
-   | Traffic Manager  | ``TM`` | ``TM0``, ``TM1`` 
   |
-   
+--+-+-+
-   | Source   | ``SOURCE``  | ``SOURCE0``, 
``SOURCE1``|
-   
+--+-+-+
-   | Sink | ``SINK``| ``SINK0``, 
``SINK1``|
-   
+--+-+-+
-   | Message queue| ``MSGQ``| ``MSGQ0``, 
``MSGQ1``,   |
-   |  | ``MSGQ-REQ-PIPELINE``   | 
``MSGQ-REQ-PIPELINE2``, ``MSGQ-RSP-PIPELINE2,`` |
-   |  | ``MSGQ-RSP-PIPELINE``   | 
``MSGQ-REQ-CORE-s0c1``, ``MSGQ-RSP-CORE-s0c1``  |
-   |  | ``MSGQ-REQ-CORE-`` |  
   |
-   |  | ``MSGQ-RSP-CORE-`` |  
   |
-   
+--+-+-+
+   
++-+-+
+   | Resource type  | Format  | Examples   
 |
+   
++=+=+
+   | Pipeline   | ``PIPELINE``| ``PIPELINE0``, 
``PIPELINE1``|
+   
++-+-+
+   | Mempool| ``MEMPOOL`` | ``MEMPOOL0``, 
``MEMPOOL1``  |
+   
++-+-+
+   | Link (network interface)   | ``LINK``| ``LINK0``, 
``LINK1``|
+   
++-+-+
+   | Link RX queue  | ``RXQ.`` | ``RXQ0.0``, 
``RXQ1.5``  |
+   

[dpdk-dev] [PATCH v3 2/3] port: add kni nodrop writer

2016-06-16 Thread WeiJie Zhuang
1. add no drop writing operations to the kni port
2. support dropless kni config in the ip pipeline sample application

Signed-off-by: WeiJie Zhuang 
---
 examples/ip_pipeline/app.h   |   2 +
 examples/ip_pipeline/config_parse.c  |  31 -
 examples/ip_pipeline/init.c  |  26 -
 examples/ip_pipeline/pipeline_be.h   |   6 +
 lib/librte_port/rte_port_kni.c   | 220 +++
 lib/librte_port/rte_port_kni.h   |  13 +++
 lib/librte_port/rte_port_version.map |   1 +
 7 files changed, 292 insertions(+), 7 deletions(-)

diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index abbd6d4..6a6fdd9 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -147,6 +147,8 @@ struct app_pktq_kni_params {
uint32_t mempool_id; /* Position in the app->mempool_params */
uint32_t burst_read;
uint32_t burst_write;
+   uint32_t dropless;
+   uint64_t n_retries;
 };

 #ifndef APP_FILE_NAME_SIZE
diff --git a/examples/ip_pipeline/config_parse.c 
b/examples/ip_pipeline/config_parse.c
index c55be31..31a50c2 100644
--- a/examples/ip_pipeline/config_parse.c
+++ b/examples/ip_pipeline/config_parse.c
@@ -199,6 +199,8 @@ struct app_pktq_kni_params default_kni_params = {
.mempool_id = 0,
.burst_read = 32,
.burst_write = 32,
+   .dropless = 0,
+   .n_retries = 0,
 };

 struct app_pktq_source_params default_source_params = {
@@ -1927,7 +1929,7 @@ parse_kni(struct app_params *app,

if (strcmp(ent->name, "mempool") == 0) {
int status = validate_name(ent->value,
-  "MEMPOOL", 1);
+   "MEMPOOL", 1);
ssize_t idx;

PARSE_ERROR((status == 0), section_name,
@@ -1940,7 +1942,7 @@ parse_kni(struct app_params *app,

if (strcmp(ent->name, "burst_read") == 0) {
int status = parser_read_uint32(&param->burst_read,
-   ent->value);
+   ent->value);

PARSE_ERROR((status == 0), section_name,
ent->name);
@@ -1949,7 +1951,25 @@ parse_kni(struct app_params *app,

if (strcmp(ent->name, "burst_write") == 0) {
int status = parser_read_uint32(&param->burst_write,
-   ent->value);
+   ent->value);
+
+   PARSE_ERROR((status == 0), section_name,
+   ent->name);
+   continue;
+   }
+
+   if (strcmp(ent->name, "dropless") == 0) {
+   int status = parser_read_arg_bool(ent->value);
+
+   PARSE_ERROR((status != -EINVAL), section_name,
+   ent->name);
+   param->dropless = status;
+   continue;
+   }
+
+   if (strcmp(ent->name, "n_retries") == 0) {
+   int status = parser_read_uint64(&param->n_retries,
+   ent->value);

PARSE_ERROR((status == 0), section_name,
ent->name);
@@ -2794,6 +2814,11 @@ save_kni_params(struct app_params *app, FILE *f)
/* burst_write */
fprintf(f, "%s = %" PRIu32 "\n", "burst_write", p->burst_write);

+   /* dropless */
+   fprintf(f, "%s = %s\n",
+   "dropless",
+   p->dropless ? "yes" : "no");
+
fputc('\n', f);
}
 }
diff --git a/examples/ip_pipeline/init.c b/examples/ip_pipeline/init.c
index d522de4..af24f52 100644
--- a/examples/ip_pipeline/init.c
+++ b/examples/ip_pipeline/init.c
@@ -1434,10 +1434,28 @@ void app_pipeline_params_get(struct app_params *app,
 #ifdef RTE_LIBRTE_KNI
case APP_PKTQ_OUT_KNI:
{
-   out->type = PIPELINE_PORT_OUT_KNI_WRITER;
-   out->params.kni.kni = app->kni[in->id];
-   out->params.kni.tx_burst_sz =
-   app->kni_params[in->id].burst_write;
+   struct app_pktq_kni_params *p_kni =
+   &app->kni_params[in->id];
+
+   if (p_kni->dropless == 0) {
+   struct rte_port_kni_writer_params *params =
+   &out->params.kni;
+
+   out->type = PIPELINE_PORT_OUT_KNI_WRITER;
+   params->kni = app->kni[in->id];
+  

[dpdk-dev] [PATCH v3 1/3] port: add kni interface support

2016-06-16 Thread WeiJie Zhuang
1. add KNI port type to the packet framework
2. add KNI support to the IP Pipeline sample Application
3. some bug fixes

Signed-off-by: WeiJie Zhuang 
---
v2:
* Fix check patch error.
v3:
* Fix code review comments.
---
 doc/api/doxy-api-index.md  |   1 +
 examples/ip_pipeline/Makefile  |   2 +-
 examples/ip_pipeline/app.h | 181 +++-
 examples/ip_pipeline/config/kni.cfg|  67 +
 examples/ip_pipeline/config_check.c|  26 +-
 examples/ip_pipeline/config_parse.c| 166 ++-
 examples/ip_pipeline/init.c| 132 -
 examples/ip_pipeline/pipeline/pipeline_common_fe.c |  29 ++
 examples/ip_pipeline/pipeline/pipeline_master_be.c |   6 +
 examples/ip_pipeline/pipeline_be.h |  27 ++
 lib/librte_port/Makefile   |   7 +
 lib/librte_port/rte_port_kni.c | 325 +
 lib/librte_port/rte_port_kni.h |  82 ++
 lib/librte_port/rte_port_version.map   |   8 +
 14 files changed, 1047 insertions(+), 12 deletions(-)
 create mode 100644 examples/ip_pipeline/config/kni.cfg
 create mode 100644 lib/librte_port/rte_port_kni.c
 create mode 100644 lib/librte_port/rte_port_kni.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f626386..5e7f024 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -118,6 +118,7 @@ There are many libraries, so their headers may be grouped by topics:
 [frag] (@ref rte_port_frag.h),
 [reass](@ref rte_port_ras.h),
 [sched](@ref rte_port_sched.h),
+[kni]  (@ref rte_port_kni.h),
 [src/sink] (@ref rte_port_source_sink.h)
   * [table](@ref rte_table.h):
 [lpm IPv4] (@ref rte_table_lpm.h),
diff --git a/examples/ip_pipeline/Makefile b/examples/ip_pipeline/Makefile
index 5827117..6dc3f52 100644
--- a/examples/ip_pipeline/Makefile
+++ b/examples/ip_pipeline/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
diff --git a/examples/ip_pipeline/app.h b/examples/ip_pipeline/app.h
index 7611341..abbd6d4 100644
--- a/examples/ip_pipeline/app.h
+++ b/examples/ip_pipeline/app.h
@@ -44,6 +44,9 @@
 #include 

 #include 
+#ifdef RTE_LIBRTE_KNI
+#include 
+#endif

 #include "cpu_core_map.h"
 #include "pipeline.h"
@@ -132,6 +135,20 @@ struct app_pktq_swq_params {
uint32_t mempool_indirect_id;
 };

+struct app_pktq_kni_params {
+   char *name;
+   uint32_t parsed;
+
+   uint32_t socket_id;
+   uint32_t core_id;
+   uint32_t hyper_th_id;
+   uint32_t force_bind;
+
+   uint32_t mempool_id; /* Position in the app->mempool_params */
+   uint32_t burst_read;
+   uint32_t burst_write;
+};
+
 #ifndef APP_FILE_NAME_SIZE
 #define APP_FILE_NAME_SIZE   256
 #endif
@@ -185,6 +202,7 @@ enum app_pktq_in_type {
APP_PKTQ_IN_HWQ,
APP_PKTQ_IN_SWQ,
APP_PKTQ_IN_TM,
+   APP_PKTQ_IN_KNI,
APP_PKTQ_IN_SOURCE,
 };

@@ -197,6 +215,7 @@ enum app_pktq_out_type {
APP_PKTQ_OUT_HWQ,
APP_PKTQ_OUT_SWQ,
APP_PKTQ_OUT_TM,
+   APP_PKTQ_OUT_KNI,
APP_PKTQ_OUT_SINK,
 };

@@ -420,6 +439,8 @@ struct app_eal_params {

 #define APP_MAX_PKTQ_TM  APP_MAX_LINKS

+#define APP_MAX_PKTQ_KNI APP_MAX_LINKS
+
 #ifndef APP_MAX_PKTQ_SOURCE
 #define APP_MAX_PKTQ_SOURCE  64
 #endif
@@ -471,6 +492,7 @@ struct app_params {
struct app_pktq_hwq_out_params hwq_out_params[APP_MAX_HWQ_OUT];
struct app_pktq_swq_params swq_params[APP_MAX_PKTQ_SWQ];
struct app_pktq_tm_params tm_params[APP_MAX_PKTQ_TM];
+   struct app_pktq_kni_params kni_params[APP_MAX_PKTQ_KNI];
struct app_pktq_source_params source_params[APP_MAX_PKTQ_SOURCE];
struct app_pktq_sink_params sink_params[APP_MAX_PKTQ_SINK];
struct app_msgq_params msgq_params[APP_MAX_MSGQ];
@@ -482,6 +504,7 @@ struct app_params {
uint32_t n_pktq_hwq_out;
uint32_t n_pktq_swq;
uint32_t n_pktq_tm;
+   uint32_t n_pktq_kni;
uint32_t n_pktq_source;
uint32_t n_pktq_sink;
uint32_t n_msgq;
@@ -495,6 +518,9 @@ struct app_params {
struct app_link_data link_data[APP_MAX_LINKS];
struct rte_ring *swq[APP_MAX_PKTQ_SWQ];
struct rte_sched_port *tm[APP_MAX_PKTQ_TM];
+#ifdef RTE_LIBRTE_KNI
+   struct rte_kni *kni[APP_MAX_PKTQ_KNI];
+#endif /* RTE_LIBRTE_KNI */
struct rte_ring *msgq[APP_MAX_MSGQ];
struct pipeline_type pipeline_type[APP_MAX_PIPELINE_TYPES];
struct app_pipeline_data 

[dpdk-dev] [PATCH v4 0/3] Keep-alive enhancements

2016-06-16 Thread Thomas Monjalon
> Remy Horton (3):
>   eal: export keepalive state enumerations
>   eal: add additional keepalive callbacks
>   examples/l2fwd-keepalive: add IPC liveness reporting

Applied, thanks

Just a last comment: the agent in the example should not appear
in examples/Makefile.


[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Take Ceara
On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  wrote:

>
> Right now I do not know what the issue is with the system. Could be too many 
> Rx/Tx ring pairs per port and limiting the memory in the NICs, which is why 
> you get better performance when you have 8 core per port. I am not really 
> seeing the whole picture and how DPDK is configured to help more. Sorry.

I doubt that there is a limitation wrt running 16 cores per port vs 8
cores per port as I've tried with two different machines connected
back to back each with one X710 port and 16 cores on each of them
running on that port. In that case our performance doubled as
expected.

>
> Maybe seeing the DPDK command line would help.

The command line I use with ports 01:00.3 and 81:00.3 is:
./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00

Our own qmap args allow the user to control exactly how cores are
split between ports. In this case we end up with:

warp17> show port map
Port 0[socket: 0]:
   Core 4[socket:0] (Tx: 0, Rx: 0)
   Core 5[socket:0] (Tx: 1, Rx: 1)
   Core 6[socket:0] (Tx: 2, Rx: 2)
   Core 7[socket:0] (Tx: 3, Rx: 3)
   Core 8[socket:0] (Tx: 4, Rx: 4)
   Core 9[socket:0] (Tx: 5, Rx: 5)
   Core 20[socket:0] (Tx: 6, Rx: 6)
   Core 21[socket:0] (Tx: 7, Rx: 7)
   Core 22[socket:0] (Tx: 8, Rx: 8)
   Core 23[socket:0] (Tx: 9, Rx: 9)
   Core 24[socket:0] (Tx: 10, Rx: 10)
   Core 25[socket:0] (Tx: 11, Rx: 11)
   Core 26[socket:0] (Tx: 12, Rx: 12)
   Core 27[socket:0] (Tx: 13, Rx: 13)
   Core 28[socket:0] (Tx: 14, Rx: 14)
   Core 29[socket:0] (Tx: 15, Rx: 15)

Port 1[socket: 1]:
   Core 10[socket:1] (Tx: 0, Rx: 0)
   Core 11[socket:1] (Tx: 1, Rx: 1)
   Core 12[socket:1] (Tx: 2, Rx: 2)
   Core 13[socket:1] (Tx: 3, Rx: 3)
   Core 14[socket:1] (Tx: 4, Rx: 4)
   Core 15[socket:1] (Tx: 5, Rx: 5)
   Core 16[socket:1] (Tx: 6, Rx: 6)
   Core 17[socket:1] (Tx: 7, Rx: 7)
   Core 18[socket:1] (Tx: 8, Rx: 8)
   Core 19[socket:1] (Tx: 9, Rx: 9)
   Core 30[socket:1] (Tx: 10, Rx: 10)
   Core 31[socket:1] (Tx: 11, Rx: 11)
   Core 32[socket:1] (Tx: 12, Rx: 12)
   Core 33[socket:1] (Tx: 13, Rx: 13)
   Core 34[socket:1] (Tx: 14, Rx: 14)
   Core 35[socket:1] (Tx: 15, Rx: 15)

Just for reference, the cpu_layout script shows:
$ $RTE_SDK/tools/cpu_layout.py

Core and Socket Information (as reported by '/proc/cpuinfo')


cores =  [0, 1, 2, 3, 4, 8, 9, 10, 11, 12]
sockets =  [0, 1]

Socket 0Socket 1

Core 0  [0, 20] [10, 30]
Core 1  [1, 21] [11, 31]
Core 2  [2, 22] [12, 32]
Core 3  [3, 23] [13, 33]
Core 4  [4, 24] [14, 34]
Core 8  [5, 25] [15, 35]
Core 9  [6, 26] [16, 36]
Core 10 [7, 27] [17, 37]
Core 11 [8, 28] [18, 38]
Core 12 [9, 29] [19, 39]

I know it might be complicated to figure out exactly what's happening
in our setup with our own code, so please let me know if you need
additional information.

I appreciate the help!

Thanks,
Dumitru


[dpdk-dev] [PATCH] qat: fix for VFs not getting recognized

2016-06-16 Thread Thomas Monjalon
2016-06-16 16:29, Jain, Deepak K:
> Due to addition of CLASS_ID in EAL, class_id is
> amended into the code.

Why is the VF not recognized?
The class id should not be mandatory.


[dpdk-dev] [PATCH v5 0/7] Remove string operations from xstats

2016-06-16 Thread Thomas Monjalon
> Remy Horton (7):
>   rte: change xstats to use integer ids
>   drivers/net/ixgbe: change xstats to use integer ids
>   drivers/net/e1000: change xstats to use integer ids
>   drivers/net/fm10k: change xstats to use integer ids
>   drivers/net/i40e: change xstats to use integer ids
>   drivers/net/virtio: change xstats to use integer ids
>   rte: change xstats usage to new API

Applied, thanks


[dpdk-dev] [PATCH v3] i40e: configure MTU

2016-06-16 Thread Yong Wang
On 6/16/16, 10:40 AM, "dev on behalf of Yong Wang"  wrote:

>On 5/16/16, 5:27 AM, "dev on behalf of Olivier Matz" on behalf of olivier.matz at 6wind.com> wrote:
>
>>Hi Beilei,
>>
>>On 05/13/2016 10:15 AM, Beilei Xing wrote:
>>> This patch enables configuring MTU for i40e.
>>> Since changing MTU needs to reconfigure queue, stop port first
>>> before configuring MTU.
>>> 
>>> Signed-off-by: Beilei Xing 
>>> ---
>>> v3 changes:
>>>  Add frame size with extra I40E_VLAN_TAG_SIZE.
>>>  Delete i40e_dev_rx_init(pf) cause it will be called when port starts.
>>> 
>>> v2 changes:
>>>  If mtu is not within the allowed range, return -EINVAL instead of -EBUSY.
>>>  Delete rxq reconfigure cause rxq reconfigure will be finished in
>>>  i40e_dev_rx_init.
>>>
>>>  drivers/net/i40e/i40e_ethdev.c | 34 ++
>>>  1 file changed, 34 insertions(+)
>>> 
>>> [...]
>>> +static int
>>> +i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
>>> +{
>>> +   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>>> +   struct rte_eth_dev_data *dev_data = pf->dev_data;
>>> +   uint32_t frame_size = mtu + ETHER_HDR_LEN
>>> + + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE;
>>> +   int ret = 0;
>>> +
>>> +   /* check if mtu is within the allowed range */
>>> +   if ((mtu < ETHER_MIN_MTU) || (frame_size > I40E_FRAME_SIZE_MAX))
>>> +   return -EINVAL;
>>> +
>>> +   /* mtu setting is forbidden if port is start */
>>> +   if (dev_data->dev_started) {
>>> +   PMD_DRV_LOG(ERR,
>>> +   "port %d must be stopped before configuration\n",
>>> +   dev_data->port_id);
>>> +   return -ENOTSUP;
>>> +   }
>>
>>I'm not convinced that ENOTSUP is the proper return value here.
>>It is usually returned when a function is not implemented, which
>>is not the case here: the function is implemented but is forbidden
>>because the port is running.
>>
>>I saw that Julien commented on your v1 that the return value should
>>be one of:
>> - (0) if successful.
>> - (-ENOTSUP) if operation is not supported.
>> - (-ENODEV) if *port_id* invalid.
>> - (-EINVAL) if *mtu* invalid.
>>
>>But I think your initial value (-EBUSY) was fine. Maybe it should be
>>added in the API instead, with the following description:
>>  (-EBUSY) if the operation is not allowed when the port is running
>
>AFAICT, the same check is not done for other drivers that implement
>the mac_set op. Wouldn't it make more sense to have the driver disable

Correction: this should read as mtu_set.

>the port, reconfigure and re-enable it in this case, instead of returning
>error code?  If the consensus in DPDK is to have the application disable
>the port first, we need to enforce this policy across all devices and
>clearly document this behavior.
>
>>This would allow the application to take its dispositions to stop the
>>port and restart it with the proper jumbo_frame argument.
>>
>>+CC Thomas which maintains ethdev API.
>>
>>
>>Regards,
>>Olivier
>



[dpdk-dev] [PATCH v3] i40e: configure MTU

2016-06-16 Thread Yong Wang
On 5/16/16, 5:27 AM, "dev on behalf of Olivier Matz"  wrote:

>Hi Beilei,
>
>On 05/13/2016 10:15 AM, Beilei Xing wrote:
>> This patch enables configuring MTU for i40e.
>> Since changing MTU needs to reconfigure queue, stop port first
>> before configuring MTU.
>> 
>> Signed-off-by: Beilei Xing 
>> ---
>> v3 changes:
>>  Add frame size with extra I40E_VLAN_TAG_SIZE.
>>  Delete i40e_dev_rx_init(pf) cause it will be called when port starts.
>> 
>> v2 changes:
>>  If mtu is not within the allowed range, return -EINVAL instead of -EBUSY.
>>  Delete rxq reconfigure cause rxq reconfigure will be finished in
>>  i40e_dev_rx_init.
>>
>>  drivers/net/i40e/i40e_ethdev.c | 34 ++
>>  1 file changed, 34 insertions(+)
>> 
>> [...]
>> +static int
>> +i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
>> +{
>> +struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>> +struct rte_eth_dev_data *dev_data = pf->dev_data;
>> +uint32_t frame_size = mtu + ETHER_HDR_LEN
>> +  + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE;
>> +int ret = 0;
>> +
>> +/* check if mtu is within the allowed range */
>> +if ((mtu < ETHER_MIN_MTU) || (frame_size > I40E_FRAME_SIZE_MAX))
>> +return -EINVAL;
>> +
>> +/* mtu setting is forbidden if port is start */
>> +if (dev_data->dev_started) {
>> +PMD_DRV_LOG(ERR,
>> +"port %d must be stopped before configuration\n",
>> +dev_data->port_id);
>> +return -ENOTSUP;
>> +}
>
>I'm not convinced that ENOTSUP is the proper return value here.
>It is usually returned when a function is not implemented, which
>is not the case here: the function is implemented but is forbidden
>because the port is running.
>
>I saw that Julien commented on your v1 that the return value should
>be one of:
> - (0) if successful.
> - (-ENOTSUP) if operation is not supported.
> - (-ENODEV) if *port_id* invalid.
> - (-EINVAL) if *mtu* invalid.
>
>But I think your initial value (-EBUSY) was fine. Maybe it should be
>added in the API instead, with the following description:
>  (-EBUSY) if the operation is not allowed when the port is running

AFAICT, the same check is not done for other drivers that implement
the mac_set op. Wouldn't it make more sense to have the driver disable
the port, reconfigure and re-enable it in this case, instead of returning
error code?  If the consensus in DPDK is to have the application disable
the port first, we need to enforce this policy across all devices and
clearly document this behavior.

>This would allow the application to take its dispositions to stop the
>port and restart it with the proper jumbo_frame argument.
>
>+CC Thomas which maintains ethdev API.
>
>
>Regards,
>Olivier



[dpdk-dev] [PATCH v5 1/4] lib/librte_ether: support device reset

2016-06-16 Thread Thomas Monjalon
2016-06-15 11:03, Wenzhuo Lu:
> +/**
> + * Reset an Ethernet device.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + */
> +int
> +rte_eth_dev_reset(uint8_t port_id);

Please explain in the doxygen comment what a reset means.
We must understand why and when an application should call it.
And it must be clear for a PMD developer how to implement it.
What is the return value?
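
As a strawman, an expanded comment could look like the sketch below. The wording, semantics, and return codes here are illustrative suggestions only, not the committed API:

```c
/**
 * Reset an Ethernet device.
 *
 * Intended for recovering a non-working device (e.g. after a VF receives
 * a reset event from the PF): the PMD stops the port, releases and
 * re-acquires its hardware resources, and restores the previous
 * configuration so the application can restart traffic.
 *
 * @param port_id
 *   The port identifier of the Ethernet device.
 * @return
 *   - (0) if successful.
 *   - (-ENODEV) if *port_id* is invalid.
 *   - (-ENOTSUP) if the device does not support reset.
 */
int rte_eth_dev_reset(uint8_t port_id);
```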


[dpdk-dev] [PATCH v3 1/2] ethdev: add tunnel and port RSS offload types

2016-06-16 Thread Jerin Jacob
On Fri, Apr 01, 2016 at 07:59:33PM +0530, Jerin Jacob wrote:
> On Fri, Apr 01, 2016 at 04:04:13PM +0200, Thomas Monjalon wrote:
> > 2016-03-31 02:21, Jerin Jacob:
> > > - added VXLAN, GENEVE and NVGRE tunnel flow types
> > > - added PORT flow type for accounting physical/virtual
> > > port or channel number in flow creation
> >
> > These API change could be considered for 16.07 if they are motivated
> > by any use. Please bring some use cases, thanks.
> 
> The use case is to spray the packets to multiple queues using RSS on
> Tunnel type packets.
> 
> Considering the case if RSS hash does not account inner packet in tunnel
> case, the packet always to go a particular queue as mostly likely
> outer header remains same in tunnel packets and RSS spread
> will not be achieved in tunnel packets case.
> 
> This feature is part of the RSS capability of ThunderX
> NIC HW. Which, we are planning to upstream on next release.
> 
> I thought of pushing the common code changes first.

Ping

Can we merge this changeset if there are no concerns?
There is also a real consumer for this:
http://dpdk.org/ml/archives/dev/2016-June/041374.html

Jerin


[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Take Ceara
On Thu, Jun 16, 2016 at 4:58 PM, Wiles, Keith  wrote:
>
> From the output below it appears the x710 devices 01:00.[0-3] are on socket 0
> And the x710 devices 02:00.[0-3] sit on socket 1.
>

I assume there's a mistake here. The x710 devices on socket 0 are:
$ lspci | grep -ie "01:.*x710"
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
01:00.2 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
01:00.3 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)

and the X710 devices on socket 1 are:
$ lspci | grep -ie "81:.*x710"
81:00.0 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
81:00.1 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
81:00.2 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)
81:00.3 Ethernet controller: Intel Corporation Ethernet Controller
X710 for 10GbE SFP+ (rev 01)

> This means the ports on 01.00.xx should be handled by socket 0 CPUs and 
> 02:00.xx should be handled by Socket 1. I can not tell if that is the case 
> for you here. The CPUs or lcores from the cpu_layout.py should help 
> understand the layout.
>

That was the first scenario I tried:
- assign 16 CPUs from socket 0 to port 0 (01:00.3)
- assign 16 CPUs from socket 1 to port 1 (81:00.3)

Our performance measurements then show a setup rate of 1.6M sess/s,
which is less than half of what I get when I install both X710 on
socket 1 and use only 16 CPUs from socket 1 for both ports.

I double-checked the CPU layout. We also have our own CLI and warnings
when using cores that are not on the same socket as the port they're
assigned to, so the mapping should be fine.

Thanks,
Dumitru


[dpdk-dev] [PATCH v5 1/1] eal: fix resource leak of mapped memory

2016-06-16 Thread Marcin Kerlin
Patch fixes resource leak in rte_eal_hugepage_attach() where mapped files
were not freed back to the OS in case of failure. Patch uses the behavior
of Linux munmap: "It is not an error if the indicated range does not
contain any mapped pages".

Coverity issue: 13295, 13296, 13303
Fixes: af75078fece3 ("first public release")

Signed-off-by: Marcin Kerlin 
Acked-by: Sergio Gonzalez Monroy 
---
v5:
 -shift the history of changes
v4:
 -removed keyword const from pointer and dependent on that casting (void *)
v3:
 -removed redundant casting
 -removed update error message
v2:
 -unmapping also previous addresses

 lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 79d1d2d..c935765 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1399,7 +1399,7 @@ int
 rte_eal_hugepage_attach(void)
 {
const struct rte_mem_config *mcfg = 
rte_eal_get_configuration()->mem_config;
-   const struct hugepage_file *hp = NULL;
+   struct hugepage_file *hp = NULL;
unsigned num_hp = 0;
unsigned i, s = 0; /* s used to track the segment number */
off_t size;
@@ -1481,7 +1481,7 @@ rte_eal_hugepage_attach(void)

size = getFileSize(fd_hugepage);
hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
-   if (hp == NULL) {
+   if (hp == MAP_FAILED) {
RTE_LOG(ERR, EAL, "Could not mmap %s\n", 
eal_hugepage_info_path());
goto error;
}
@@ -1545,12 +1545,19 @@ rte_eal_hugepage_attach(void)
s++;
}
/* unmap the hugepage config file, since we are done using it */
-   munmap((void *)(uintptr_t)hp, size);
+   munmap(hp, size);
close(fd_zero);
close(fd_hugepage);
return 0;

 error:
+   s = 0;
+   while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0) {
+   munmap(mcfg->memseg[s].addr, mcfg->memseg[s].len);
+   s++;
+   }
+   if (hp != NULL && hp != MAP_FAILED)
+   munmap(hp, size);
if (fd_zero >= 0)
close(fd_zero);
if (fd_hugepage >= 0)
-- 
1.9.1



[dpdk-dev] [PATCHv7 1/6] pmdinfogen: Add buildtools and pmdinfogen utility

2016-06-16 Thread Panu Matilainen
On 06/16/2016 04:33 PM, Neil Horman wrote:
> On Thu, Jun 16, 2016 at 03:29:57PM +0300, Panu Matilainen wrote:
>> On 06/09/2016 08:46 PM, Neil Horman wrote:
>>> pmdinfogen is a tool used to parse object files and build json strings for
>>> use in later determining hardware support in a dso or application binary.
>>> pmdinfo looks for the non-exported symbol names this_pmd_name and
>>> this_pmd_tbl (where n is an integer counter).  It records the name of
>>> each of these tuples, using the latter to find the symbolic name of the
>>> pci_table for physical devices that the object supports.  With this
>>> information, it outputs a C file with a single line of the form:
>>>
>>> static char *_driver_info[] __attribute__((used)) = " \
>>> PMD_DRIVER_INFO=";
>>>
>>> Where  is the arbitrary name of the pmd, and  is the
>>> json encoded string that hold relevant pmd information, including the pmd
>>> name, type and optional array of pci device/vendor ids that the driver
>>> supports.
>>>
>>> This C file is suitable for compiling to object code, then relocatably
>>> linking into the parent file from which the C was generated.  This creates
>>> an entry in the string table of the object that can inform a later tool
>>> about hardware support.
>>>
>>> Signed-off-by: Neil Horman 
>>> CC: Bruce Richardson 
>>> CC: Thomas Monjalon 
>>> CC: Stephen Hemminger 
>>> CC: Panu Matilainen 
>>> ---
>>
>> Unlike earlier versions, pmdinfogen ends up installed in bindir during "make
>> install". Is that intentional, or just a side-effect from using
>> rte.hostapp.mk? If its intentional it probably should be prefixed with dpdk_
>> like the other tools.
>>
> I'm not sure what the answer is here.  As you can see, Thomas and I argued at
> length over which makefile to use, and I gave up, so I suppose you can call it
> intentional.  Being in bindir makes a reasonable amount of sense I suppose, as
> 3rd party developers can use it during their independent driver development.

Right, it'd be useful for 3rd party driver developers, so let's consider 
it intentional :)

> I'm not sure I agree with prefixing it though.  Given that the hostapp.mk file
> installs everything there, and nothing that previously used that make file 
> had a
> dpdk_ prefix that I can tell, I'm not sure why this would.  pmdinfogen seems
> like a pretty unique name, and I know of no other project that uses the term 
> pmd
> to describe anything.

I agree about "pmd" being fairly unique as is, but if pmdinfo is dpdk_ 
prefixed then this should be too, or neither should be prefixed. I don't 
personally care which way, but it should be consistent.

- Panu -

>
> Neil
>
>>  - Panu -
>>
>>



[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

2016-06-16 Thread Thomas Monjalon
2016-06-16 15:32, Bruce Richardson:
> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > Why does this particular PMD need spinlocks when doing RX and TX, while
> > > other device types do not? How is adding/removing devices from a bonded
> > > device different to other control operations that can be done on physical
> > > PMDs? Is this not similar to say bringing down or hotplugging out a 
> > > physical
> > > port just before an RX or TX operation takes place?
> > > For all other PMDs we rely on the app to synchronise control and data 
> > > plane
> > > operation - why not here?
> > > 
> > > /Bruce
> > 
> > This issue arose during VM live migration testing. 
> > For VM live migration it is necessary (while traffic is running) to be able 
> > to remove a bonded slave device, stop it, close it and detach it.
> > If a slave device is removed from a bonded device while traffic is running 
> > a segmentation fault may occur in the rx/tx burst function. The spinlock 
> > has been added to prevent this occurring.
> > 
> > The bonding device already uses a spinlock to synchronise between the add 
> > and remove functionality and the slave_link_status_change_monitor code. 
> > 
> > Previously testpmd did not allow, stop, close or detach of PMD while 
> > traffic was running. Testpmd has been modified with the following patchset 
> > 
> > http://dpdk.org/dev/patchwork/patch/13472/
> > 
> > It now allows stop, close and detach of a PMD provided it is not 
> > forwarding and is not a slave of bonded PMD.
> > 
> I will admit to not being fully convinced, but if nobody else has any serious
> objections, and since this patch has been reviewed and acked, I'm ok to merge 
> it
> in. I'll do so shortly.

Please hold on.
Seeing locks introduced in the Rx/Tx path is an alert.
We clearly need a design document to explain where locks can be used
and what are the responsibility of the control plane.
If everybody agrees in this document that DPDK can have some locks
in the fast path, then OK to merge it.

So I would say NACK for 16.07 and maybe postpone to 16.11.



[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Wiles, Keith

On 6/16/16, 11:56 AM, "dev on behalf of Wiles, Keith"  wrote:

>
>On 6/16/16, 11:20 AM, "Take Ceara"  wrote:
>
>>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  
>>wrote:
>>
>>>
>>> Right now I do not know what the issue is with the system. Could be too 
>>> many Rx/Tx ring pairs per port and limiting the memory in the NICs, which 
>>> is why you get better performance when you have 8 core per port. I am not 
>>> really seeing the whole picture and how DPDK is configured to help more. 
>>> Sorry.
>>
>>I doubt that there is a limitation wrt running 16 cores per port vs 8
>>cores per port as I've tried with two different machines connected
>>back to back each with one X710 port and 16 cores on each of them
>>running on that port. In that case our performance doubled as
>>expected.
>>
>>>
>>> Maybe seeing the DPDK command line would help.
>>
>>The command line I use with ports 01:00.3 and 81:00.3 is:
>>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>>
>>Our own qmap args allow the user to control exactly how cores are
>>split between ports. In this case we end up with:
>>
>>warp17> show port map
>>Port 0[socket: 0]:
>>   Core 4[socket:0] (Tx: 0, Rx: 0)
>>   Core 5[socket:0] (Tx: 1, Rx: 1)
>>   Core 6[socket:0] (Tx: 2, Rx: 2)
>>   Core 7[socket:0] (Tx: 3, Rx: 3)
>>   Core 8[socket:0] (Tx: 4, Rx: 4)
>>   Core 9[socket:0] (Tx: 5, Rx: 5)
>>   Core 20[socket:0] (Tx: 6, Rx: 6)
>>   Core 21[socket:0] (Tx: 7, Rx: 7)
>>   Core 22[socket:0] (Tx: 8, Rx: 8)
>>   Core 23[socket:0] (Tx: 9, Rx: 9)
>>   Core 24[socket:0] (Tx: 10, Rx: 10)
>>   Core 25[socket:0] (Tx: 11, Rx: 11)
>>   Core 26[socket:0] (Tx: 12, Rx: 12)
>>   Core 27[socket:0] (Tx: 13, Rx: 13)
>>   Core 28[socket:0] (Tx: 14, Rx: 14)
>>   Core 29[socket:0] (Tx: 15, Rx: 15)
>>
>>Port 1[socket: 1]:
>>   Core 10[socket:1] (Tx: 0, Rx: 0)
>>   Core 11[socket:1] (Tx: 1, Rx: 1)
>>   Core 12[socket:1] (Tx: 2, Rx: 2)
>>   Core 13[socket:1] (Tx: 3, Rx: 3)
>>   Core 14[socket:1] (Tx: 4, Rx: 4)
>>   Core 15[socket:1] (Tx: 5, Rx: 5)
>>   Core 16[socket:1] (Tx: 6, Rx: 6)
>>   Core 17[socket:1] (Tx: 7, Rx: 7)
>>   Core 18[socket:1] (Tx: 8, Rx: 8)
>>   Core 19[socket:1] (Tx: 9, Rx: 9)
>>   Core 30[socket:1] (Tx: 10, Rx: 10)
>>   Core 31[socket:1] (Tx: 11, Rx: 11)
>>   Core 32[socket:1] (Tx: 12, Rx: 12)
>>   Core 33[socket:1] (Tx: 13, Rx: 13)
>>   Core 34[socket:1] (Tx: 14, Rx: 14)
>>   Core 35[socket:1] (Tx: 15, Rx: 15)
>
>Each socket has 10 physical cores (20 lcores), for 40 lcores total.
>
>The above lists LCORES (hyper-threads), not COREs; some like to think they
>are interchangeable. The problem is that hyper-threads are logically
>interchangeable, but not performance-wise. If you have two run-to-completion
>threads on a single physical core, each on a different hyper-thread of that
>core [0,1], then the second lcore or thread (1) on that physical core will
>get at most about 20-30% of the CPU cycles. Normally it is much less, unless
>you tune the code to make sure the threads are not competing for the internal
>execution units, but some execution units are always shared.
>
>To get the best performance when hyper-threading is enabled, do not run both
>threads on a single physical core; run only hyper-thread 0.
>
>The table below lists the physical core id and the lcore ids per socket. Use
>the first lcore of each pair for the best performance:
>Core 1 [1, 21][11, 31]
>Use lcore 1 or 11 depending on the socket you are on.
>
>The info below most likely gives the best performance and utilization of your
>system, if I got the values right:
>
>./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
>--qmap 0.0x0003FE --qmap 1.0x0FFE00
>
>Port 0[socket: 0]:
>   Core 2[socket:0] (Tx: 0, Rx: 0)
>   Core 3[socket:0] (Tx: 1, Rx: 1)
>   Core 4[socket:0] (Tx: 2, Rx: 2)
>   Core 5[socket:0] (Tx: 3, Rx: 3)
>   Core 6[socket:0] (Tx: 4, Rx: 4)
>   Core 7[socket:0] (Tx: 5, Rx: 5)
>   Core 8[socket:0] (Tx: 6, Rx: 6)
>   Core 9[socket:0] (Tx: 7, Rx: 7)
>
>8 cores on first socket leaving 0-1 lcores for Linux.

Correction: 9 cores, leaving the first core (two lcores) for Linux.
>
>Port 1[socket: 1]:
>   Core 10[socket:1] (Tx: 0, Rx: 0)
>   Core 11[socket:1] (Tx: 1, Rx: 1)
>   Core 12[socket:1] (Tx: 2, Rx: 2)
>   Core 13[socket:1] (Tx: 3, Rx: 3)
>   Core 14[socket:1] (Tx: 4, Rx: 4)
>   Core 15[socket:1] (Tx: 5, Rx: 5)
>   Core 16[socket:1] (Tx: 6, Rx: 6)
>   Core 17[socket:1] (Tx: 7, Rx: 7)
>   Core 18[socket:1] (Tx: 8, Rx: 8)
>   Core 19[socket:1] (Tx: 9, Rx: 9)
>
>All 10 cores on the second socket.
>
>++Keith
>
>>
>>Just for reference, the cpu_layout script shows:
>>$ $RTE_SDK/tools/cpu_layout.py
>>
>>Core and Socket Information (as reported by '/proc/cpuinfo')
>>
>>
>>cores =  [0, 1, 

[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Wiles, Keith

On 6/16/16, 11:20 AM, "Take Ceara"  wrote:

>On Thu, Jun 16, 2016 at 5:29 PM, Wiles, Keith  wrote:
>
>>
>> Right now I do not know what the issue is with the system. Could be too many 
>> Rx/Tx ring pairs per port and limiting the memory in the NICs, which is why 
>> you get better performance when you have 8 core per port. I am not really 
>> seeing the whole picture and how DPDK is configured to help more. Sorry.
>
>I doubt that there is a limitation wrt running 16 cores per port vs 8
>cores per port as I've tried with two different machines connected
>back to back each with one X710 port and 16 cores on each of them
>running on that port. In that case our performance doubled as
>expected.
>
>>
>> Maybe seeing the DPDK command line would help.
>
>The command line I use with ports 01:00.3 and 81:00.3 is:
>./warp17 -c 0xF3   -m 32768 -w :81:00.3 -w :01:00.3 --
>--qmap 0.0x003FF003F0 --qmap 1.0x0FC00FFC00
>
>Our own qmap args allow the user to control exactly how cores are
>split between ports. In this case we end up with:
>
>warp17> show port map
>Port 0[socket: 0]:
>   Core 4[socket:0] (Tx: 0, Rx: 0)
>   Core 5[socket:0] (Tx: 1, Rx: 1)
>   Core 6[socket:0] (Tx: 2, Rx: 2)
>   Core 7[socket:0] (Tx: 3, Rx: 3)
>   Core 8[socket:0] (Tx: 4, Rx: 4)
>   Core 9[socket:0] (Tx: 5, Rx: 5)
>   Core 20[socket:0] (Tx: 6, Rx: 6)
>   Core 21[socket:0] (Tx: 7, Rx: 7)
>   Core 22[socket:0] (Tx: 8, Rx: 8)
>   Core 23[socket:0] (Tx: 9, Rx: 9)
>   Core 24[socket:0] (Tx: 10, Rx: 10)
>   Core 25[socket:0] (Tx: 11, Rx: 11)
>   Core 26[socket:0] (Tx: 12, Rx: 12)
>   Core 27[socket:0] (Tx: 13, Rx: 13)
>   Core 28[socket:0] (Tx: 14, Rx: 14)
>   Core 29[socket:0] (Tx: 15, Rx: 15)
>
>Port 1[socket: 1]:
>   Core 10[socket:1] (Tx: 0, Rx: 0)
>   Core 11[socket:1] (Tx: 1, Rx: 1)
>   Core 12[socket:1] (Tx: 2, Rx: 2)
>   Core 13[socket:1] (Tx: 3, Rx: 3)
>   Core 14[socket:1] (Tx: 4, Rx: 4)
>   Core 15[socket:1] (Tx: 5, Rx: 5)
>   Core 16[socket:1] (Tx: 6, Rx: 6)
>   Core 17[socket:1] (Tx: 7, Rx: 7)
>   Core 18[socket:1] (Tx: 8, Rx: 8)
>   Core 19[socket:1] (Tx: 9, Rx: 9)
>   Core 30[socket:1] (Tx: 10, Rx: 10)
>   Core 31[socket:1] (Tx: 11, Rx: 11)
>   Core 32[socket:1] (Tx: 12, Rx: 12)
>   Core 33[socket:1] (Tx: 13, Rx: 13)
>   Core 34[socket:1] (Tx: 14, Rx: 14)
>   Core 35[socket:1] (Tx: 15, Rx: 15)

Each socket has 10 physical cores (20 lcores), for 40 lcores total.

The above lists LCORES (hyper-threads), not COREs; some like to think they are
interchangeable. The problem is that hyper-threads are logically
interchangeable, but not performance-wise. If you have two run-to-completion
threads on a single physical core, each on a different hyper-thread of that
core [0,1], then the second lcore or thread (1) on that physical core will get
at most about 20-30% of the CPU cycles. Normally it is much less, unless you
tune the code to make sure the threads are not competing for the internal
execution units, but some execution units are always shared.

To get the best performance when hyper-threading is enabled, do not run both
threads on a single physical core; run only hyper-thread 0.

The table below lists the physical core id and the lcore ids per socket. Use
the first lcore of each pair for the best performance:
Core 1 [1, 21][11, 31]
Use lcore 1 or 11 depending on the socket you are on.

The info below most likely gives the best performance and utilization of your
system, if I got the values right:

./warp17 -c 0x0FFFe0   -m 32768 -w :81:00.3 -w :01:00.3 --
--qmap 0.0x0003FE --qmap 1.0x0FFE00

Port 0[socket: 0]:
   Core 2[socket:0] (Tx: 0, Rx: 0)
   Core 3[socket:0] (Tx: 1, Rx: 1)
   Core 4[socket:0] (Tx: 2, Rx: 2)
   Core 5[socket:0] (Tx: 3, Rx: 3)
   Core 6[socket:0] (Tx: 4, Rx: 4)
   Core 7[socket:0] (Tx: 5, Rx: 5)
   Core 8[socket:0] (Tx: 6, Rx: 6)
   Core 9[socket:0] (Tx: 7, Rx: 7)

8 cores on first socket leaving 0-1 lcores for Linux.

Port 1[socket: 1]:
   Core 10[socket:1] (Tx: 0, Rx: 0)
   Core 11[socket:1] (Tx: 1, Rx: 1)
   Core 12[socket:1] (Tx: 2, Rx: 2)
   Core 13[socket:1] (Tx: 3, Rx: 3)
   Core 14[socket:1] (Tx: 4, Rx: 4)
   Core 15[socket:1] (Tx: 5, Rx: 5)
   Core 16[socket:1] (Tx: 6, Rx: 6)
   Core 17[socket:1] (Tx: 7, Rx: 7)
   Core 18[socket:1] (Tx: 8, Rx: 8)
   Core 19[socket:1] (Tx: 9, Rx: 9)

All 10 cores on the second socket.

++Keith

>
>Just for reference, the cpu_layout script shows:
>$ $RTE_SDK/tools/cpu_layout.py
>
>Core and Socket Information (as reported by '/proc/cpuinfo')
>
>
>cores =  [0, 1, 2, 3, 4, 8, 9, 10, 11, 12]
>sockets =  [0, 1]
>
>Socket 0Socket 1
>
>Core 0  [0, 20] [10, 30]
>Core 1  [1, 21] [11, 31]
>Core 2  [2, 22] [12, 32]
>Core 3  [3, 23] [13, 33]
>Core 4  

[dpdk-dev] [PATCH v5] eal: out-of-bounds write

2016-06-16 Thread Slawomir Mrozowicz
Overrunning array mcfg->memseg of 256 44-byte elements
at element index 257 using index j.
Fixed by adding a bounds check with an error message.

Fixes: af75078fece3 ("first public release")
Coverity ID 13282

Signed-off-by: Slawomir Mrozowicz 
---
v5:
- update message
v4:
- remove check condition from loop
v3:
- add check condition inside and outside the loop
v2:
- add message information
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..ffe069c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1301,6 +1301,14 @@ rte_eal_hugepage_init(void)
break;
}

+   if (j >= RTE_MAX_MEMSEG) {
+   RTE_LOG(ERR, EAL,
+   "All memory segments exhausted by IVSHMEM. "
+   "Try recompiling with larger RTE_MAX_MEMSEG "
+   "than current %d\n", RTE_MAX_MEMSEG);
+   return -ENOMEM;
+   }
+
for (i = 0; i < nr_hugefiles; i++) {
new_memseg = 0;

-- 
1.9.1



[dpdk-dev] [PATCH v2] xenvirt: fix compilation after mempool changes

2016-06-16 Thread Bruce Richardson
On Mon, Jun 13, 2016 at 01:54:29PM +0200, Christian Ehrhardt wrote:
> Yeah, working now - thanks for the fast update!
> 
> Kind Regards,
> Christian
> 
> Christian Ehrhardt
> Software Engineer, Ubuntu Server
> Canonical Ltd
> 
Applied to dpdk-next-net/rel_16_07

/Bruce


[dpdk-dev] [PATCH] app/testpmd: unchecked return value

2016-06-16 Thread Thomas Monjalon
> > Calling rte_eth_dev_rss_hash_update without checking return value.
> > Fixed by handling the return value and printing the error status.
> > 
> > Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings")
> > Coverity ID 119251
> > 
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Pablo de Lara 

Applied, thanks


[dpdk-dev] [PATCH v5 00/25] DPDK PMD for ThunderX NIC device

2016-06-16 Thread Jerin Jacob
On Thu, Jun 16, 2016 at 11:58:27AM +0100, Bruce Richardson wrote:
> On Thu, Jun 16, 2016 at 03:01:02PM +0530, Jerin Jacob wrote:
> > On Wed, Jun 15, 2016 at 03:39:25PM +0100, Bruce Richardson wrote:
> > > On Wed, Jun 15, 2016 at 12:36:15AM +0530, Jerin Jacob wrote:
> > > > This patch set provides the initial version of DPDK PMD for the
> > > > built-in NIC device in Cavium ThunderX SoC family.
> > > > 
> > > > Implemented features and ThunderX nicvf PMD documentation added
> > > > in doc/guides/nics/overview.rst and doc/guides/nics/thunderx.rst
> > > > respectively in this patch set.
> > > > 
> > > > These patches are checked using checkpatch.sh with following
> > > > additional ignore option:
> > > > options="$options --ignore=CAMELCASE,BRACKET_SPACE"
> > > > CAMELCASE - To accommodate PRIx64
> > > > BRACKET_SPACE - To accommodate AT&T inline assembly in two places
> > > > 
> > > > This patch set is based on DPDK 16.07-RC1
> > > > and tested with git HEAD change-set
> > > > ca173a909538a2f1082cd0dcb4d778a97dab69c3 along with
> > > > following depended patch
> > > > 
> > > > http://dpdk.org/dev/patchwork/patch/11826/
> > > > ethdev: add tunnel and port RSS offload types
> > > > 
> > > Hi Jerin,
> > > 
> > > hopefully a final set of comments before merge on this set, as it's 
> > > looking
> > > very good now.
> > > 
> > > * Two patches look like they need to be split, as they are combining 
> > > multiple
> > >   functions into one patch. They are:
> > > [dpdk-dev,v5,16/25] net/thunderx: add MTU set and promiscuous enable 
> > > support
> > > [dpdk-dev,v5,20/25] net/thunderx: implement supported ptype get and 
> > > Rx queue count
> > >   For the other patches which add multiple functions, the functions seem 
> > > to be
> > >   logically related so I don't think there is a problem
> > > 
> > > * check-git-logs.sh is warning about a few of the commit messages being 
> > > too long.
> > >   Splitting patch 20 should fix one of those, but there are a few 
> > > remaining.
> > >   A number of titles refer to ThunderX in the message, but this is 
> > > probably
> > >   unnecessary, as the prefix already contains "net/thunderx" in it.
> > 
> > OK. I will send the next revision.
> > 
> 
> Please hold off a few hours, as I'm hoping to merge in the bnxt driver this
> afternoon. If all goes well, I would appreciate it if you could base your 
> patchset
> off the rel_16_07 tree with that set applied - save me having to resolve 
> conflicts
> in files like the nic overview doc, which is always a pain to try and edit. 
> :-)

OK. I will re-base the changes once you have done with bnxt merge.
Let me know once its done.

> 
> Regards,
> /Bruce


[dpdk-dev] [PATCH 0/2] vhost: Fix leaks on migration.

2016-06-16 Thread Yuanhan Liu
Thanks for fixing them!

Would you please resend them, with a rebase based on master branch
of following tree:

http://dpdk.org/browse/next/dpdk-next-virtio/

--yliu

On Thu, Jun 16, 2016 at 11:32:03AM +0300, Ilya Maximets wrote:
> Ilya Maximets (2):
>   vhost: fix leak of file descriptors.
>   vhost: unmap log memory on cleanup.
> 
>  lib/librte_vhost/rte_virtio_net.h |  3 ++-
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 16 ++--
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 
> -- 
> 2.7.4


[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

2016-06-16 Thread Iremonger, Bernard
Hi Thomas,

> 2016-06-16 15:32, Bruce Richardson:
> > On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> > > > Why does this particular PMD need spinlocks when doing RX and TX,
> > > > while other device types do not? How is adding/removing devices
> > > > from a bonded device different to other control operations that
> > > > can be done on physical PMDs? Is this not similar to say bringing
> > > > down or hotplugging out a physical port just before an RX or TX
> operation takes place?
> > > > For all other PMDs we rely on the app to synchronise control and
> > > > data plane operation - why not here?
> > > >
> > > > /Bruce
> > >
> > > This issue arose during VM live migration testing.
> > > For VM live migration it is necessary (while traffic is running) to be 
> > > able to
> remove a bonded slave device, stop it, close it and detach it.
> > > If a slave device is removed from a bonded device while traffic is running
> a segmentation fault may occur in the rx/tx burst function. The spinlock has
> been added to prevent this occurring.
> > >
> > > The bonding device already uses a spinlock to synchronise between the
> add and remove functionality and the slave_link_status_change_monitor
> code.
> > >
> > > Previously testpmd did not allow, stop, close or detach of PMD while
> > > traffic was running. Testpmd has been modified with the following
> > > patchset
> > >
> > > http://dpdk.org/dev/patchwork/patch/13472/
> > >
> > > It now allows stop, close and detach of a PMD provided it is not
> forwarding and is not a slave of bonded PMD.
> > >
> > I will admit to not being fully convinced, but if nobody else has any
> > serious objections, and since this patch has been reviewed and acked,
> > I'm ok to merge it in. I'll do so shortly.
> 
> Please hold on.
> Seeing locks introduced in the Rx/Tx path is an alert.
> We clearly need a design document to explain where locks can be used and
> what are the responsibility of the control plane.
> If everybody agrees in this document that DPDK can have some locks in the
> fast path, then OK to merge it.
> 
> So I would say NACK for 16.07 and maybe postpone to 16.11.

Looking at the documentation for the bonding PMD.

http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html

In section 10.2 it states the following:

Bonded devices support the dynamical addition and removal of slave devices 
using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs.

If a slave device is added or removed while traffic is running, there is the 
possibility of a segmentation fault in the rx/tx burst functions. This is most 
likely to occur in the round robin bonding mode.
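The trylock approach under discussion can be sketched outside DPDK. This shows only the pattern (Python's threading.Lock stands in for the per-queue rte_spinlock; the function names are illustrative, not the bonding PMD code):

```python
import threading

port_lock = threading.Lock()  # stands in for the per-queue rte_spinlock

def rx_burst(active_slaves):
    """Data path: try the lock; if the control plane holds it (e.g. mid
    slave-remove), return 0 packets instead of touching the slave list."""
    if not port_lock.acquire(blocking=False):
        return 0
    try:
        return sum(active_slaves)  # stand-in for polling each slave queue
    finally:
        port_lock.release()

def remove_slave(slaves, idx):
    """Control path: hold the lock so no burst runs mid-removal."""
    with port_lock:
        del slaves[idx]

slaves = [1, 1, 1]
print(rx_burst(slaves))   # -> 3
remove_slave(slaves, 0)
print(rx_burst(slaves))   # -> 2
```

The key property is that the data path never blocks: a busy lock just makes one poll return empty, which is why the measured overhead stays fractions of a percent.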

This patch set fixes what appears to be a bug in the bonding PMD.

Performance measurements have been made with this patch set applied and without 
the patches applied using 64 byte packets. 

With the patches applied the following drop in performance was observed:

% drop for fwd+io:  0.16%
% drop for fwd+mac: 0.39%

This patch set has been reviewed and ack'ed, so I think it should be applied in 
16.07

Regards,

Bernard.




[dpdk-dev] [PATCH v3 0/5] vhost/virtio performance loopback utility

2016-06-16 Thread Thomas Monjalon
> > Zhihong Wang (5):
> >   testpmd: add retry option
> >   testpmd: configurable tx_first burst number
> >   testpmd: show throughput in port stats
> >   testpmd: handle all rxqs in rss setup
> >   testpmd: show topology at forwarding start
> 
> Series-acked-by: Pablo de Lara 

Applied, thanks


[dpdk-dev] [PATCH v3 5/5] testpmd: show topology at forwarding start

2016-06-16 Thread Thomas Monjalon
2016-06-16 11:09, De Lara Guarch, Pablo:
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -1016,6 +1016,7 @@ start_packet_forwarding(int with_tx_first)
> > flush_fwd_rx_queues();
> > 
> > fwd_config_setup();
> > +   fwd_config_display();
> > rxtx_config_display();
> > 
> > for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
> > --
> > 2.5.0
> 
> Already acked this, but note that fwd_config_display() has been renamed to 
> pkt_fwd_config_display().
> Thomas, can you make that change when merging this?

Yes done :)


[dpdk-dev] [PATCH v5 00/25] DPDK PMD for ThunderX NIC device

2016-06-16 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 04:47:39PM +0530, Jerin Jacob wrote:
> On Thu, Jun 16, 2016 at 11:58:27AM +0100, Bruce Richardson wrote:
> > On Thu, Jun 16, 2016 at 03:01:02PM +0530, Jerin Jacob wrote:
> > > On Wed, Jun 15, 2016 at 03:39:25PM +0100, Bruce Richardson wrote:
> > > > On Wed, Jun 15, 2016 at 12:36:15AM +0530, Jerin Jacob wrote:
> > > > > This patch set provides the initial version of DPDK PMD for the
> > > > > built-in NIC device in Cavium ThunderX SoC family.
> > > > > 
> > > > > Implemented features and ThunderX nicvf PMD documentation added
> > > > > in doc/guides/nics/overview.rst and doc/guides/nics/thunderx.rst
> > > > > respectively in this patch set.
> > > > > 
> > > > > These patches are checked using checkpatch.sh with following
> > > > > additional ignore option:
> > > > > options="$options --ignore=CAMELCASE,BRACKET_SPACE"
> > > > > CAMELCASE - To accommodate PRIx64
> > > > > BRACKET_SPACE - To accommodate AT&T inline assembly in two places
> > > > > 
> > > > > This patch set is based on DPDK 16.07-RC1
> > > > > and tested with git HEAD change-set
> > > > > ca173a909538a2f1082cd0dcb4d778a97dab69c3 along with
> > > > > following depended patch
> > > > > 
> > > > > http://dpdk.org/dev/patchwork/patch/11826/
> > > > > ethdev: add tunnel and port RSS offload types
> > > > > 
> > > > Hi Jerin,
> > > > 
> > > > hopefully a final set of comments before merge on this set, as it's 
> > > > looking
> > > > very good now.
> > > > 
> > > > * Two patches look like they need to be split, as they are combining 
> > > > multiple
> > > >   functions into one patch. They are:
> > > > [dpdk-dev,v5,16/25] net/thunderx: add MTU set and promiscuous 
> > > > enable support
> > > > [dpdk-dev,v5,20/25] net/thunderx: implement supported ptype get and 
> > > > Rx queue count
> > > >   For the other patches which add multiple functions, the functions 
> > > > seem to be
> > > >   logically related so I don't think there is a problem
> > > > 
> > > > * check-git-logs.sh is warning about a few of the commit messages being 
> > > > too long.
> > > >   Splitting patch 20 should fix one of those, but there are a few 
> > > > remaining.
> > > >   A number of titles refer to ThunderX in the message, but this is 
> > > > probably
> > > >   unnecessary, as the prefix already contains "net/thunderx" in it.
> > > 
> > > OK. I will send the next revision.
> > > 
> > 
> > Please hold off a few hours, as I'm hoping to merge in the bnxt driver this
> > afternoon. If all goes well, I would appreciate it if you could base your 
> > patchset
> > off the rel_16_07 tree with that set applied - save me having to resolve 
> > conflicts
> > in files like the nic overview doc, which is always a pain to try and edit. 
> > :-)
> 
> OK. I will re-base the changes once you have done with bnxt merge.
> Let me know once its done.
> 
Done now. Feel free to submit a new version based on rel_16_07 branch.

Thanks,
/Bruce


[dpdk-dev] [PATCHv7 5/6] pmdinfo.py: Add tool to query binaries for hw and other support information

2016-06-16 Thread Panu Matilainen
On 06/09/2016 08:47 PM, Neil Horman wrote:
> This tool searches for the primer string PMD_DRIVER_INFO= in any ELF binary,
> and, if found, parses the remainder of the string as a JSON-encoded string,
> outputting the results in either a human readable or raw, script parseable
> format
>
> Note that, in the case of dynamically linked applications, pmdinfo.py will
> scan for implicitly linked PMDs by searching the specified binaries
> .dynamic section for DT_NEEDED entries that contain the substring
> librte_pmd.  The DT_RUNPATH, LD_LIBRARY_PATH, /usr/lib and /lib are
> searched for these libraries, in that order
>
> If a file is specified with no path, it is assumed to be a PMD DSO, and the
> LD_LIBRARY_PATH, /usr/lib[64]/ and /lib[64] is searched for it
>
> Currently the tool can output data in 3 formats:
>
> a) raw, suitable for scripting, where the raw JSON strings are dumped out
> b) table format (default) where hex pci ids are dumped in a table format
> c) pretty, where a user supplied pci.ids file is used to print out vendor
> and device strings
>
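The marker-scanning idea is simple enough to sketch in a few lines. Note this is only an illustration of the approach, not the real pmdinfo.py; the fake ELF blob and the record it contains are made up:

```python
import json

MARKER = b"PMD_DRIVER_INFO="

def extract_pmd_info(data):
    """Collect every JSON payload that follows the marker string in a
    binary blob (each record assumed NUL-terminated, as in ELF strings)."""
    records = []
    start = data.find(MARKER)
    while start != -1:
        payload_start = start + len(MARKER)
        end = data.find(b"\0", payload_start)
        payload = data[payload_start:end] if end != -1 else data[payload_start:]
        records.append(json.loads(payload.decode()))
        start = data.find(MARKER, payload_start)
    return records

# Build a fake "binary" with one embedded record and scan it:
blob = b"\x7fELF...." + MARKER + b'{"name": "fake_pmd", "type": "PMD_PDEV"}\0junk'
print(extract_pmd_info(blob))
# -> [{'name': 'fake_pmd', 'type': 'PMD_PDEV'}]
```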
> Signed-off-by: Neil Horman 
> CC: Bruce Richardson 
> CC: Thomas Monjalon 
> CC: Stephen Hemminger 
> CC: Panu Matilainen 
> ---
>  mk/rte.sdkinstall.mk |   2 +
>  tools/pmdinfo.py | 629 
> +++
>  2 files changed, 631 insertions(+)
>  create mode 100755 tools/pmdinfo.py
>
> diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk
> index 68e56b6..dc36df5 100644
> --- a/mk/rte.sdkinstall.mk
> +++ b/mk/rte.sdkinstall.mk
> @@ -126,6 +126,8 @@ install-runtime:
>   $(Q)$(call rte_mkdir,  $(DESTDIR)$(sbindir))
>   $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/dpdk_nic_bind.py, 
> \
>  $(DESTDIR)$(sbindir)/dpdk_nic_bind)
> + $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/pmdinfo.py, \
> +$(DESTDIR)$(bindir)/dpdk-pmdinfo)

The symlink should be with underscore instead of dash for consistency 
with all the other tools, ie dpdk_pmdinfo.

Neil, I already gave you an ack on the series as per the functionality, 
feel free to include that in any future versions of the patch series. 
Minor nits like these are ... well, minor nits from my POV at least.

- Panu -


[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

2016-06-16 Thread Bruce Richardson
On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote:
> Hi Bruce,
> 
> 
> 
> > Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
> > burst functions
> > 
> > On Sun, Jun 12, 2016 at 06:11:28PM +0100, Bernard Iremonger wrote:
> > > Use rte_spinlock_trylock() in the rx/tx burst functions to take the
> > > queue spinlock.
> > >
> > > Signed-off-by: Bernard Iremonger 
> > > Acked-by: Konstantin Ananyev 
> > > ---
> > 
> > Why does this particular PMD need spinlocks when doing RX and TX, while
> > other device types do not? How is adding/removing devices from a bonded
> > device different to other control operations that can be done on physical
> > PMDs? Is this not similar to say bringing down or hotplugging out a physical
> > port just before an RX or TX operation takes place?
> > For all other PMDs we rely on the app to synchronise control and data plane
> > operation - why not here?
> > 
> > /Bruce
> 
> This issue arose during VM live migration testing. 
> For VM live migration it is necessary (while traffic is running) to be able 
> to remove a bonded slave device, stop it, close it and detach it.
> If a slave device is removed from a bonded device while traffic is running a 
> segmentation fault may occur in the rx/tx burst function. The spinlock has 
> been added to prevent this occurring.
> 
> The bonding device already uses a spinlock to synchronise between the add and 
> remove functionality and the slave_link_status_change_monitor code. 
> 
> Previously testpmd did not allow, stop, close or detach of PMD while traffic 
> was running. Testpmd has been modified with the following patchset 
> 
> http://dpdk.org/dev/patchwork/patch/13472/
> 
> It now allows stop, close and detach of a PMD provided it is not 
> forwarding and is not a slave of bonded PMD.
> 
I will admit to not being fully convinced, but if nobody else has any serious
objections, and since this patch has been reviewed and acked, I'm ok to merge it
in. I'll do so shortly.

/Bruce


[dpdk-dev] [PATCHv7 1/6] pmdinfogen: Add buildtools and pmdinfogen utility

2016-06-16 Thread Panu Matilainen
On 06/09/2016 08:46 PM, Neil Horman wrote:
> pmdinfogen is a tool used to parse object files and build json strings for
> use in later determining hardware support in a dso or application binary.
> pmdinfo looks for the non-exported symbol names this_pmd_name and
> this_pmd_tbl (where n is an integer counter).  It records the name of
> each of these tuples, using the latter to find the symbolic name of the
> pci_table for physical devices that the object supports.  With this
> information, it outputs a C file with a single line of the form:
>
> static char *_driver_info[] __attribute__((used)) = " \
>   PMD_DRIVER_INFO=";
>
> Where  is the arbitrary name of the pmd, and  is the
> json encoded string that hold relevant pmd information, including the pmd
> name, type and optional array of pci device/vendor ids that the driver
> supports.
>
> This c file is suitable for compiling to object code, then relocatably
> linking into the parent file from which the C was generated.  This creates
> an entry in the string table of the object that can inform a later tool
> about hardware support.
>
> Signed-off-by: Neil Horman 
> CC: Bruce Richardson 
> CC: Thomas Monjalon 
> CC: Stephen Hemminger 
> CC: Panu Matilainen 
> ---

Unlike earlier versions, pmdinfogen ends up installed in bindir during 
"make install". Is that intentional, or just a side-effect from using 
rte.hostapp.mk? If its intentional it probably should be prefixed with 
dpdk_ like the other tools.

- Panu -



[dpdk-dev] [PATCH v6 00/38] new bnxt poll mode driver library

2016-06-16 Thread Bruce Richardson
On Wed, Jun 15, 2016 at 02:23:00PM -0700, Stephen Hurd wrote:
> The bnxt poll mode library (librte_pmd_bnxt) implements support for
> Broadcom NetXtreme C-Series.  These adapters support Standards-
> compliant 10/25/50Gbps 30MPPS full-duplex throughput.
> 
> Information about this family of adapters can be found in the
> NetXtreme Brand section https://goo.gl/4H7q63 of the Broadcom web
> site http://www.broadcom.com/
> 
> With the current driver, allocated mbufs must be large enough to hold
> the entire received frame.  If the mbufs are not large enough, the
> packets will be dropped.  This is most limiting when jumbo frames are
> used.
> 

Applied to dpdk-next-net/rel_16_07

On apply I got conflicts with the nic overview document, so please check the
resulting information in that document is correct in the next-net tree.
I also added a very short entry to the release notes for this new driver as
part of patch 1, since that was missing. Please also check that for correctness
and send on any additional comments/corrections you want on that.

Thanks for all the work on this driver.

Regards,
/Bruce


[dpdk-dev] [PATCH v3] rte_hash: add scalable multi-writer insertion w/ Intel TSX

2016-06-16 Thread Wei Shen
This patch introduces scalable multi-writer Cuckoo hash insertion
based on a split Cuckoo search-and-move operation using Intel
TSX. It achieves scalable hash insertion on 22 cores with little
performance loss and a negligible TSX abort rate.

* Added an extra rte_hash flag definition to switch default single writer
  Cuckoo Hash behavior to multiwriter.
- If HTM is available, it would use hardware feature for concurrency.
- If HTM is not available, it would fall back to spinlock.

* Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related
  cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
  select x86 file or other platform-specific implementations. While HTM check
  is still done at runtime (same idea with
  RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)

* Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
  rte_cuckoo_hash_x86.h or future platform dependent functions to include.

* Following new functions are created for consistent names when new platform
  TM support are added.
- rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement.
- rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement.

* One extra multi-writer test case is added.

Signed-off-by: Shen Wei 
Signed-off-by: Sameh Gobriel 
---
 app/test/Makefile  |   1 +
 app/test/test_hash_multiwriter.c   | 287 +
 doc/guides/rel_notes/release_16_07.rst |  12 ++
 lib/librte_hash/rte_cuckoo_hash.c  | 258 ++---
 lib/librte_hash/rte_cuckoo_hash.h  | 219 +
 lib/librte_hash/rte_cuckoo_hash_x86.h  | 193 ++
 lib/librte_hash/rte_hash.h |   3 +
 7 files changed, 796 insertions(+), 177 deletions(-)
 create mode 100644 app/test/test_hash_multiwriter.c
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.h
 create mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 053f3a2..5476300 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -120,6 +120,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_thash.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
 SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
+SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c

 SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
 SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
diff --git a/app/test/test_hash_multiwriter.c b/app/test/test_hash_multiwriter.c
new file mode 100644
index 000..b0f31b0
--- /dev/null
+++ b/app/test/test_hash_multiwriter.c
@@ -0,0 +1,287 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in
+ *  the documentation and/or other materials provided with the
+ *  distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *  contributors may be used to endorse or promote products derived
+ *  from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+/*
+ * Check condition and return an error if true. Assumes that "handle" is the
+ * name of the hash structure pointer to be freed.
+ */
+#define RETURN_IF_ERROR(cond, str, ...) do {\
+   if (cond) { \
+   printf("ERROR line %d: " str "\n", __LINE__,\
+   ##__VA_ARGS__); \
+   if (handle)

[dpdk-dev] [PATCH v3] rte_hash: add scalable multi-writer insertion w/ Intel TSX

2016-06-16 Thread Wei Shen
Here's the latest version of the rte_hash multi-writer patch.
It's re-based on top of the latest head as of Jun 16, 2016.

http://dpdk.org/dev/patchwork/patch/13886/
http://dpdk.org/dev/patchwork/patch/12589/

v3 changes:

* Made spinlock the fallback behavior when the developer chooses multi-writer
  mode and HTM is not available.
* Created a rte_cuckoo_hash_x86.h file to hold all x86-specific related
  cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
  select x86 file or other platform-specific implementations. While HTM check
  is still done at runtime (same with RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
* Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
  rte_cuckoo_hash_x86.h or future platform dependent functions to include.
* Following renaming for consistent names when new platform TM support are
  added.
  - rte_hash_cuckoo_insert_mw_tm for trying insertion without moving buckets
   around.
  - rte_hash_cuckoo_move_insert_mw_tm for trying insertion by moving buckets
   around.

v2 changes:

* Address issues pointed out by reviews on mailing list.
* Removed the RTE_HASH_KEY_FLAG_MOVED flag used in v1, which would cause
  problem when key deletion happens.

Wei Shen (1):
  rte_hash: add scalable multi-writer insertion w/ Intel TSX

 app/test/Makefile  |   1 +
 app/test/test_hash_multiwriter.c   | 287 +
 doc/guides/rel_notes/release_16_07.rst |  12 ++
 lib/librte_hash/rte_cuckoo_hash.c  | 258 ++---
 lib/librte_hash/rte_cuckoo_hash.h  | 219 +
 lib/librte_hash/rte_cuckoo_hash_x86.h  | 193 ++
 lib/librte_hash/rte_hash.h |   3 +
 7 files changed, 796 insertions(+), 177 deletions(-)
 create mode 100644 app/test/test_hash_multiwriter.c
 create mode 100644 lib/librte_hash/rte_cuckoo_hash.h
 create mode 100644 lib/librte_hash/rte_cuckoo_hash_x86.h

-- 
2.5.5



[dpdk-dev] [PATCH v5 00/25] DPDK PMD for ThunderX NIC device

2016-06-16 Thread Jerin Jacob
On Wed, Jun 15, 2016 at 03:39:25PM +0100, Bruce Richardson wrote:
> On Wed, Jun 15, 2016 at 12:36:15AM +0530, Jerin Jacob wrote:
> > This patch set provides the initial version of DPDK PMD for the
> > built-in NIC device in Cavium ThunderX SoC family.
> > 
> > Implemented features and ThunderX nicvf PMD documentation added
> > in doc/guides/nics/overview.rst and doc/guides/nics/thunderx.rst
> > respectively in this patch set.
> > 
> > These patches are checked using checkpatch.sh with following
> > additional ignore option:
> > options="$options --ignore=CAMELCASE,BRACKET_SPACE"
> > CAMELCASE - To accommodate PRIx64
> > BRACKET_SPACE - To accommodate AT&T inline assembly in two places
> > 
> > This patch set is based on DPDK 16.07-RC1
> > and tested with git HEAD change-set
> > ca173a909538a2f1082cd0dcb4d778a97dab69c3 along with
> > following depended patch
> > 
> > http://dpdk.org/dev/patchwork/patch/11826/
> > ethdev: add tunnel and port RSS offload types
> > 
> Hi Jerin,
> 
> hopefully a final set of comments before merge on this set, as it's looking
> very good now.
> 
> * Two patches look like they need to be split, as they are combining multiple
>   functions into one patch. They are:
> [dpdk-dev,v5,16/25] net/thunderx: add MTU set and promiscuous enable 
> support
> [dpdk-dev,v5,20/25] net/thunderx: implement supported ptype get and Rx 
> queue count
>   For the other patches which add multiple functions, the functions seem to be
>   logically related so I don't think there is a problem
> 
> * check-git-logs.sh is warning about a few of the commit messages being too 
> long.
>   Splitting patch 20 should fix one of those, but there are a few remaining.
>   A number of titles refer to ThunderX in the message, but this is probably
>   unnecessary, as the prefix already contains "net/thunderx" in it.

OK. I will send the next revision.

> 
> Regards,
> /Bruce
> 
> PS: Please also baseline patches on dpdk-next-net/rel_16_07 tree. They 
> currently
> apply fine to that tree so there is no problem, but just in case later commits
> break things, that is the tree that net patches should be based on.


[dpdk-dev] Performance hit - NICs on different CPU sockets

2016-06-16 Thread Wiles, Keith
On 6/16/16, 9:36 AM, "Take Ceara"  wrote:

>Hi Keith,
>
>On Tue, Jun 14, 2016 at 3:47 PM, Wiles, Keith  wrote:
 Normally the limitation is in the hardware, basically how the PCI bus is 
 connected to the CPUs (or sockets). How the PCI buses are connected to the 
 system depends on the Mother board design. I normally see the buses 
 attached to socket 0, but you could have some of the buses attached to the 
 other sockets or all on one socket via a PCI bridge device.

 No easy way around the problem if some of your PCI buses are split or all 
 on a single socket. Need to look at your system docs or look at lspci it 
 has an option to dump the PCI bus as an ASCII tree, at least on Ubuntu.
>>>
>>>This is the motherboard we use on our system:
>>>
>>>http://www.supermicro.com/products/motherboard/Xeon/C600/X10DRX.cfm
>>>
>>>I need to swap some NICs around (as now we moved everything on socket
>>>1) before I can share the lspci output.
>>
>> FYI: the option for lspci is 'lspci -tv', but maybe more options too.
>>
>
>I retested with two 10G X710 ports connected back to back:
>port 0: :01:00.3 - socket 0
>port 1: :81:00.3 - socket 1

Please provide the output from tools/cpu_layout.py.

>
>I ran the following scenarios:
>- assign 16 threads from CPU 0 on socket 0 to port 0 and 16 threads
>from CPU 1 to port 1 => setup rate of 1.6M sess/s
>- assign only the 16 threads from CPU0 for both ports (so 8 threads on
>socket 0 for port 0 and 8 threads on socket 0 for port 1) => setup
>rate of 3M sess/s
>- assign only the 16 threads from CPU1 for both ports (so 8 threads on
>socket 1 for port 0 and 8 threads on socket 1 for port 1) => setup
>rate of 3M sess/s
>
>I also tried a scenario with two machines connected back to back each
>of which had a NIC on socket 1. I assigned 16 threads from socket 1 on
>each machine to the port and performance scaled to 6M sess/s as
>expected.
>
>I double checked all our memory allocations and, at least in the
>tested scenario, we never use memory that's not on the same socket as
>the core.
>
>I pasted below the output of lspci -tv. I see that :01:00.3 and
>:81:00.3 are connected to different PCI bridges but on each of
>those bridges there are also "Intel Corporation Xeon E7 v3/Xeon E5
>v3/Core i7 DMA Channel " devices.
>
>It would be great if you could also take a look in case I
>missed/misunderstood something.
>
>Thanks,
>Dumitru
>
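As a quick cross-check of NIC-to-socket placement, the Linux kernel also exposes each PCI device's NUMA node via sysfs. A small helper (assumes the standard Linux sysfs layout; the PCI addresses are the ones from this thread):

```python
import os

def pci_numa_node(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node the kernel reports for a PCI device,
    or -1 when the attribute is missing or unreadable."""
    path = os.path.join(sysfs_root, pci_addr, "numa_node")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return -1

# e.g. on the system above, pci_numa_node("0000:81:00.3") should report 1
# and pci_numa_node("0000:01:00.3") should report 0.
```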


[dpdk-dev] [PATCH v4] eal: out-of-bounds write

2016-06-16 Thread Panu Matilainen
On 06/15/2016 04:25 PM, Slawomir Mrozowicz wrote:
> Overrunning array mcfg->memseg of 256 44-byte elements
> at element index 257 using index j.
> Fixed by add condition with message information.
>
> Fixes: af75078fece3 ("first public release")
> Coverity ID 13282
>
> Signed-off-by: Slawomir Mrozowicz 
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 5b9132c..19753b1 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -1301,6 +1301,15 @@ rte_eal_hugepage_init(void)
>   break;
>   }
>
> + if (j >= RTE_MAX_MEMSEG) {
> + RTE_LOG(ERR, EAL,
> + "Failed: all memsegs used by ivshmem.\n"
> + "Current %d is not enough.\n"
> + "Please either increase the RTE_MAX_MEMSEG\n",
> + RTE_MAX_MEMSEG);
> + return -ENOMEM;
> + }


The error message is either incomplete or not coherent: "please either 
increase..." or what?

Also no need for that "Failed:" because its already prefixed by 
"Error:". I'm not sure how helpful it is to have an error message 
suggest increasing a value that requires recomplication, but maybe 
something more in the lines of:

("All memory segments exhausted by IVSHMEM. Try recompiling with larger 
RTE_MAX_MEMSEG than current %d?", RTE_MAX_MEMSEG)

- Panu -



[dpdk-dev] [PATCH v4] e1000: configure VLAN TPID

2016-06-16 Thread Zhang, Helin


> -Original Message-
> From: Xing, Beilei
> Sent: Thursday, June 16, 2016 9:36 PM
> To: Zhang, Helin 
> Cc: dev at dpdk.org; Xing, Beilei 
> Subject: [PATCH v4] e1000: configure VLAN TPID
> 
> This patch enables configuring the outer TPID for double VLAN.
> Note that all other TPID values are read only.
> 
> Signed-off-by: Beilei Xing 
Acked-by: Helin Zhang 



[dpdk-dev] [PATCH v4 1/1] eal: fix resource leak of mapped memory

2016-06-16 Thread Sergio Gonzalez Monroy
On 15/06/2016 13:25, Marcin Kerlin wrote:
> Patch fixes resource leak in rte_eal_hugepage_attach() where mapped files
> were not freed back to the OS in case of failure. Patch uses the behavior
> of Linux munmap: "It is not an error if the indicated range does not
> contain any mapped pages".
>
> v4:
> 1)removed keyword const from pointer and dependent on that casting (void *)
> v3:
> 1)removed redundant casting
> 2)removed update error message
> v2:
> 1)unmapping also previous addresses

The patch version history should go after the triple dash below so it 
won't show up in the git log.

> Coverity issue: 13295, 13296, 13303
> Fixes: af75078fece3 ("first public release")
>
> Signed-off-by: Marcin Kerlin 
> ---

Insert here patch version history.

>   lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 79d1d2d..c935765 100644

Thomas, are you ok to update the commit message? Otherwise, Marcin, 
please do a v5 with the changes and keep my ack.

Acked-by: Sergio Gonzalez Monroy 


[dpdk-dev] [PATCH v6 00/38] new bnxt poll mode driver library

2016-06-16 Thread Ajit Khaparde
On Thu, Jun 16, 2016 at 9:24 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Wed, Jun 15, 2016 at 02:23:00PM -0700, Stephen Hurd wrote:
> > The bnxt poll mode library (librte_pmd_bnxt) implements support for
> > Broadcom NetXtreme C-Series.  These adapters support Standards-
> > compliant 10/25/50Gbps 30MPPS full-duplex throughput.
> >
> > Information about this family of adapters can be found in the
> > NetXtreme Brand section https://goo.gl/4H7q63 of the Broadcom web
> > site http://www.broadcom.com/
> >
> > With the current driver, allocated mbufs must be large enough to hold
> > the entire received frame.  If the mbufs are not large enough, the
> > packets will be dropped.  This is most limiting when jumbo frames are
> > used.
> >
>
> Applied to dpdk-next-net/rel_16_07
>
> On apply I got conflicts with the nic overview document, so please check
> the
> resulting information in that document is correct in the next-net tree.
> I also added a very short entry to the release notes for this new driver as
> part of patch 1, since that was missing. Please also check that for
> correctness
> and send on any additional comments/corrections you want on that.
>
Thanks Bruce. I had a cursory glance and it looked good. We will update
them further if necessary.


> Thanks for all the work on this driver.
>
> Regards,
> /Bruce
>


[dpdk-dev] [PATCH] examples/ip_pipeline: fix build error for gcc 4.8

2016-06-16 Thread Jastrzebski, MichalX K
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, June 14, 2016 9:04 PM
> To: Mrzyglod, DanielX T 
> Cc: dev at dpdk.org; Singh, Jasvinder ;
> Dumitrescu, Cristian 
> Subject: Re: [dpdk-dev] [PATCH] examples/ip_pipeline: fix build error for gcc
> 4.8
> 
> 2016-06-09 13:38, Daniel Mrzyglod:
> > This patch fixes a maybe-uninitialized warning when compiling DPDK with
> GCC 4.8
> >
> > examples/ip_pipeline/pipeline/pipeline_common_fe.c: In function
> 'app_pipeline_track_pktq_out_to_link':
> > examples/ip_pipeline/pipeline/pipeline_common_fe.c:66:31: error:
> > 'reader' may be used uninitialized in this function [-Werror=maybe-
> uninitialized]
> >
> >struct app_pktq_out_params *pktq_out =
> >
> > Fixes: 760064838ec0 ("examples/ip_pipeline: link routing output ports to
> devices")
> >
> > Signed-off-by: Daniel Mrzyglod 
> 
> For a weird reason, this patch triggers a new error:
> 
> examples/ip_pipeline/pipeline/pipeline_common_fe.c: In function
> 'app_pipeline_track_pktq_out_to_link':
> examples/ip_pipeline/pipeline/pipeline_common_fe.c:124:11:
> error: 'id' may be used uninitialized in this function [-Werror=maybe-
> uninitialized]
> status = ptype->fe_ops->f_track(,
>^
> In file included from
> examples/ip_pipeline/pipeline/pipeline_common_fe.h:44:0,
>  from examples/ip_pipeline/pipeline/pipeline_common_fe.c:47:
> examples/ip_pipeline/app.h:734:26: note: 'id' was declared here
>   uint32_t n_readers = 0, id, i;
>   ^
> examples/ip_pipeline/pipeline/pipeline_common_fe.c:97:11:
> error: 'id' may be used uninitialized in this function [-Werror=maybe-
> uninitialized]
> status = ptype->fe_ops->f_track(,
>^
> In file included from
> examples/ip_pipeline/pipeline/pipeline_common_fe.h:44:0,
>  from examples/ip_pipeline/pipeline/pipeline_common_fe.c:47:
> examples/ip_pipeline/app.h:674:26: note: 'id' was declared here
>   uint32_t n_readers = 0, id, i;
>   ^

Hi Thomas,
Do you get this error in the same environment?


[dpdk-dev] [PATCH v13 2/3] app/test: test external mempool manager

2016-06-16 Thread David Hunt
Use minimal custom mempool external ops and check that they also
pass the basic mempool autotests.

Signed-off-by: Olivier Matz 
Signed-off-by: David Hunt 
Acked-by: Shreyansh Jain 
Acked-by: Olivier Matz 
---
 app/test/test_mempool.c | 122 +++-
 1 file changed, 120 insertions(+), 2 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index b586249..31582d8 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -83,6 +83,99 @@
 static rte_atomic32_t synchro;

 /*
+ * Simple example of custom mempool structure. Holds pointers to all the
+ * elements which are simply malloc'd in this example.
+ */
+struct custom_mempool {
+   rte_spinlock_t lock;
+   unsigned count;
+   unsigned size;
+   void *elts[];
+};
+
+/*
+ * Loop through all the element pointers and allocate a chunk of memory, then
+ * insert that memory into the ring.
+ */
+static int
+custom_mempool_alloc(struct rte_mempool *mp)
+{
+   struct custom_mempool *cm;
+
+   cm = rte_zmalloc("custom_mempool",
+   sizeof(struct custom_mempool) + mp->size * sizeof(void *), 0);
+   if (cm == NULL)
+   return -ENOMEM;
+
+   rte_spinlock_init(&cm->lock);
+   cm->count = 0;
+   cm->size = mp->size;
+   mp->pool_data = cm;
+   return 0;
+}
+
+static void
+custom_mempool_free(struct rte_mempool *mp)
+{
+   rte_free((void *)(mp->pool_data));
+}
+
+static int
+custom_mempool_enqueue(struct rte_mempool *mp, void * const *obj_table,
+   unsigned n)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)(mp->pool_data);
+   int ret = 0;
+
+   rte_spinlock_lock(&cm->lock);
+   if (cm->count + n > cm->size) {
+   ret = -ENOBUFS;
+   } else {
+   memcpy(&cm->elts[cm->count], obj_table, sizeof(void *) * n);
+   cm->count += n;
+   }
+   rte_spinlock_unlock(&cm->lock);
+   return ret;
+}
+
+
+static int
+custom_mempool_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)(mp->pool_data);
+   int ret = 0;
+
+   rte_spinlock_lock(&cm->lock);
+   if (n > cm->count) {
+   ret = -ENOENT;
+   } else {
+   cm->count -= n;
+   memcpy(obj_table, &cm->elts[cm->count], sizeof(void *) * n);
+   }
+   rte_spinlock_unlock(&cm->lock);
+   return ret;
+}
+
+static unsigned
+custom_mempool_get_count(const struct rte_mempool *mp)
+{
+   struct custom_mempool *cm = (struct custom_mempool *)(mp->pool_data);
+
+   return cm->count;
+}
+
+static struct rte_mempool_ops mempool_ops_custom = {
+   .name = "custom_handler",
+   .alloc = custom_mempool_alloc,
+   .free = custom_mempool_free,
+   .enqueue = custom_mempool_enqueue,
+   .dequeue = custom_mempool_dequeue,
+   .get_count = custom_mempool_get_count,
+};
+
+MEMPOOL_REGISTER_OPS(mempool_ops_custom);
+
+/*
  * save the object number in the first 4 bytes of object data. All
  * other bytes are set to 0.
  */
@@ -292,12 +385,14 @@ static int test_mempool_single_consumer(void)
  * test function for mempool test based on singple consumer and single 
producer,
  * can run on one lcore only
  */
-static int test_mempool_launch_single_consumer(__attribute__((unused)) void 
*arg)
+static int
+test_mempool_launch_single_consumer(__attribute__((unused)) void *arg)
 {
return test_mempool_single_consumer();
 }

-static void my_mp_init(struct rte_mempool * mp, __attribute__((unused)) void * 
arg)
+static void
+my_mp_init(struct rte_mempool *mp, __attribute__((unused)) void *arg)
 {
printf("mempool name is %s\n", mp->name);
/* nothing to be implemented here*/
@@ -477,6 +572,7 @@ test_mempool(void)
 {
struct rte_mempool *mp_cache = NULL;
struct rte_mempool *mp_nocache = NULL;
+   struct rte_mempool *mp_ext = NULL;

	rte_atomic32_init(&synchro);

@@ -505,6 +601,27 @@ test_mempool(void)
goto err;
}

+   /* create a mempool with an external handler */
+   mp_ext = rte_mempool_create_empty("test_ext",
+   MEMPOOL_SIZE,
+   MEMPOOL_ELT_SIZE,
+   RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
+   SOCKET_ID_ANY, 0);
+
+   if (mp_ext == NULL) {
+   printf("cannot allocate mp_ext mempool\n");
+   goto err;
+   }
+   if (rte_mempool_set_ops_byname(mp_ext, "custom_handler", NULL) < 0) {
+   printf("cannot set custom handler\n");
+   goto err;
+   }
+   if (rte_mempool_populate_default(mp_ext) < 0) {
+   printf("cannot populate mp_ext mempool\n");
+   goto err;
+   }
+   rte_mempool_obj_iter(mp_ext, my_obj_init, NULL);
+
/* retrieve the mempool from its name */
if (rte_mempool_lookup("test_nocache") != mp_nocache) {
printf("Cannot lookup mempool from its 

[dpdk-dev] [PATCH v13 1/3] mempool: support external mempool operations

2016-06-16 Thread David Hunt
Until now, the objects stored in a mempool were internally stored in a
ring. This patch introduces the possibility to register external handlers
replacing the ring.

The default behavior remains unchanged, but calling the new function
rte_mempool_set_ops_byname() right after rte_mempool_create_empty() allows
the user to change the handler that will be used when populating
the mempool.

This patch also adds a set of default ops (function callbacks) based
on rte_ring.

Signed-off-by: Olivier Matz 
Signed-off-by: David Hunt 
Acked-by: Shreyansh Jain 
Acked-by: Olivier Matz 
---
 app/test/test_mempool_perf.c   |   1 -
 doc/guides/prog_guide/mempool_lib.rst  |  31 +++-
 doc/guides/rel_notes/deprecation.rst   |   9 -
 lib/librte_mempool/Makefile|   2 +
 lib/librte_mempool/rte_mempool.c   |  66 +++-
 lib/librte_mempool/rte_mempool.h   | 253 ++---
 lib/librte_mempool/rte_mempool_ops.c   | 150 +
 lib/librte_mempool/rte_mempool_ring.c  | 161 ++
 lib/librte_mempool/rte_mempool_version.map |  13 +-
 9 files changed, 605 insertions(+), 81 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_ops.c
 create mode 100644 lib/librte_mempool/rte_mempool_ring.c

diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
index c5e3576..c5f8455 100644
--- a/app/test/test_mempool_perf.c
+++ b/app/test/test_mempool_perf.c
@@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
   n_get_bulk);
if (unlikely(ret < 0)) {
rte_mempool_dump(stdout, mp);
-   rte_ring_dump(stdout, mp->ring);
/* in this case, objects are lost... */
return -1;
}
diff --git a/doc/guides/prog_guide/mempool_lib.rst 
b/doc/guides/prog_guide/mempool_lib.rst
index c3afc2e..2e3116e 100644
--- a/doc/guides/prog_guide/mempool_lib.rst
+++ b/doc/guides/prog_guide/mempool_lib.rst
@@ -34,7 +34,7 @@ Mempool Library
 ===

 A memory pool is an allocator of a fixed-sized object.
-In the DPDK, it is identified by name and uses a ring to store free objects.
+In the DPDK, it is identified by name and uses a ring or an external mempool 
manager to store free objects.
 It provides some other optional services such as a per-core object cache and
 an alignment helper to ensure that objects are padded to spread them equally 
on all DRAM or DDR3 channels.

@@ -127,6 +127,35 @@ The maximum size of the cache is static and is defined at 
compilation time (CONF
A mempool in Memory with its Associated Ring


+External Mempool Manager
+
+
+This allows external memory subsystems, such as external hardware memory
+management systems and software based memory allocators, to be used with DPDK.
+
+There are two aspects to external mempool manager.
+
+* Adding the code for your new mempool operations (ops). This is achieved by
+  adding a new mempool ops code, and using the ``REGISTER_MEMPOOL_OPS`` macro.
+
+* Using the new API to call ``rte_mempool_create_empty()`` and
+  ``rte_mempool_set_ops_byname()`` to create a new mempool and specifying which
+  ops to use.
+
+Several external mempool managers may be used in the same application. A new
+mempool can be created by using the ``rte_mempool_create_empty()`` function,
+then using ``rte_mempool_set_ops_byname()`` to point the mempool to the
+relevant mempool manager callback (ops) structure.
+
+Legacy applications may continue to use the old ``rte_mempool_create()`` API
+call, which uses a ring based mempool manager by default. These applications
+will need to be modified to use a new external mempool manager.
+
+For applications that use ``rte_pktmbuf_create()``, there is a config setting
+(``RTE_MBUF_DEFAULT_MEMPOOL_OPS``) that allows the application to make use of
+an external mempool manager.
+
+
 Use Cases
 -

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 7d947ae..c415095 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -39,15 +39,6 @@ Deprecation Notices
   compact API. The ones that remain are backwards compatible and use the
   per-lcore default cache if available. This change targets release 16.07.

-* The rte_mempool struct will be changed in 16.07 to facilitate the new
-  external mempool manager functionality.
-  The ring element will be replaced with a more generic 'pool' opaque pointer
-  to allow new mempool handlers to use their own user-defined mempool
-  layout. Also newly added to rte_mempool is a handler index.
-  The existing API will be backward compatible, but there will be new API
-  functions added to facilitate the creation of mempools using 

[dpdk-dev] [PATCH v13 0/3] mempool: add external mempool manager

2016-06-16 Thread David Hunt
Here's the latest version of the External Mempool Manager patchset.
It's re-based on top of the latest head as of 15/6/2016, including
Olivier's 35-part patch series on mempool re-org [1]

[1] http://dpdk.org/ml/archives/dev/2016-May/039229.html

v13 changes:

 * Added in extra opaque data (pool_config) to mempool struct for mempool
   configuration by the ops functions. For example, this can be used to pass
  device names or device flags to the underlying alloc function.
 * Added mempool_config param to rte_mempool_set_ops_byname()

v12 changes:

 * Fixed a comment (function param h -> ops)
 * fixed a typo (callbacki)

v11 changes:

 * Fixed comments (added '.' where needed for consistency)
 * removed ABI breakage notice for mempool manager in deprecation.rst
 * Added description of the external mempool manager functionality to
   doc/guides/prog_guide/mempool_lib.rst (John Mc reviewed)
 * renamed rte_mempool_default.c to rte_mempool_ring.c

v10 changes:

 * changed the _put/_get op names to _enqueue/_dequeue to be consistent
   with the function names
 * some rte_errno cleanup
 * comment tweaks about when to set pool_data
 * removed an un-needed check for ops->alloc == NULL

v9 changes:

 * added a check for NULL alloc in rte_mempool_ops_register
 * rte_mempool_alloc_t now returns int instead of void*
 * fixed some comment typos
 * removed some unneeded typecasts
 * changed a return NULL to return -EEXIST in rte_mempool_ops_register
 * fixed rte_mempool_version.map file so builds ok as shared libs
 * moved flags check from rte_mempool_create_empty to rte_mempool_create

v8 changes:

 * merged first three patches in the series into one.
 * changed parameters to ops callback to all be rte_mempool pointer
   rather than than pointer to opaque data or uint64.
 * comment fixes.
 * fixed parameter to _free function (was inconsistent).
 * changed MEMPOOL_F_RING_CREATED to MEMPOOL_F_POOL_CREATED

v7 changes:

 * Changed rte_mempool_handler_table to rte_mempool_ops_table
 * Changed handler_idx to ops_index in rte_mempool struct
 * Reworked comments in rte_mempool.h around ops functions
 * Changed rte_mempool_handler.c to rte_mempool_ops.c
 * Changed all functions containing _handler_ to _ops_
 * Now there is no mention of 'handler' left
 * Other small changes arising from mailing-list review

v6 changes:

 * Moved the flags handling from rte_mempool_create_empty to
   rte_mempool_create, as it's only there for backward compatibility
 * Various comment additions and cleanup
 * Renamed rte_mempool_handler to rte_mempool_ops
 * Added a union for *pool and u64 pool_id in struct rte_mempool
 * split the original patch into a few parts for easier review.
 * rename functions with _ext_ to _ops_.
 * addressed review comments
 * renamed put and get functions to enqueue and dequeue
 * changed occurrences of rte_mempool_ops to const, as they
   contain function pointers (security)
 * split out the default external mempool handler into a separate
   patch for easier review

v5 changes:
 * rebasing, as it is dependent on another patch series [1]

v4 changes (Olivier Matz):
 * remove the rte_mempool_create_ext() function. To change the handler, the
   user has to do the following:
   - mp = rte_mempool_create_empty()
   - rte_mempool_set_handler(mp, "my_handler")
   - rte_mempool_populate_default(mp)
   This avoids to add another function with more than 10 arguments, duplicating
   the doxygen comments
 * change the api of rte_mempool_alloc_t: only the mempool pointer is required
   as all information is available in it
 * change the api of rte_mempool_free_t: remove return value
 * move inline wrapper functions from the .c to the .h (else they won't be
   inlined). This implies to have one header file (rte_mempool.h), or it
   would have generate cross dependencies issues.
 * remove now unused MEMPOOL_F_INT_HANDLER (note: it was misused anyway due
   to the use of && instead of &)
 * fix build in debug mode (__MEMPOOL_STAT_ADD(mp, put_pool, n) remaining)
 * fix build with shared libraries (global handler has to be declared in
   the .map file)
 * rationalize #include order
 * remove unused function rte_mempool_get_handler_name()
 * rename some structures, fields, functions
 * remove the static in front of rte_tailq_elem rte_mempool_tailq (comment
   from Yuanhan)
 * test the ext mempool handler in the same file than standard mempool tests,
   avoiding to duplicate the code
 * rework the custom handler in mempool_test
 * rework a bit the patch selecting default mbuf pool handler
 * fix some doxygen comments

v3 changes:
 * simplified the file layout, renamed to rte_mempool_handler.[hc]
 * moved the default handlers into rte_mempool_default.c
 * moved the example handler out into app/test/test_ext_mempool.c
 * removed is_mc/is_mp change, slight perf degradation on sp cached operation
 * removed stack handler, may re-introduce at a later date
 * Changes out of code reviews

v2 changes:
 * There was a lot of duplicate code between 

[dpdk-dev] enic in passthrough mode tx drops

2016-06-16 Thread Ruth Christen
Hi all,

I'm running a VM attached to two Cisco Virtual Interface Cards in passthrough 
mode in a Cisco UCS. The vNICs are configured in access mode without a VLAN ID.

The incoming packets arrive with an 802.1Q header containing the VLAN priority 
bit according to the class of service configured on the vNIC. I understand this 
is expected from a Fibre Channel over Ethernet card.

According to the DPDK documentation, there's a need to set the VLAN_STRIP_OFFLOAD 
flag and call rte_eth_dev_set_vlan_offload on the ports.

If I run a simple l2fwd application where the same packet received on one port 
is sent out through the other, the traffic works OK.

If I generate the packets in my VM and send them out, traffic doesn't work. (I 
tried sending the traffic out with and without an 802.1Q header with the 
priority bit.)



Is there a specific configuration to be added to the mbuf for the TX packets 
generated in the VM? Could it be the vlan_tci, ol_flags, or any other missing 
flag that needs setting?

Does somebody know the exact behavior of the enic card with the priority 
tagging?



BTW in virtio mode the traffic works in both the flows.



Thanks a lot!





[dpdk-dev] [PATCH v2 00/17] prepare for rte_device / rte_driver

2016-06-16 Thread Jan Viktorin
On Thu, 16 Jun 2016 08:42:29 +
Shreyansh Jain  wrote:

> Hi,
> 
> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Thursday, June 16, 2016 1:04 PM
> > To: Shreyansh Jain 
> > Cc: David Marchand ; viktorin at 
> > rehivetech.com;
> > dev at dpdk.org; Iremonger, Bernard 
> > Subject: Re: [dpdk-dev] [PATCH v2 00/17] prepare for rte_device / rte_driver
> > 
> > 2016-06-16 06:32, Shreyansh Jain:  
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Iremonger, 
> > > Bernard  
> > > > Patches 3,8,16 and 17 no longer apply to the latest master branch.
> > > > A rebase is needed.  
> > >
> > > With the recent most head (04920e6): 01, 03, 08, 15, 16 and 17 are 
> > > failing.
> > >
> > > Just wanted to check if there is a rebase of this series anytime soon?  
> > 
> > I will take care of this series if time permit.  
> 
> Ok.
> By the way, I have already rebased it on master. I can post the patches here 
> if you want.
> (only trivial conflicts were there)

Sounds good. +1

I'd rebase my patchset on top of it and repost.

> 
> > It would help to have more reviews on other series touching EAL, like
> > pmdinfo.  
> 
> Ok. I can try and review this, non-PCI/SoC and similar patchset in next few 
> days.

David's original patchset was quite OK; I didn't have any comments. The thing
is (from my POV) that it was incomplete.

Jan

> 
> >   
> > > I was looking at Jan's non-PCI patchset [1] and they are based on this  
> > series.  
> > >
> > > [1]  
> > http://thread.gmane.org/gmane.comp.networking.dpdk.devel/30913/focus=38486  
> 
> -
> Shreyansh
> 


[dpdk-dev] [PATCH v2 00/17] prepare for rte_device / rte_driver

2016-06-16 Thread Jan Viktorin
On Thu, 16 Jun 2016 11:19:59 +0200
Thomas Monjalon  wrote:

> 2016-06-16 10:23, Jan Viktorin:
> > I think, we should consider to move it to somebody else. I would work on 
> > it, however, I don't see all the tasks that are to be done. That's why I 
> > was waiting to finalize those patchs by David or Thomas. For me, the 
> > important things were to generalize certain things to remove dependency on 
> > PCI. This is mostly done (otherwise the SoC patchset couldn't be done in 
> > the way I've posted it).
> > 
> > Now, there is some pending work to remove pmd_type. Next, to find out some 
> > generalization of rte_pci_device/driver to create rte_device/driver (I've 
> > posted several suggestions in the  of SoC patchset).

For the pmd_type removal, I am not very sure about David's original 
intentions. What should be the result?

Should there be a special struct rte_virt_device or something like that?

> > 
> > What more?  
> 
> We need a clean devargs API in EAL, not directly related to hotplug.
> Then the hotplug can benefit of the devargs API as any other device config.

Do we have some requirements for this? Would it be a complete redefinition
of the API? I don't see the relation to hotplug.

> 
> The EAL resources (also called devices) need an unique naming convention.
> 

No idea about this. What do you mean by the unique naming convention?

Jan


[dpdk-dev] [PATCH v12 0/3] mempool: add external mempool manager

2016-06-16 Thread Hunt, David


On 16/6/2016 9:58 AM, Olivier MATZ wrote:
>>>
>>> So I don't think we should have more cache misses whether it's
>>> placed at the beginning or at the end. Maybe I'm missing something...
>>>
>>> I still believe it's better to group the 2 fields as they are
>>> tightly linked together. It could be at the end if you see better
>>> performance.
>>>
>>
>> OK, I'll leave at the end because of the performance hit.
>
> Sorry, my message was not clear.
> I mean, having both at the end. Do you see a performance
> impact in that case?
>

I ran several more tests, and the average drop I'm seeing on an older 
server is reduced to 1% (local cached use case), with 0% change on 
a newer Haswell server, so I think at this stage we're safe to put it up 
alongside pool_data. There was 0% reduction when I moved both to the 
bottom of the struct. So on the Haswell, it seems to have minimal impact 
regardless of where they go.

I'll post the patch up soon.

Regards,
Dave.







[dpdk-dev] [PATCH] hash: new function to retrieve a key given its position

2016-06-16 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Yari Adan Petralanda [mailto:yari.adan.petralanda at ericsson.com]
> Sent: Thursday, June 16, 2016 9:23 AM
> To: Richardson, Bruce; De Lara Guarch, Pablo; Juan Antonio Montesinos
> Delgado
> Cc: dev at dpdk.org
> Subject: [PATCH] hash: new function to retrieve a key given its position
> 
> The function rte_hash_get_key_with_position is added in this patch.
> As the position returned when adding a key is frequently used as an
> offset into an array of user data, this function performs the operation
> of retrieving a key given this offset.
> 
> A possible use case would be to delete a key from the hash table when
> its entry in the array of data has certain value. For instance, the key
> could be a flow 5-tuple, and the value stored in the array a time stamp.
> 
> Signed-off-by: Juan Antonio Montesinos
> 
> Signed-off-by: Yari Adan Petralanda 
> 
> ---
>   app/test/test_hash.c | 42
> 
>   lib/librte_hash/rte_cuckoo_hash.c| 18 
>   lib/librte_hash/rte_hash.h   | 18 
>   lib/librte_hash/rte_hash_version.map |  7 ++
>   4 files changed, 85 insertions(+)
> 

[...]

> diff --git a/lib/librte_hash/rte_hash_version.map
> b/lib/librte_hash/rte_hash_version.map
> index 4f25436..19a7b26 100644
> --- a/lib/librte_hash/rte_hash_version.map
> +++ b/lib/librte_hash/rte_hash_version.map
> @@ -38,3 +38,10 @@ DPDK_2.2 {
>   rte_hash_set_cmp_func;
> 
>   } DPDK_2.1;
> +
> +DPDK_16.04 {

This should be DPDK_16.07.

> + global:
> +
> + rte_hash_get_key_with_position;
> +
> +}; DPDK_2.2
> --
> 2.1.4
> 



[dpdk-dev] [PATCH v4] eal: out-of-bounds write

2016-06-16 Thread Sergio Gonzalez Monroy
On 15/06/2016 14:25, Slawomir Mrozowicz wrote:
> Overrunning array mcfg->memseg of 256 44-byte elements
> at element index 257 using index j.
> Fixed by adding a condition with an informative message.
>
> Fixes: af75078fece3 ("first public release")
> Coverity ID 13282
>
> Signed-off-by: Slawomir Mrozowicz 
> ---

Acked-by: Sergio Gonzalez Monroy 


[dpdk-dev] random pkt generator PMD

2016-06-16 Thread Yerden Zhumabekov
On 15.06.2016 19:02, Neil Horman wrote:
> On Wed, Jun 15, 2016 at 03:43:56PM +0600, Yerden Zhumabekov wrote:
>> Hello everybody,
>>
>> DPDK already got a number of PMDs for various eth devices, it even has PMD
>> emulations for backends such as pcap, sw rings etc.
>>
>> I've been thinking about the idea of having PMD which would generate mbufs
>> on the fly in some randomized fashion. This would serve goals like, for
>> example:
>>
>> 1) running tests for applications with network processing capabilities
>> without additional software packet generators;
>> 2) making performance measurements with no hw interference;
>> 3) ability to run without root privileges, --no-pci, --no-huge, for CI
>> build, so on.
>>
>> Maybe there's no such need, and these goals may be achieved by other means
>> and this idea is flawed? Any thoughts?
>>
> I think you already have a solution to this problem.  Linux/BSD have multiple
> user space packet generators that can dump their output to a pcap format file,
> and dpdk has a pcap pmd that accepts a pcap file as input to send in packets.

Things that I don't like about the idea of using PCAP PMD:

1) the need to create additional files with additional scripts and keep 
those with your test suite;
2) the need to rewind pcap once you played it (fixable);
3) reading packets one by one, with file operations that may hurt 
performance;
4) low variability among source packets.

Those are the things that led me to the idea of a randomized packet generator 
PMD. Possible devargs could be:
1) id of a template, like "ipv4", "ipv6", "dot1q" etc;
2) size of mbuf payload;
3) an array of tuples like (offset, size, value), with the value being either 
an exact value or the "rnd" keyword.


[dpdk-dev] [PATCH v2] enic: scattered Rx

2016-06-16 Thread Nelson Escobar
For performance reasons, this patch uses 2 VIC RQs per RQ presented to
DPDK.

The VIC requires that each descriptor be marked as either a start of
packet (SOP) descriptor or a non-SOP descriptor.  A one RQ solution
requires skipping descriptors when receiving small packets and results
in bad performance when receiving many small packets.

The 2 RQ solution makes use of the VIC feature that allows a receive
on primary queue to 'spill over' into another queue if the receive is
too large to fit in the buffer assigned to the descriptor on the
primary queue.  This means that there is no skipping of descriptors
when receiving small packets and results in much better performance.

Signed-off-by: Nelson Escobar 
Reviewed-by: John Daley 
---

v2:
 - fixes upstream checkpatch complaint
 - fixes bug where packet type and flags were set on last mbuf
   instead of first mbuf of scattered receive
 - adds ethernet hdr length to mtu when calculating the number of
   mbufs it would take to receive maximum sized packet

 doc/guides/nics/overview.rst |   2 +-
 drivers/net/enic/base/rq_enet_desc.h |   2 +-
 drivers/net/enic/base/vnic_rq.c  |   8 +-
 drivers/net/enic/base/vnic_rq.h  |  18 ++-
 drivers/net/enic/enic.h  |  22 ++-
 drivers/net/enic/enic_ethdev.c   |  10 +-
 drivers/net/enic/enic_main.c | 277 +++
 drivers/net/enic/enic_res.c  |   5 +-
 drivers/net/enic/enic_rxtx.c | 140 --
 9 files changed, 361 insertions(+), 123 deletions(-)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 2200171..d0ae847 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -94,7 +94,7 @@ Most of these differences are summarized below.
Queue start/stop Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y
   Y   Y Y
MTU update   Y Y Y   Y   Y Y Y Y Y Y
Jumbo frame  Y Y Y Y Y Y Y Y Y   Y Y Y Y Y Y Y Y Y Y   
Y Y Y
-   Scattered Rx Y Y Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
   Y   Y
+   Scattered Rx Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
   Y   Y
LRO  Y Y Y Y
TSO  Y   Y   Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Promiscuous mode   Y Y   Y Y   Y Y Y Y Y Y Y Y Y Y Y Y Y
 Y Y   Y   Y Y
diff --git a/drivers/net/enic/base/rq_enet_desc.h 
b/drivers/net/enic/base/rq_enet_desc.h
index 7292d9d..13e24b4 100644
--- a/drivers/net/enic/base/rq_enet_desc.h
+++ b/drivers/net/enic/base/rq_enet_desc.h
@@ -55,7 +55,7 @@ enum rq_enet_type_types {
 #define RQ_ENET_TYPE_BITS  2
 #define RQ_ENET_TYPE_MASK  ((1 << RQ_ENET_TYPE_BITS) - 1)

-static inline void rq_enet_desc_enc(struct rq_enet_desc *desc,
+static inline void rq_enet_desc_enc(volatile struct rq_enet_desc *desc,
u64 address, u8 type, u16 length)
 {
desc->address = cpu_to_le64(address);
diff --git a/drivers/net/enic/base/vnic_rq.c b/drivers/net/enic/base/vnic_rq.c
index cb62c5e..0e700a1 100644
--- a/drivers/net/enic/base/vnic_rq.c
+++ b/drivers/net/enic/base/vnic_rq.c
@@ -84,11 +84,12 @@ void vnic_rq_init_start(struct vnic_rq *rq, unsigned int 
cq_index,
	iowrite32(cq_index, &rq->ctrl->cq_index);
	iowrite32(error_interrupt_enable, &rq->ctrl->error_interrupt_enable);
	iowrite32(error_interrupt_offset, &rq->ctrl->error_interrupt_offset);
-	iowrite32(0, &rq->ctrl->dropped_packet_count);
	iowrite32(0, &rq->ctrl->error_status);
	iowrite32(fetch_index, &rq->ctrl->fetch_index);
	iowrite32(posted_index, &rq->ctrl->posted_index);
-
+	if (rq->is_sop)
+		iowrite32(((rq->is_sop << 10) | rq->data_queue_idx),
+			  &rq->ctrl->data_ring);
 }

 void vnic_rq_init(struct vnic_rq *rq, unsigned int cq_index,
@@ -96,6 +97,7 @@ void vnic_rq_init(struct vnic_rq *rq, unsigned int cq_index,
unsigned int error_interrupt_offset)
 {
u32 fetch_index = 0;
+
/* Use current fetch_index as the ring starting point */
fetch_index = ioread32(&rq->ctrl->fetch_index);

@@ -110,6 +112,8 @@ void vnic_rq_init(struct vnic_rq *rq, unsigned int cq_index,
error_interrupt_offset);
rq->rxst_idx = 0;
rq->tot_pkts = 0;
+   rq->pkt_first_seg = NULL;
+   rq->pkt_last_seg = NULL;
 }

 void vnic_rq_error_out(struct vnic_rq *rq, unsigned int error)
diff --git a/drivers/net/enic/base/vnic_rq.h b/drivers/net/enic/base/vnic_rq.h
index e083ccc..fd9e170 100644
--- a/drivers/net/enic/base/vnic_rq.h
+++ b/drivers/net/enic/base/vnic_rq.h
@@ -60,10 +60,18 @@ struct vnic_rq_ctrl {
u32 pad7;
u32 error_status;   /* 0x48 */
u32 pad8;
-   u32 dropped_packet_count;   /* 0x50 */
+   u32 tcp_sn; /* 0x50 */
u32 pad9;
-   u32 dropped_packet_count_rc;/* 0x58 */
+   

[dpdk-dev] [PATCH v2 2/2] vhost: unmap log memory on cleanup.

2016-06-16 Thread Ilya Maximets
Fixes memory leak on QEMU migration.

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Signed-off-by: Ilya Maximets 
---
 lib/librte_vhost/vhost-net.h  |  1 +
 lib/librte_vhost/vhost_user/virtio-net-user.c | 15 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index ec8f964..38593a2 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -134,6 +134,7 @@ struct virtio_net {
charifname[IF_NAME_SZ];
uint64_tlog_size;
uint64_tlog_base;
+   uint64_tlog_addr;
struct ether_addr   mac;

 } __rte_cache_aligned;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index e6a2aed..a867a43 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -95,6 +95,10 @@ vhost_backend_cleanup(struct virtio_net *dev)
free(dev->mem);
dev->mem = NULL;
}
+   if (dev->log_addr) {
+   munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+   dev->log_addr = 0;
+   }
 }

 int
@@ -407,8 +411,15 @@ user_set_log_base(int vid, struct VhostUserMsg *msg)
return -1;
}

-   /* TODO: unmap on stop */
-   dev->log_base = (uint64_t)(uintptr_t)addr + off;
+   /*
+* Free the previously mapped log memory if
+* VHOST_USER_SET_LOG_BASE arrives more than once.
+*/
+   if (dev->log_addr) {
+   munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+   }
+   dev->log_addr = (uint64_t)(uintptr_t)addr;
+   dev->log_base = dev->log_addr + off;
dev->log_size = size;

return 0;
-- 
2.7.4



[dpdk-dev] [PATCH v2 1/2] vhost: fix leak of file descriptors.

2016-06-16 Thread Ilya Maximets
During migration of a vhost-user device, QEMU allocates a memfd
to store information about dirty pages and sends the fd to the
vhost-user process.

The file descriptor for this memory should be closed to prevent
"Too many open files" errors in the vhost-user process after
a number of migrations.

Ex.:
 # ls /proc//fd/ -alh
 total 0
 root qemu  .
 root qemu  ..
 root qemu  0 -> /dev/pts/0
 root qemu  1 -> pipe:[1804353]
 root qemu  10 -> socket:[1782240]
 root qemu  100 -> /memfd:vhost-log (deleted)
 root qemu  1000 -> /memfd:vhost-log (deleted)
 root qemu  1001 -> /memfd:vhost-log (deleted)
 root qemu  1004 -> /memfd:vhost-log (deleted)
 [...]
 root qemu  996 -> /memfd:vhost-log (deleted)
 root qemu  997 -> /memfd:vhost-log (deleted)

 ovs-vswitchd.log:
 |WARN|punix:ovs-vswitchd.ctl: accept failed: Too many open files

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Signed-off-by: Ilya Maximets 
---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index 64a6ec4..e6a2aed 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -401,6 +401,7 @@ user_set_log_base(int vid, struct VhostUserMsg *msg)
 * fail when offset is not page size aligned.
 */
addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+   close(fd);
if (addr == MAP_FAILED) {
RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
return -1;
-- 
2.7.4
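The one-line fix works because of a POSIX guarantee: once mmap() has established a mapping, closing the file descriptor does not tear it down; only munmap() releases it. A small stand-alone sketch of the idiom (names are illustrative, not the vhost code):

```c
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `size` bytes of `fd` and immediately close the descriptor.
 * POSIX guarantees the mapping stays valid after close(); munmap(),
 * not close(), is what releases it. This is why the patch can add
 * close(fd) right after mmap() without breaking dirty-page logging. */
static void *map_and_close(int fd, size_t size)
{
	void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_SHARED, fd, 0);
	close(fd);  /* safe even if mmap failed; prevents the fd leak */
	return addr == MAP_FAILED ? NULL : addr;
}
```

Without the close(), each VHOST_USER_SET_LOG_BASE message leaks one descriptor, which is exactly the /memfd:vhost-log pile-up shown in the ls output above.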



[dpdk-dev] [PATCH v2 0/2] vhost: Fix leaks on migration.

2016-06-16 Thread Ilya Maximets
v2:
* rebased on top of dpdk-next-virtio/master

Ilya Maximets (2):
  vhost: fix leak of file descriptors.
  vhost: unmap log memory on cleanup.

 lib/librte_vhost/vhost-net.h  |  1 +
 lib/librte_vhost/vhost_user/virtio-net-user.c | 16 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH v2] rte_hash: add scalable multi-writer insertion w/ Intel TSX

2016-06-16 Thread Ananyev, Konstantin
Hi Wei,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wei Shen
> Sent: Thursday, June 16, 2016 5:53 AM
> To: dev at dpdk.org
> Cc: De Lara Guarch, Pablo; stephen at networkplumber.org; Tai, Charlie; 
> Maciocco, Christian; Gobriel, Sameh; Shen, Wei1
> Subject: [dpdk-dev] [PATCH v2] rte_hash: add scalable multi-writer insertion 
> w/ Intel TSX
> 
> This patch introduced scalable multi-writer Cuckoo Hash insertion
> based on a split Cuckoo Search and Move operation using Intel
> TSX. It can do scalable hash insertion with 22 cores with little
> performance loss and a negligible TSX abort rate.
> 
> * Added an extra rte_hash flag definition to switch default
>   single writer Cuckoo Hash behavior to multiwriter.
> 
> * Added a make_space_insert_bfs_mw() function to do split Cuckoo
>   search in BFS order.
> 
> * Added tsx_cuckoo_move_insert() to do Cuckoo move in Intel TSX
>   protected manner.
> 
> * Added test_hash_multiwriter() as test case for multi-writer
>   Cuckoo Hash.
> 
> Signed-off-by: Shen Wei 
> Signed-off-by: Sameh Gobriel 
> ---
>  app/test/Makefile  |   1 +
>  app/test/test_hash_multiwriter.c   | 272 
> +
>  doc/guides/rel_notes/release_16_07.rst |  12 ++
>  lib/librte_hash/rte_cuckoo_hash.c  | 231 +---
>  lib/librte_hash/rte_hash.h |   3 +
>  5 files changed, 494 insertions(+), 25 deletions(-)
>  create mode 100644 app/test/test_hash_multiwriter.c
> 
> diff --git a/app/test/Makefile b/app/test/Makefile
> index 053f3a2..5476300 100644
> --- a/app/test/Makefile
> +++ b/app/test/Makefile
> @@ -120,6 +120,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_thash.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_perf.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_functions.c
>  SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_scaling.c
> +SRCS-$(CONFIG_RTE_LIBRTE_HASH) += test_hash_multiwriter.c
> 
>  SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm.c
>  SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
> diff --git a/app/test/test_hash_multiwriter.c 
> b/app/test/test_hash_multiwriter.c
> new file mode 100644
> index 000..54a0d2c
> --- /dev/null
> +++ b/app/test/test_hash_multiwriter.c
> @@ -0,0 +1,272 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in
> + *the documentation and/or other materials provided with the
> + *distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *contributors may be used to endorse or promote products derived
> + *from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "test.h"
> +
> +/*
> + * Check condition and return an error if true. Assumes that "handle" is the
> + * name of the hash structure pointer to be freed.
> + */
> +#define RETURN_IF_ERROR(cond, str, ...) do {\
> + if (cond) { \
> + printf("ERROR line %d: " str "\n", __LINE__,\
> + ##__VA_ARGS__); \
> + if (handle) \
> + rte_hash_free(handle);  \
> + return -1;  \
> + }   \
> +} while (0)
> +
> +#define 
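The core idea of the patch is the split named in the commit message: run the expensive BFS cuckoo-path search outside any critical section, then make only the short chain of slot moves atomic (an Intel TSX transaction in the patch; a toy spinlock stands in below). The sketch is illustrative only, with hypothetical types, and is not the rte_hash implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative types, not DPDK's: each move shifts a key one bucket
 * along a precomputed cuckoo path; path[n-1] ends at an empty slot. */
struct slot { int key; };                     /* key == 0 means empty */
struct move { struct slot *from, *to; int expected; };

static atomic_flag guard = ATOMIC_FLAG_INIT;  /* stand-in for _xbegin/_xend */

static bool apply_cuckoo_path(struct move *path, size_t n, int new_key)
{
	while (atomic_flag_test_and_set(&guard))  /* "begin transaction" */
		;
	/* The search ran unguarded, so validate the path first; a changed
	 * slot means a concurrent writer raced us and the caller must redo
	 * the search - the same effect as a TSX transaction aborting. */
	for (size_t i = 0; i < n; i++) {
		if (path[i].from->key != path[i].expected) {
			atomic_flag_clear(&guard);
			return false;
		}
	}
	/* Apply moves deepest-first so no key is overwritten. */
	for (size_t i = n; i-- > 0; )
		path[i].to->key = path[i].from->key;
	path[0].from->key = new_key;              /* freed head slot */
	atomic_flag_clear(&guard);                /* "commit" */
	return true;
}
```

Keeping the guarded region down to validate-and-move is what makes the approach scale: the transaction (or lock) is held for a handful of stores rather than for the whole BFS.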

[dpdk-dev] [PATCH v3 3/3] mempool: allow for user-owned mempool caches

2016-06-16 Thread Lazaros Koromilas
The mempool cache is only available to EAL threads as a per-lcore
resource. Change this so that the user can create and provide their own
cache on mempool get and put operations. This works with non-EAL threads
too. This commit introduces the new API calls:

rte_mempool_cache_create(size, socket_id)
rte_mempool_cache_free(cache)
rte_mempool_cache_flush(cache, mp)
rte_mempool_default_cache(mp, lcore_id)

Changes the API calls:

rte_mempool_generic_put(mp, obj_table, n, cache, flags)
rte_mempool_generic_get(mp, obj_table, n, cache, flags)

The cache-oblivious API calls use the per-lcore default local cache.

Signed-off-by: Lazaros Koromilas 
---
 app/test/test_mempool.c  |  94 --
 app/test/test_mempool_perf.c |  70 ++---
 lib/librte_mempool/rte_mempool.c |  66 +++-
 lib/librte_mempool/rte_mempool.h | 163 ---
 4 files changed, 310 insertions(+), 83 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 10d706f..723cd39 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -79,6 +79,9 @@
printf("test failed at %s():%d\n", __func__, __LINE__); \
return -1;  \
} while (0)
+#define LOG_ERR() do { \
+   printf("test failed at %s():%d\n", __func__, __LINE__); \
+   } while (0)

 static rte_atomic32_t synchro;

@@ -191,7 +194,7 @@ my_obj_init(struct rte_mempool *mp, __attribute__((unused)) 
void *arg,

 /* basic tests (done on one core) */
 static int
-test_mempool_basic(struct rte_mempool *mp)
+test_mempool_basic(struct rte_mempool *mp, int use_external_cache)
 {
uint32_t *objnum;
void **objtable;
@@ -199,47 +202,79 @@ test_mempool_basic(struct rte_mempool *mp)
char *obj_data;
int ret = 0;
unsigned i, j;
+   int offset;
+   struct rte_mempool_cache *cache;
+
+   if (use_external_cache) {
+   /* Create a user-owned mempool cache. */
+   cache = rte_mempool_cache_create(RTE_MEMPOOL_CACHE_MAX_SIZE,
+SOCKET_ID_ANY);
+   if (cache == NULL)
+   RET_ERR();
+   } else {
+   /* May be NULL if cache is disabled. */
+   cache = rte_mempool_default_cache(mp, rte_lcore_id());
+   }

/* dump the mempool status */
rte_mempool_dump(stdout, mp);

printf("get an object\n");
-   if (rte_mempool_get(mp, &obj) < 0)
-   RET_ERR();
+   if (rte_mempool_generic_get(mp, &obj, 1, cache, 0) < 0) {
+   LOG_ERR();
+   ret = -1;
+   goto out;
+   }
rte_mempool_dump(stdout, mp);

/* tests that improve coverage */
printf("get object count\n");
-   if (rte_mempool_count(mp) != MEMPOOL_SIZE - 1)
-   RET_ERR();
+   /* We have to count the extra caches, one in this case. */
+   offset = use_external_cache ? 1 * cache->len : 0;
+   if (rte_mempool_count(mp) + offset != MEMPOOL_SIZE - 1) {
+   LOG_ERR();
+   ret = -1;
+   goto out;
+   }

printf("get private data\n");
if (rte_mempool_get_priv(mp) != (char *)mp +
-   MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
-   RET_ERR();
+   MEMPOOL_HEADER_SIZE(mp, mp->cache_size)) {
+   LOG_ERR();
+   ret = -1;
+   goto out;
+   }

 #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
printf("get physical address of an object\n");
-   if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
-   RET_ERR();
+   if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj)) {
+   LOG_ERR();
+   ret = -1;
+   goto out;
+   }
 #endif

printf("put the object back\n");
-   rte_mempool_put(mp, obj);
+   rte_mempool_generic_put(mp, &obj, 1, cache, 0);
rte_mempool_dump(stdout, mp);

printf("get 2 objects\n");
-   if (rte_mempool_get(mp, &obj) < 0)
-   RET_ERR();
-   if (rte_mempool_get(mp, &obj2) < 0) {
-   rte_mempool_put(mp, obj);
-   RET_ERR();
+   if (rte_mempool_generic_get(mp, &obj, 1, cache, 0) < 0) {
+   LOG_ERR();
+   ret = -1;
+   goto out;
+   }
+   if (rte_mempool_generic_get(mp, &obj2, 1, cache, 0) < 0) {
+   rte_mempool_generic_put(mp, &obj, 1, cache, 0);
+   LOG_ERR();
+   ret = -1;
+   goto out;
}
rte_mempool_dump(stdout, mp);

printf("put the objects back\n");
-   rte_mempool_put(mp, obj);
-   rte_mempool_put(mp, obj2);
+   rte_mempool_generic_put(mp, &obj, 1, cache, 0);
+   
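For readers following the API change, here is a deliberately tiny model, in plain C rather than the DPDK API, of why a user-owned cache helps: each thread keeps its own small array that batches gets and puts against a shared pool, which is what a non-EAL thread gains from rte_mempool_cache_create() instead of depending on a per-lcore slot it does not have. All names and sizes below are invented for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SZ  64
#define CACHE_SZ 8

struct pool  { void *objs[POOL_SZ];  size_t len; };  /* shared backing store */
struct cache { void *objs[CACHE_SZ]; size_t len; };  /* per-user, no sharing */

static bool cached_get(struct pool *p, struct cache *c, void **obj)
{
	if (c->len == 0) {                       /* refill from shared pool */
		while (c->len < CACHE_SZ && p->len > 0)
			c->objs[c->len++] = p->objs[--p->len];
		if (c->len == 0)
			return false;            /* pool exhausted */
	}
	*obj = c->objs[--c->len];                /* fast path: no pool access */
	return true;
}

static void cached_put(struct pool *p, struct cache *c, void *obj)
{
	if (c->len == CACHE_SZ)                  /* flush on overflow */
		while (c->len > 0 && p->len < POOL_SZ)
			p->objs[p->len++] = c->objs[--c->len];
	c->objs[c->len++] = obj;
}
```

The test changes in the patch (counting `cache->len` into the expected object count, flushing before comparing) follow directly from this model: objects sitting in a user-owned cache are invisible to the pool's own count.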

[dpdk-dev] [PATCH v3 2/3] mempool: use bit flags instead of is_mp and is_mc

2016-06-16 Thread Lazaros Koromilas
Pass the same flags as in rte_mempool_create().  Changes API calls:

rte_mempool_generic_put(mp, obj_table, n, flags)
rte_mempool_generic_get(mp, obj_table, n, flags)

Signed-off-by: Lazaros Koromilas 
---
 lib/librte_mempool/rte_mempool.h | 58 +---
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7446843..191edba 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -949,12 +949,13 @@ void rte_mempool_dump(FILE *f, struct rte_mempool *mp);
  * @param n
  *   The number of objects to store back in the mempool, must be strictly
  *   positive.
- * @param is_mp
- *   Mono-producer (0) or multi-producers (1).
+ * @param flags
+ *   The flags used for the mempool creation.
+ *   Single-producer (MEMPOOL_F_SP_PUT flag) or multi-producers.
  */
 static inline void __attribute__((always_inline))
 __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
- unsigned n, int is_mp)
+ unsigned n, int flags)
 {
struct rte_mempool_cache *cache;
uint32_t index;
@@ -967,7 +968,7 @@ __mempool_generic_put(struct rte_mempool *mp, void * const 
*obj_table,
__MEMPOOL_STAT_ADD(mp, put, n);

/* cache is not enabled or single producer or non-EAL thread */
-   if (unlikely(cache_size == 0 || is_mp == 0 ||
+   if (unlikely(cache_size == 0 || flags & MEMPOOL_F_SP_PUT ||
 lcore_id >= RTE_MAX_LCORE))
goto ring_enqueue;

@@ -1020,15 +1021,16 @@ ring_enqueue:
  *   A pointer to a table of void * pointers (objects).
  * @param n
  *   The number of objects to add in the mempool from the obj_table.
- * @param is_mp
- *   Mono-producer (0) or multi-producers (1).
+ * @param flags
+ *   The flags used for the mempool creation.
+ *   Single-producer (MEMPOOL_F_SP_PUT flag) or multi-producers.
  */
 static inline void __attribute__((always_inline))
 rte_mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
-   unsigned n, int is_mp)
+   unsigned n, int flags)
 {
__mempool_check_cookies(mp, obj_table, n, 0);
-   __mempool_generic_put(mp, obj_table, n, is_mp);
+   __mempool_generic_put(mp, obj_table, n, flags);
 }

 /**
@@ -1046,7 +1048,7 @@ __rte_deprecated static inline void 
__attribute__((always_inline))
 rte_mempool_mp_put_bulk(struct rte_mempool *mp, void * const *obj_table,
unsigned n)
 {
-   rte_mempool_generic_put(mp, obj_table, n, 1);
+   rte_mempool_generic_put(mp, obj_table, n, 0);
 }

 /**
@@ -1064,7 +1066,7 @@ __rte_deprecated static inline void 
__attribute__((always_inline))
 rte_mempool_sp_put_bulk(struct rte_mempool *mp, void * const *obj_table,
unsigned n)
 {
-   rte_mempool_generic_put(mp, obj_table, n, 0);
+   rte_mempool_generic_put(mp, obj_table, n, MEMPOOL_F_SP_PUT);
 }

 /**
@@ -1085,8 +1087,7 @@ static inline void __attribute__((always_inline))
 rte_mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
 unsigned n)
 {
-   rte_mempool_generic_put(mp, obj_table, n,
-   !(mp->flags & MEMPOOL_F_SP_PUT));
+   rte_mempool_generic_put(mp, obj_table, n, mp->flags);
 }

 /**
@@ -1101,7 +1102,7 @@ rte_mempool_put_bulk(struct rte_mempool *mp, void * const 
*obj_table,
 __rte_deprecated static inline void __attribute__((always_inline))
 rte_mempool_mp_put(struct rte_mempool *mp, void *obj)
 {
-   rte_mempool_generic_put(mp, &obj, 1, 1);
+   rte_mempool_generic_put(mp, &obj, 1, 0);
 }

 /**
@@ -1116,7 +1117,7 @@ rte_mempool_mp_put(struct rte_mempool *mp, void *obj)
 __rte_deprecated static inline void __attribute__((always_inline))
 rte_mempool_sp_put(struct rte_mempool *mp, void *obj)
 {
-   rte_mempool_generic_put(mp, &obj, 1, 0);
+   rte_mempool_generic_put(mp, &obj, 1, MEMPOOL_F_SP_PUT);
 }

 /**
@@ -1145,15 +1146,16 @@ rte_mempool_put(struct rte_mempool *mp, void *obj)
  *   A pointer to a table of void * pointers (objects).
  * @param n
  *   The number of objects to get, must be strictly positive.
- * @param is_mc
- *   Mono-consumer (0) or multi-consumers (1).
+ * @param flags
+ *   The flags used for the mempool creation.
+ *   Single-consumer (MEMPOOL_F_SC_GET flag) or multi-consumers.
  * @return
  *   - >=0: Success; number of objects supplied.
  *   - <0: Error; code of ring dequeue function.
  */
 static inline int __attribute__((always_inline))
 __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
- unsigned n, int is_mc)
+ unsigned n, int flags)
 {
int ret;
struct rte_mempool_cache *cache;
@@ -1163,7 +1165,7 @@ __mempool_generic_get(struct rte_mempool *mp, void 
**obj_table,
uint32_t cache_size = mp->cache_size;

/* cache is not enabled or 

[dpdk-dev] [PATCH v3 1/3] mempool: deprecate specific get/put functions

2016-06-16 Thread Lazaros Koromilas
This commit introduces the API calls:

rte_mempool_generic_put(mp, obj_table, n, is_mp)
rte_mempool_generic_get(mp, obj_table, n, is_mc)

Deprecates the API calls:

rte_mempool_mp_put_bulk(mp, obj_table, n)
rte_mempool_sp_put_bulk(mp, obj_table, n)
rte_mempool_mp_put(mp, obj)
rte_mempool_sp_put(mp, obj)
rte_mempool_mc_get_bulk(mp, obj_table, n)
rte_mempool_sc_get_bulk(mp, obj_table, n)
rte_mempool_mc_get(mp, obj_p)
rte_mempool_sc_get(mp, obj_p)

We also check cookies in one place now.

Signed-off-by: Lazaros Koromilas 
---
 app/test/test_mempool.c  |  10 ++--
 lib/librte_mempool/rte_mempool.h | 115 +++
 2 files changed, 85 insertions(+), 40 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index bcf379b..10d706f 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -338,7 +338,7 @@ static int test_mempool_single_producer(void)
printf("obj not owned by this mempool\n");
RET_ERR();
}
-   rte_mempool_sp_put(mp_spsc, obj);
+   rte_mempool_put(mp_spsc, obj);
rte_spinlock_lock(&scsp_spinlock);
scsp_obj_table[i] = NULL;
rte_spinlock_unlock(&scsp_spinlock);
@@ -371,7 +371,7 @@ static int test_mempool_single_consumer(void)
rte_spinlock_unlock(&scsp_spinlock);
if (i >= MAX_KEEP)
continue;
-   if (rte_mempool_sc_get(mp_spsc, &obj) < 0)
+   if (rte_mempool_get(mp_spsc, &obj) < 0)
break;
rte_spinlock_lock(&scsp_spinlock);
scsp_obj_table[i] = obj;
@@ -477,13 +477,13 @@ test_mempool_basic_ex(struct rte_mempool *mp)
}

for (i = 0; i < MEMPOOL_SIZE; i ++) {
-   if (rte_mempool_mc_get(mp, &obj[i]) < 0) {
+   if (rte_mempool_get(mp, &obj[i]) < 0) {
printf("test_mp_basic_ex fail to get object for [%u]\n",
i);
goto fail_mp_basic_ex;
}
}
-   if (rte_mempool_mc_get(mp, &err_obj) == 0) {
+   if (rte_mempool_get(mp, &err_obj) == 0) {
printf("test_mempool_basic_ex get an impossible obj\n");
goto fail_mp_basic_ex;
}
@@ -494,7 +494,7 @@ test_mempool_basic_ex(struct rte_mempool *mp)
}

for (i = 0; i < MEMPOOL_SIZE; i++)
-   rte_mempool_mp_put(mp, obj[i]);
+   rte_mempool_put(mp, obj[i]);

if (rte_mempool_full(mp) != 1) {
printf("test_mempool_basic_ex the mempool should be full\n");
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 92deb42..7446843 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -953,8 +953,8 @@ void rte_mempool_dump(FILE *f, struct rte_mempool *mp);
  *   Mono-producer (0) or multi-producers (1).
  */
 static inline void __attribute__((always_inline))
-__mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
-   unsigned n, int is_mp)
+__mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
+ unsigned n, int is_mp)
 {
struct rte_mempool_cache *cache;
uint32_t index;
@@ -1012,7 +1012,7 @@ ring_enqueue:


 /**
- * Put several objects back in the mempool (multi-producers safe).
+ * Put several objects back in the mempool.
  *
  * @param mp
  *   A pointer to the mempool structure.
@@ -1020,16 +1020,37 @@ ring_enqueue:
  *   A pointer to a table of void * pointers (objects).
  * @param n
  *   The number of objects to add in the mempool from the obj_table.
+ * @param is_mp
+ *   Mono-producer (0) or multi-producers (1).
  */
 static inline void __attribute__((always_inline))
+rte_mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
+   unsigned n, int is_mp)
+{
+   __mempool_check_cookies(mp, obj_table, n, 0);
+   __mempool_generic_put(mp, obj_table, n, is_mp);
+}
+
+/**
+ * @deprecated
+ * Put several objects back in the mempool (multi-producers safe).
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param obj_table
+ *   A pointer to a table of void * pointers (objects).
+ * @param n
+ *   The number of objects to add in the mempool from the obj_table.
+ */
+__rte_deprecated static inline void __attribute__((always_inline))
 rte_mempool_mp_put_bulk(struct rte_mempool *mp, void * const *obj_table,
unsigned n)
 {
-   __mempool_check_cookies(mp, obj_table, n, 0);
-   __mempool_put_bulk(mp, obj_table, n, 1);
+   rte_mempool_generic_put(mp, obj_table, n, 1);
 }

 /**
+ * @deprecated
  * Put several objects back in the mempool (NOT multi-producers safe).
  *
  * @param mp
@@ -1039,12 +1060,11 @@ rte_mempool_mp_put_bulk(struct rte_mempool *mp, void * 
const *obj_table,
  * @param 

[dpdk-dev] [PATCH v3 0/3] mempool: user-owned mempool caches

2016-06-16 Thread Lazaros Koromilas
Updated version of the user-owned cache patchset.  It applies on top of
the latest external mempool manager patches from David Hunt [1].

[1] http://dpdk.org/ml/archives/dev/2016-June/041479.html

v3 changes:

 * Deprecate specific mempool API calls instead of removing them.
 * Split deprecation into a separate commit to limit noise.
 * Fix cache flush by setting cache->len = 0 and make it inline.
 * Remove cache->size == 0 checks and ensure size != 0 at creation.
 * Fix tests to check if cache creation succeeded.
 * Fix tests to free allocated resources on error.

The mempool cache is only available to EAL threads as a per-lcore
resource. Change this so that the user can create and provide their own
cache on mempool get and put operations. This works with non-EAL threads
too.

Also, deprecate the explicit {mp,sp}_put and {mc,sc}_get calls and
re-route them through the new generic calls. Minor cleanup to pass the
mempool bit flags instead of using specific is_mp and is_mc. The old
cache-oblivious API calls use the per-lcore default local cache. The
mempool and mempool_perf tests are also updated to handle the
user-owned cache case.

Introduced API calls:

rte_mempool_cache_create(size, socket_id)
rte_mempool_cache_free(cache)
rte_mempool_cache_flush(cache, mp)
rte_mempool_default_cache(mp, lcore_id)

rte_mempool_generic_put(mp, obj_table, n, cache, flags)
rte_mempool_generic_get(mp, obj_table, n, cache, flags)

Deprecated API calls:

rte_mempool_mp_put_bulk(mp, obj_table, n)
rte_mempool_sp_put_bulk(mp, obj_table, n)
rte_mempool_mp_put(mp, obj)
rte_mempool_sp_put(mp, obj)
rte_mempool_mc_get_bulk(mp, obj_table, n)
rte_mempool_sc_get_bulk(mp, obj_table, n)
rte_mempool_mc_get(mp, obj_p)
rte_mempool_sc_get(mp, obj_p)

Lazaros Koromilas (3):
  mempool: deprecate specific get/put functions
  mempool: use bit flags instead of is_mp and is_mc
  mempool: allow for user-owned mempool caches

 app/test/test_mempool.c  | 104 +++-
 app/test/test_mempool_perf.c |  70 +--
 lib/librte_mempool/rte_mempool.c |  66 +-
 lib/librte_mempool/rte_mempool.h | 256 +--
 4 files changed, 385 insertions(+), 111 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH v5 00/25] DPDK PMD for ThunderX NIC device

2016-06-16 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 03:01:02PM +0530, Jerin Jacob wrote:
> On Wed, Jun 15, 2016 at 03:39:25PM +0100, Bruce Richardson wrote:
> > On Wed, Jun 15, 2016 at 12:36:15AM +0530, Jerin Jacob wrote:
> > > This patch set provides the initial version of DPDK PMD for the
> > > built-in NIC device in Cavium ThunderX SoC family.
> > > 
> > > Implemented features and ThunderX nicvf PMD documentation added
> > > in doc/guides/nics/overview.rst and doc/guides/nics/thunderx.rst
> > > respectively in this patch set.
> > > 
> > > These patches are checked using checkpatch.sh with following
> > > additional ignore option:
> > > options="$options --ignore=CAMELCASE,BRACKET_SPACE"
> > > CAMELCASE - To accommodate PRIx64
> > > BRACKET_SPACE - To accommodate AT inline line assembly in two places
> > > 
> > > This patch set is based on DPDK 16.07-RC1
> > > and tested with git HEAD change-set
> > > ca173a909538a2f1082cd0dcb4d778a97dab69c3 along with
> > > following depended patch
> > > 
> > > http://dpdk.org/dev/patchwork/patch/11826/
> > > ethdev: add tunnel and port RSS offload types
> > > 
> > Hi Jerin,
> > 
> > hopefully a final set of comments before merge on this set, as it's looking
> > very good now.
> > 
> > * Two patches look like they need to be split, as they are combining 
> > multiple
> >   functions into one patch. They are:
> > [dpdk-dev,v5,16/25] net/thunderx: add MTU set and promiscuous enable 
> > support
> > [dpdk-dev,v5,20/25] net/thunderx: implement supported ptype get and Rx 
> > queue count
> >   For the other patches which add multiple functions, the functions seem to 
> > be
> >   logically related so I don't think there is a problem
> > 
> > * check-git-logs.sh is warning about a few of the commit messages being too 
> > long.
> >   Splitting patch 20 should fix one of those, but there are a few remaining.
> >   A number of titles refer to ThunderX in the message, but this is probably
> >   unnecessary, as the prefix already contains "net/thunderx" in it.
> 
> OK. I will send the next revision.
> 

Please hold off a few hours, as I'm hoping to merge in the bnxt driver this
afternoon. If all goes well, I would appreciate it if you could base your 
patchset
off the rel_16_07 tree with that set applied - save me having to resolve 
conflicts
in files like the nic overview doc, which is always a pain to try and edit. :-)

Regards,
/Bruce


[dpdk-dev] [PATCH] hash: new function to retrieve a key given its position

2016-06-16 Thread Bruce Richardson
On Thu, Jun 16, 2016 at 10:23:42AM +, Juan Antonio Montesinos Delgado wrote:
> Hi,
> 
> As I understand it, the hash table entry can change position in the first 
> hash table but the index in the second hash table remains the same. So, 
> regardless of the bucket the entry is in, the index (of the second hash table) 
> stored in that entry will be the same. Am I right?
> 
> Best,
> 
> Juan Antonio
> 

Ah, yes, you are right. The key data should not move, only the hash value. I'd
forgotten that.

/Bruce

> -Original Message-
> From: Bruce Richardson [mailto:bruce.richardson at intel.com] 
> Sent: jueves, 16 de junio de 2016 11:50
> To: Yari Adan PETRALANDA 
> Cc: pablo.de.lara.guarch at intel.com; Juan Antonio Montesinos Delgado 
> ; dev at dpdk.org
> Subject: Re: [PATCH] hash: new function to retrieve a key given its position
> 
> On Thu, Jun 16, 2016 at 10:22:30AM +0200, Yari Adan Petralanda wrote:
> > The function rte_hash_get_key_with_position is added in this patch.
> > As the position returned when adding a key is frequently used as an 
> > offset into an array of user data, this function performs the 
> > operation of retrieving a key given this offset.
> > 
> > A possible use case would be to delete a key from the hash table when 
> > its entry in the array of data has certain value. For instance, the 
> > key could be a flow 5-tuple, and the value stored in the array a time stamp.
> > 
> 
> I have my doubts that this will work. With cuckoo hashing, a hash table entry 
> can change position multiple times after it is added, as the table is 
> reorganised to make room for new entries.
> 
> Regards,
> /Bruce
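Bruce's conclusion, that cuckoo displacement moves the bucket entry but never the key's slot in the key store, can be illustrated with a toy model (invented names, not the rte_hash code): buckets hold indices into a flat key store, and displacement only moves the index.

```c
#include <stdint.h>

#define KEYS 8
static uint32_t key_store[KEYS];               /* "second table": stable slots */
static int32_t bucket_a[KEYS], bucket_b[KEYS]; /* buckets hold key-store indices */

/* Insert: the returned position is what users index their data array with. */
static int32_t add_key(int32_t pos, uint32_t key, int32_t *bucket, int slot)
{
	key_store[pos] = key;
	bucket[slot] = pos;     /* the bucket only references the stable position */
	return pos;
}

/* Cuckoo displacement: the index hops between buckets, the key does not move,
 * so a rte_hash_get_key_with_position()-style lookup stays valid. */
static void displace(int32_t *from, int f, int32_t *to, int t)
{
	to[t] = from[f];
	from[f] = -1;
}
```

This is why indexing per-flow user data (time stamps, counters) by the returned position is safe even while the table reorganises itself.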
> 


[dpdk-dev] [PATCH v1 02/28] eal: extract function eal_parse_sysfs_valuef

2016-06-16 Thread Shreyansh Jain
Sorry, didn't notice this email earlier...
Comments inline

> -Original Message-
> From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> Sent: Wednesday, June 15, 2016 3:26 PM
> To: Shreyansh Jain 
> Cc: dev at dpdk.org; David Marchand ; Thomas 
> Monjalon
> ; Bruce Richardson  intel.com>;
> Declan Doherty ; jianbo.liu at linaro.org;
> jerin.jacob at caviumnetworks.com; Keith Wiles ; 
> Stephen
> Hemminger 
> Subject: Re: [dpdk-dev] [PATCH v1 02/28] eal: extract function
> eal_parse_sysfs_valuef
> 
> On Tue, 14 Jun 2016 04:30:57 +
> Shreyansh Jain  wrote:
> 
> > Hi Jan,
> >

[...]


> > > >
> > > > I almost skipped the '..f' in the name and wondered how two functions
> > > having same name exist :D
> > >
> > > I agree that a better name would be nice here. This convention was based
> on
> > > the libc naming
> > > (fopen, fclose) but the "f" letter could not be at the beginning.
> > >
> > > What about one of those?
> > >
> > > * eal_parse_sysfs_fd_value
> > > * eal_parse_sysfs_file_value
> >
> > I don't have any better idea than above.
> >
> > Though, I still feel that 'eal_parse_sysfs_value ->
> eal_parse_sysfs_file_value' would be slightly asymmetrical - but again, this
> is highly subjective argument.
> 
> I don't see any asymmetry here. The functions equal, just the new one accepts
> a file pointer instead of a path
> and we don't have function name overloading in C.

Asymmetrical because cascading function names may be additive for easy 
reading/recall.

'eal_parse_sysfs_value ==> eal_parse_sysfs_value_ ==> 
eal_parse_sysfs_value__'

Obviously, this is not a rule - it just makes reading and recalling of cascade 
easier.
As for:

eal_parse_sysfs_value => eal_parse_sysfs_file_value

inserts an identifier in the middle of the name, making it (slightly) harder to 
correlate.

Again, as I mentioned earlier, this is subjective argument and matter of 
(personal!) choice.

> 
> >
> > Or, eal_parse_sysfs_value -> eal_parse_sysfs_value_read() may be...
> 
> I think, I'll go with eal_parse_sysfs_file_value for v2. Ideally, it should
> be
> eal_parse_sysfs_path_value and eal_parse_sysfs_file_value. Thus, this looks
> like
> a good way.
> 
> >
> > But, eal_parse_sysfs_file_value is still preferred than
> eal_parse_sysfs_fd_value, for me.
> 
> Agree.
> 
[...]

-
Shreyansh
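For reference, the split being discussed, a worker taking a FILE * plus a thin path-based wrapper around it, would look roughly like this. The parsing details are a guess; only the naming scheme follows the v2 proposal in the thread:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical FILE*-based worker: parse one unsigned value from an
 * already-open sysfs file. Mirrors the proposed eal_parse_sysfs_file_value. */
static int eal_parse_sysfs_file_value(FILE *f, unsigned long *val)
{
	char buf[64], *end = NULL;

	if (fgets(buf, sizeof(buf), f) == NULL)
		return -1;
	*val = strtoul(buf, &end, 0);
	if (buf[0] == '\0' || end == NULL || (*end != '\n' && *end != '\0'))
		return -1;                 /* not a clean number */
	return 0;
}

/* Existing path-based entry point becomes a wrapper that owns the
 * fopen()/fclose() pair and delegates the parsing. */
static int eal_parse_sysfs_value(const char *path, unsigned long *val)
{
	FILE *f = fopen(path, "r");
	int ret;

	if (f == NULL)
		return -1;
	ret = eal_parse_sysfs_file_value(f, val);
	fclose(f);
	return ret;
}
```

The wrapper shape is the point of the refactor: callers that already hold a FILE * (e.g. while scanning a directory) can skip the reopen.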


[dpdk-dev] [PATCH] igb_uio: fix build with backported kernel

2016-06-16 Thread Martinx - ジェームズ
On 15 June 2016 at 11:59, Ferruh Yigit  wrote:

> On 6/15/2016 4:57 PM, Ferruh Yigit wrote:
> > Following compile error observed with CentOS 6.8, which uses kernel
> > kernel-devel-2.6.32-642.el6.x86_64:
> >
> > CC eal_thread.o
> > .../build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
> > In function 'igbuio_msix_mask_irq':
> > .../build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> > error: 'PCI_MSIX_ENTRY_CTRL_MASKBIT' undeclared (first use in this
> > function)
> >
> > Reported-by: Thiago 
> > Signed-off-by: Ferruh Yigit 
>
> Hi Thiago,
>
> Can you please test this patch?
>
> Thanks,
> ferruh
>
>
Hi Ferruh,

That patch applied and worked (kind of):

---
[root at centos6-1 dpdk-16.04]# patch -p1 < ../dpdk-centos6.patch
patching file lib/librte_eal/linuxapp/igb_uio/compat.h
Hunk #1 succeeded at 24 with fuzz 2.
---

It got past that broken step; however, it is now failing in a different part
of the build process, as follows:

---
[root at centos6-1 ~]# time rpmbuild --ba /root/rpmbuild/SPECS/dpdk.spec
...
...
  LD librte_eal.so.2
  INSTALL-LIB librte_eal.so.2
== Build lib/librte_eal/linuxapp/kni
  LD
 
/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/built-in.o
  CC [M]
 
/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_main.o
  CC [M]
 
/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_api.o
In file included from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h:41,
 from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_type.h:31,
 from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_api.h:31,
 from
/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_api.c:28:
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/kcompat.h:
In function '__kc_vlan_get_protocol':
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/kcompat.h:2836:
error: implicit declaration of function 'vlan_tx_tag_present'
make[8]: ***
[/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_api.o]
Error 1
make[8]: *** Waiting for unfinished jobs
In file included from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_osdep.h:41,
 from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_type.h:31,
 from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_dcb.h:32,
 from
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe.h:52,
 from
/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_main.c:56:
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/kcompat.h:
In function '__kc_vlan_get_protocol':
/root/rpmbuild/BUILD/dpdk-16.04/lib/librte_eal/linuxapp/kni/ethtool/ixgbe/kcompat.h:2836:
error: implicit declaration of function 'vlan_tx_tag_present'
make[8]: ***
[/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/ixgbe_main.o]
Error 1
make[7]: ***
[_module_/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni]
Error 2
make[6]: *** [sub-make] Error 2
make[5]: *** [rte_kni.ko] Error 2
make[4]: *** [kni] Error 2
make[3]: *** [linuxapp] Error 2
make[2]: *** [librte_eal] Error 2
make[1]: *** [lib] Error 2
make: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.Naoj9c (%build)
---

Might be a totally different problem now, I don't know...   :-)

Best,
Thiago

