date:20150507

[dpdk-dev] How to use dpdk ovs

2015-05-07 Thread topperxin

Hi all,
   I'm new to DPDK.
   I want to use OVS with DPDK, and I compiled it successfully.
   My bridge configuration looks like this:
   Bridge "br0"
       Port "br0"
           Interface "br0"
               type: internal
       Port "dpdk0"
           Interface "dpdk0"
               type: dpdk
However, I don't know how to use the dpdk port. How do I make the data flow go
through dpdk0?
Can anyone tell me? Thanks a lot.

lx

[dpdk-dev] [RFC PATCH 0/2] Move PMDs out of lib directory

2015-05-07 Thread Wiles, Keith



On 5/7/15, 9:04 AM, "Bruce Richardson"  wrote:

>On Thu, May 07, 2015 at 05:45:20PM +0200, Marc Sune wrote:
>> 
>> 
>> On 07/05/15 17:35, Bruce Richardson wrote:
>> >The "lib" directory is getting very crowded, with both general libs and
>> >poll mode drivers in it. This patch set proposes to move the PMDs out of
>> >the lib folder and to put them in a separate "pmds" folder. This should
>> >help with code browse-ability as the number of libs, and pmds increases.
>> >
>> >Comments or objections?
>> >
>> >Bruce Richardson (2):
>> >   pmds: Use relative rather than absolute paths
>> >   pmds: move pmds from lib to separate pmd dir
>> >
>> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_mempool_gntalloc.c
>> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_xen_lib.c
>> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_xen_lib.h
>> >  create mode 100644 pmds/librte_pmd_xenvirt/virtio_logs.h
>> >  create mode 100644 pmds/librte_pmd_xenvirt/virtqueue.h
>> >
>> 
>> But at the end they are also libraries. What about something like:
>> 
>> * libs/core <= fundamental libraries (eal, mbuf rings...)
>> * libs/pmds <= all pmds
>> 
>> And other feature-group oriented, higher level lib directories (not sure
>> right now how to better classify them):
>> * libs/processing <= packet processing
>> * libs/utils
>> ...
>> 
>Yes, they are all just libs, so we could make "pmds" be a sub-dir of the
>lib folder. I prefer the shorter path myself, but if others want a
>multi-level hierarchy it's no big deal.

I like dpdk/pmds, as dpdk/lib/pmds is a bit longer. I also think that if
we want to move the pmds to other repo(s) in the future, it would be easier
to have the subtree at the top. To me, pmds are not really libraries in the
sense of libc or libcrypto or something along those lines.

The PMDs need to be pluggable, and they may be more like loadable modules
than libraries in the future.

>
>For the other libs, I'm not sure we need to split them up, and I also
>think that trying to divide them into categories - and deciding what those
>categories should be - could cause endless discussion. However, maybe I'm
>overly pessimistic... :-)

I agree with Bruce here; we just need the PMDs split out for now.
>
>/Bruce
>

[dpdk-dev] [RFC PATCH 6/6] virtio: Resolve for control queue

2015-05-07 Thread Ouyang Changchun

The control queue doesn't work in vhost-user multiple queue mode,
so as a workaround, return a value directly in the send_command function.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c 
b/lib/librte_pmd_virtio/virtio_ethdev.c
index 603be2d..603aaa6 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -128,6 +128,12 @@ virtio_send_command(struct virtqueue *vq, struct 
virtio_pmd_ctrl *ctrl,
return -1;
}

+   /*
+* FIXME: The control queue doesn't work for vhost-user
+* multiple queue, workaround it to return directly.
+*/
+   return 0;
+
PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, status = %d, "
"vq->hw->cvq = %p vq = %p",
vq->vq_desc_head_idx, status, vq->hw->cvq, vq);
-- 
1.8.4.2

[dpdk-dev] [RFC PATCH 5/6] vhost: Support multiple queues

2015-05-07 Thread Ouyang Changchun

The vhost sample leverages VMDq+RSS in HW to receive packets and distribute
them into different queues in the pool according to their 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer.

The number of HW queues in each pool must be exactly the same as the number
of queues in each virtio device. E.g. with rxq = 4 there are 4 HW queues in
each VMDq pool and 4 queues in each virtio device/port, and every queue in
the pool maps to one queue of its virtio device.

+-------------------+   +-------------------+
|      vport0       |   |      vport1       |
| q0 | q1 | q2 | q3 |   | q0 | q1 | q2 | q3 |
+-/\---/\---/\---/\-+   +-/\---/\---/\---/\-+
  ||   ||   ||   ||       ||   ||   ||   ||
  ||   ||   ||   ||       ||   ||   ||   ||
+-------------------+   +-------------------+
| q0 | q1 | q2 | q3 |   | q0 | q1 | q2 | q3 |
|    VMDq pool0     |   |    VMDq pool1     |
+-------------------+   +-------------------+

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue of the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either the physical port or another virtio device according
to the destination MAC address.
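
The one-to-one queue mapping described above can be sketched as follows. This
is only an illustration and not code from the patch: the helper name
hw_queue_of is hypothetical, and it assumes that each pool owns rxq
consecutive HW queues starting at the pool's first queue, which is how the
diff below indexes them (vdev->vmdq_rx_q + i).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return the HW queue that
 * backs queue q of a virtio device, assuming the device's VMDq pool owns
 * rxq consecutive HW queues starting at first_pool_queue.
 * Returns -1 if q is not a valid queue index for the device.
 */
static int
hw_queue_of(uint16_t first_pool_queue, uint16_t rxq, uint16_t q)
{
	if (q >= rxq)
		return -1;	/* no such queue in this device */
	return first_pool_queue + q;
}
```

With rxq = 4 and a pool whose first HW queue is 8, virtio queues 0..3 would
map to HW queues 8..11 under this assumption.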

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 132 ++
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 6a38b42..f70eff8 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -999,8 +999,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)

/* Enable stripping of the vlan tag as we handle routing. */
if (vlan_strip)
-   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, 1);
+   for (i = 0; i < (int)rxq; i++)
+   rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+   (uint16_t)(vdev->vmdq_rx_q + i), 1);

/* Set device as ready for RX. */
vdev->ready = DEVICE_RX;
@@ -1015,7 +1016,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-   unsigned i = 0;
+   unsigned i = 0, j = 0;
unsigned rx_count;
struct rte_mbuf *pkts_burst[MAX_PKT_BURST];

@@ -1028,15 +1029,19 @@ unlink_vmdq(struct vhost_dev *vdev)
vdev->vlan_tag = 0;

/*Clear out the receive buffers*/
-   rx_count = rte_eth_rx_burst(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, pkts_burst, 
MAX_PKT_BURST);
+   for (i = 0; i < rxq; i++) {
+   rx_count = rte_eth_rx_burst(ports[0],
+   (uint16_t)vdev->vmdq_rx_q + i,
+   pkts_burst, MAX_PKT_BURST);

-   while (rx_count) {
-   for (i = 0; i < rx_count; i++)
-   rte_pktmbuf_free(pkts_burst[i]);
+   while (rx_count) {
+   for (j = 0; j < rx_count; j++)
+   rte_pktmbuf_free(pkts_burst[j]);

-   rx_count = rte_eth_rx_burst(ports[0],
-   (uint16_t)vdev->vmdq_rx_q, pkts_burst, 
MAX_PKT_BURST);
+   rx_count = rte_eth_rx_burst(ports[0],
+   (uint16_t)vdev->vmdq_rx_q + i,
+   pkts_burst, MAX_PKT_BURST);
+   }
}

vdev->ready = DEVICE_MAC_LEARNING;
@@ -1048,7 +1053,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
struct virtio_net_data_ll *dev_ll;
struct ether_hdr *pkt_hdr;
@@ -1063,7 +1068,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf 
*m)

while (dev_ll != NULL) {
if ((dev_ll->vdev->ready == DEVICE_RX) && 
ether_addr_cmp(&(pkt_hdr->d_addr),
- &dev_ll->vdev->mac_address)) {
+   &dev_ll->vdev->mac_address)) {

/* Drop the packet if the TX packet is destined for the 
TX device. */
if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1081,7 +1086,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf 
*m)
LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Device is 
marked for removal\n", tdev->device_fh);
} else {

[dpdk-dev] [RFC PATCH 4/6] vhost: Add new command line option: rxq

2015-05-07 Thread Ouyang Changchun

The vhost sample needs to know how many queues the user wants to enable for
each virtio device, so add the new option '--rxq' to it.
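
The check applied to the option value (a power of two no larger than 4) can be
sketched on its own as below. This is a simplified stand-in, not the sample's
code: parse_rxq is a hypothetical name, and the real patch uses the sample's
own parse_num_opt() and POWEROF2() helpers.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the sample's option check: parse arg as a
 * decimal number and accept it only if it is a power of two in [1, max].
 * Returns the parsed value, or -1 if the argument is invalid.
 */
static int
parse_rxq(const char *arg, unsigned long max)
{
	char *end = NULL;
	unsigned long n;

	if (arg == NULL || *arg == '\0')
		return -1;

	errno = 0;
	n = strtoul(arg, &end, 10);
	if (errno != 0 || *end != '\0')
		return -1;
	if (n == 0 || n > max)
		return -1;
	if ((n & (n - 1)) != 0)	/* more than one bit set: not a power of two */
		return -1;

	return (int)n;
}
```

Under these rules, "--rxq 1", "--rxq 2" and "--rxq 4" are accepted, while 0, 3
and anything above 4 are rejected.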

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 46 ++
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 87dfc67..6a38b42 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -164,6 +164,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;

+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -409,8 +412,14 @@ port_init(uint8_t port)
txconf->tx_deferred_start = 1;
}

-   /*configure the number of supported virtio devices based on VMDQ limits 
*/
-   num_devices = dev_info.max_vmdq_pools;
+   /* Configure the virtio devices num based on VMDQ limits */
+   switch (rxq) {
+   case 1:
+   case 2: num_devices = dev_info.max_vmdq_pools;
+   break;
+   case 4: num_devices = dev_info.max_vmdq_pools / 2;
+   break;
+   }

if (zero_copy) {
rx_ring_size = num_rx_descriptor;
@@ -432,7 +441,7 @@ port_init(uint8_t port)
return retval;
/* NIC queues are divided into pf queues and vmdq queues.  */
num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-   queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+   queues_per_pool = dev_info.vmdq_queue_num / num_devices;
num_vmdq_queues = num_devices * queues_per_pool;
num_queues = num_pf_queues + num_vmdq_queues;
vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -577,7 +586,8 @@ us_vhost_usage(const char *prgname)
"   --rx-desc-num [0-N]: the number of descriptors on rx, "
"used only when zero copy is enabled.\n"
"   --tx-desc-num [0-N]: the number of descriptors on tx, "
-   "used only when zero copy is enabled.\n",
+   "used only when zero copy is enabled.\n"
+   "   --rxq [1-4]: rx queue number for each vhost device\n",
   prgname);
 }

@@ -603,6 +613,7 @@ us_vhost_parse_args(int argc, char **argv)
{"zero-copy", required_argument, NULL, 0},
{"rx-desc-num", required_argument, NULL, 0},
{"tx-desc-num", required_argument, NULL, 0},
+   {"rxq", required_argument, NULL, 0},
{NULL, 0, 0, 0},
};

@@ -779,6 +790,20 @@ us_vhost_parse_args(int argc, char **argv)
}
}

+   /* Specify the Rx queue number for each vhost dev. */
+   if (!strncmp(long_option[option_index].name,
+   "rxq", MAX_LONG_OPT_SZ)) {
+   ret = parse_num_opt(optarg, 4);
+   if ((ret == -1) || (!POWEROF2(ret))) {
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "Invalid argument for rxq [1,2,4],"
+   "power of 2 required.\n");
+   us_vhost_usage(prgname);
+   return -1;
+   } else {
+   rxq = ret;
+   }
+   }
break;

/* Invalid option - print options. */
@@ -814,6 +839,19 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
}

+   if (rxq > 1) {
+   vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+   vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+   ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+   }
+
+   if ((zero_copy == 1) && (rxq > 1)) {
+   RTE_LOG(INFO, VHOST_PORT,
+   "Vhost zero copy doesn't support mq mode,"
+   "please specify '--rxq 1' to disable it.\n");
+   return -1;
+   }
+
return 0;
 }

-- 
1.8.4.2

[dpdk-dev] [RFC PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode

2015-05-07 Thread Ouyang Changchun

QEMU sends separate commands, in order, to set the memory layout for each
queue in one virtio device; accordingly, vhost needs to keep memory layout
information for each queue of the virtio device.

This also requires adjusting the interface of gpa_to_vva slightly: a queue
index is introduced to specify which queue of the device to use when looking
up the vhost virtual address for an incoming guest physical address.
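
A per-queue address lookup of this kind might look like the sketch below. It
is a simplified model and not the library code: the struct and function names
are hypothetical, only the field names guest_phys_address, memory_size and
address_offset are taken from the diff, and the translation rule (add the
region's offset to the guest physical address) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of one guest memory region. */
struct mem_region {
	uint64_t guest_phys_address;	/* first guest physical address */
	uint64_t memory_size;		/* region length in bytes */
	uint64_t address_offset;	/* added to a GPA to get the vhost VA */
};

/* Simplified, hypothetical per-queue memory layout. */
struct queue_mem {
	uint32_t nregions;
	const struct mem_region *regions;
};

/*
 * Translate guest physical address gpa using the layout stored for queue
 * q_idx. Returns 0 if the queue index or the address is not covered.
 */
static uint64_t
gpa_to_vva_sketch(const struct queue_mem *mem_arr, uint32_t nb_queues,
		  uint32_t q_idx, uint64_t gpa)
{
	const struct queue_mem *mem;
	uint32_t i;

	if (q_idx >= nb_queues)
		return 0;

	mem = &mem_arr[q_idx];
	for (i = 0; i < mem->nregions; i++) {
		const struct mem_region *r = &mem->regions[i];

		if (gpa >= r->guest_phys_address &&
		    gpa - r->guest_phys_address < r->memory_size)
			return gpa + r->address_offset;
	}
	return 0;
}
```

The point of the extra index is that the same guest physical address may
resolve through a different layout depending on which queue is asked.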

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 21 +-
 lib/librte_vhost/rte_virtio_net.h | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++
 lib/librte_vhost/vhost_rxtx.c | 21 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++-
 lib/librte_vhost/virtio-net.c | 26 +++-
 6 files changed, 106 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index c3fcb80..87dfc67 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1467,11 +1467,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
desc = &vq->desc[desc_idx];
if (desc->flags & VRING_DESC_F_NEXT) {
desc = &vq->desc[desc->next];
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
&addr_type);
} else {
-   buff_addr = gpa_to_vva(dev,
+   buff_addr = gpa_to_vva(dev, 0,
desc->addr + vq->vhost_hlen);
phys_addr = gpa_to_hpa(vdev,
desc->addr + vq->vhost_hlen,
@@ -1723,7 +1723,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf 
**pkts,
rte_pktmbuf_data_len(buff), 0);

/* Buffer address translation for virtio header. */
-   buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+   buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;

/*
@@ -1947,7 +1947,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
desc = &vq->desc[desc->next];

/* Buffer address translation. */
-   buff_addr = gpa_to_vva(dev, desc->addr);
+   buff_addr = gpa_to_vva(dev, 0, desc->addr);
/* Need check extra VLAN_HLEN size for inserting VLAN tag */
phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
&addr_type);
@@ -2605,13 +2605,14 @@ new_device (struct virtio_net *dev)
dev->priv = vdev;

if (zero_copy) {
-   vdev->nregions_hpa = dev->mem->nregions;
-   for (regionidx = 0; regionidx < dev->mem->nregions; 
regionidx++) {
+   struct virtio_memory *dev_mem = dev->mem_arr[0];
+   vdev->nregions_hpa = dev_mem->nregions;
+   for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) 
{
vdev->nregions_hpa
+= check_hpa_regions(
-   
dev->mem->regions[regionidx].guest_phys_address
-   + 
dev->mem->regions[regionidx].address_offset,
-   
dev->mem->regions[regionidx].memory_size);
+   
dev_mem->regions[regionidx].guest_phys_address
+   + 
dev_mem->regions[regionidx].address_offset,
+   
dev_mem->regions[regionidx].memory_size);

}

@@ -2627,7 +2628,7 @@ new_device (struct virtio_net *dev)


if (fill_hpa_memory_regions(
-   vdev->regions_hpa, dev->mem
+   vdev->regions_hpa, dev_mem
) != vdev->nregions_hpa) {

RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 5fb6006..c10c023 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -99,14 +99,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the 
device.
  */
 struct virtio_net {
-   struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM * 
VIRTIO_MAX_VIRTQUEUES]; /**< Contains all virtqueue information. */
+   struct virtio_memory*mem_arr[VIRTIO_MAX_VIRTQUEUES];/**< 
Array for QEMU memory and memory region information. */
uint64_tfeatures;   /**< Negotiated feature set. */
uint64_tdevice_fh;  /**< device identifier. */

[dpdk-dev] [RFC PATCH 2/6] lib_vhost: Support multiple queues in virtio dev

2015-05-07 Thread Ouyang Changchun

Each virtio device can have multiple queues, say 2 or 4, and at most 8.
Enabling this feature allows a virtio device/port in the guest to use
different vCPUs to receive/transmit packets from/to each queue.

In multiple queues mode, a virtio device is ready only when all queues of
the device are ready; cleaning up/destroying a virtio device also
requires clearing all queues that belong to it.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/rte_virtio_net.h |  15 +++-
 lib/librte_vhost/vhost_rxtx.c |  32 +++
 lib/librte_vhost/vhost_user/virtio-net-user.c |  41 -
 lib/librte_vhost/virtio-net.c | 117 +-
 4 files changed, 131 insertions(+), 74 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 2fc1c44..5fb6006 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -58,6 +58,10 @@
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1

+/**
+ * Maximum number of virtqueues per device.
+ */
+#define VIRTIO_MAX_VIRTQUEUES 8

 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
@@ -95,13 +99,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the 
device.
  */
 struct virtio_net {
-   struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];/**< Contains 
all virtqueue information. */
struct virtio_memory*mem;   /**< QEMU memory and memory 
region information. */
+   struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM * 
VIRTIO_MAX_VIRTQUEUES]; /**< Contains all virtqueue information. */
uint64_tfeatures;   /**< Negotiated feature set. */
uint64_tdevice_fh;  /**< device identifier. */
uint32_tflags;  /**< Device flags. Only used to 
check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
charifname[IF_NAME_SZ]; /**< Name of the tap 
device or socket path. */
+   uint32_tnum_virt_queues;
void*priv;  /**< private context */
 } __rte_cache_aligned;

@@ -215,4 +220,12 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, 
uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);

+/**
+ * This function get the queue number of one vhost device.
+ * @param q_number
+ *  queue number one vhost device.
+ * @return
+ *  0 if success, -1 if q_number exceed the max.
+ */
+int rte_vhost_q_num_get(uint32_t q_number);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 535c7a1..d8dd5ec 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -67,12 +67,12 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
uint8_t success = 0;

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-   if (unlikely(queue_id != VIRTIO_RXQ)) {
-   LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
-   return 0;
+   if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+   LOG_DEBUG(VHOST_DATA, "queue id: %d invalid.\n", queue_id);
+   return -1;
}

-   vq = dev->virtqueue[VIRTIO_RXQ];
+   vq = dev->virtqueue[queue_id];
count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;

/*
@@ -185,8 +185,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }

 static inline uint32_t __attribute__((always_inline))
-copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
-   uint16_t res_end_idx, struct rte_mbuf *pkt)
+copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
+   uint16_t res_base_idx, uint16_t res_end_idx,
+   struct rte_mbuf *pkt)
 {
uint32_t vec_idx = 0;
uint32_t entry_success = 0;
@@ -214,9 +215,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
 * Convert from gpa to vva
 * (guest physical addr -> vhost virtual addr)
 */
-   vq = dev->virtqueue[VIRTIO_RXQ];
vb_addr =
gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+   vq = dev->virtqueue[queue_id];
vb_hdr_addr = vb_addr;

/* Prefetch buffer address. */
@@ -404,11 +405,12 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
dev->device_fh);
-   if (unlikely(queue_id != VIRTIO_RXQ)) {
-   LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+   if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+   LOG_DEBUG(VHOST_DATA,

[dpdk-dev] [RFC PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment

2015-05-07 Thread Ouyang Changchun

In a non-SRIOV environment, VMDq RSS can be enabled through the MRQC register.
In theory, the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux
ixgbe driver.
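
The queue-per-pool arithmetic that the ethdev check in this patch performs can
be sketched in isolation as below. The function name is hypothetical; the
32/64 pool restriction, the division of max_rx_queues by the pool count and
the limit of 4 follow the diff, and plain integers stand in for the
ETH_32_POOLS/ETH_64_POOLS constants.

```c
#include <stdint.h>

#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4	/* same limit as in the patch below */

/*
 * Return the number of RX queues per pool for VMDq RSS, or -1 if the pool
 * count is not 32 or 64 or the result exceeds the limit above.
 */
static int
vmdq_rss_queues_per_pool(uint16_t max_rx_queues, uint32_t nb_queue_pools)
{
	uint32_t per_pool;

	if (nb_queue_pools != 32 && nb_queue_pools != 64)
		return -1;

	per_pool = max_rx_queues / nb_queue_pools;
	if (per_pool > VMDQ_RSS_RX_QUEUE_NUM_MAX)
		return -1;

	return (int)per_pool;
}
```

For a device that reports 128 RX queues (a figure assumed here only for
illustration), 64 pools give 2 queues per pool and 32 pools give 4.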

Signed-off-by: Changchun Ouyang 
---
 lib/librte_ether/rte_ethdev.c | 40 +++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82 +--
 2 files changed, 111 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 03fce08..be9105f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -983,6 +983,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t 
nb_rx_q)
return 0;
 }

+#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
+
+static int
+rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t 
nb_rx_q)
+{
+   if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
+   return -EINVAL;
+   return 0;
+}
+
 static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  const struct rte_eth_conf *dev_conf)
@@ -1143,6 +1153,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
return (-EINVAL);
}
}
+
+   if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+   uint32_t nb_queue_pools =
+   
dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
+   struct rte_eth_dev_info dev_info;
+
+   rte_eth_dev_info_get(port_id, &dev_info);
+   dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+   if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == 
ETH_64_POOLS)
+   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
+   dev_info.max_rx_queues/nb_queue_pools;
+   else {
+   PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
+   "nb_queue_pools=%d invalid "
+   "in VMDQ RSS\n"
+   port_id,
+   nb_queue_pools);
+   return -EINVAL;
+   }
+
+   if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
+   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
+   PMD_DEBUG_TRACE("ethdev port_id=%d"
+   " SRIOV active, invalid queue"
+   " number for VMDQ RSS, allowed"
+   " value are 1, 2 or 4\n",
+   port_id);
+   return -EINVAL;
+   }
+   }
}
return 0;
 }
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 96c4b98..5a6227f 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3172,15 +3172,15 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
 }

 /*
- * VMDq only support for 10 GbE NIC.
+ * Config pool for VMDq on 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
struct rte_eth_vmdq_rx_conf *cfg;
struct ixgbe_hw *hw;
enum rte_eth_nb_pools num_pools;
-   uint32_t mrqc, vt_ctl, vlanctrl;
+   uint32_t vt_ctl, vlanctrl;
uint32_t vmolr = 0;
int i;

@@ -3189,12 +3189,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
num_pools = cfg->nb_queue_pools;

-   ixgbe_rss_disable(dev);
-
-   /* MRQC: enable vmdq */
-   mrqc = IXGBE_MRQC_VMDQEN;
-   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
/* PFVTCTL: turn on virtualisation and set the default pool */
vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
if (cfg->enable_default_pool)
@@ -3261,6 +3255,28 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 }

 /*
+ * VMDq only support for 10 GbE NIC.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw;
+   uint32_t mrqc;
+
+   PMD_INIT_FUNC_TRACE();
+   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   ixgbe_rss_disable(dev);
+
+   /* MRQC: enable vmdq */
+   mrqc = IXGBE_MRQC_VMDQEN;
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+   IXGBE_WRITE_FLUSH(hw);
+
+   ixgbe_vmdq_pool_configure(dev);
+}
+
+/*
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3365,6 +3381,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev

[dpdk-dev] [RFC PATCH 0/6] Support multiple queues in vhost

2015-05-07 Thread Ouyang Changchun

This RFC patch set adds support for multiple queues per virtio device in vhost.
vhost-user is used to enable the multiple queues feature; it is not ready
for vhost-cuse.

One prerequisite for this feature is a QEMU patch plus a fix that must be
applied to QEMU. Please refer to this link for the details of the patch and
the fix:
http://lists.nongnu.org/archive/html/qemu-devel/2015-04/msg00917.html

Basically, the vhost sample leverages VMDq+RSS in HW to receive packets and
distribute them into different queues in the pool according to their 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer by setting the
queue number to a value larger than 1.

The number of HW queues in each pool must be exactly the same as the number
of queues in each virtio device. E.g. with rxq = 4 there are 4 HW queues in
each VMDq pool and 4 queues in each virtio device/port, and every queue in
the pool maps to one queue in the virtio device.

+-------------------+   +-------------------+
|      vport0       |   |      vport1       |
| q0 | q1 | q2 | q3 |   | q0 | q1 | q2 | q3 |
+-/\---/\---/\---/\-+   +-/\---/\---/\---/\-+
  ||   ||   ||   ||       ||   ||   ||   ||
  ||   ||   ||   ||       ||   ||   ||   ||
+-------------------+   +-------------------+
| q0 | q1 | q2 | q3 |   | q0 | q1 | q2 | q3 |
|    VMDq pool0     |   |    VMDq pool1     |
+-------------------+   +-------------------+

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue of the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either the physical port or another virtio device according
to the destination MAC address.

It includes a workaround in virtio, as the control queue does not work for
vhost-user multiple queues. Further investigation is needed to find the root
cause.

Changchun Ouyang (6):
  ixgbe: Support VMDq RSS in non-SRIOV environment
  lib_vhost: Support multiple queues in virtio dev
  lib_vhost: Set memory layout for multiple queues mode
  vhost: Add new command line option: rxq
  vhost: Support multiple queues
  virtio: Resolve for control queue

 examples/vhost/main.c | 199 +-
 lib/librte_ether/rte_ethdev.c |  40 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |  82 +--
 lib/librte_pmd_virtio/virtio_ethdev.c |   6 +
 lib/librte_vhost/rte_virtio_net.h |  25 +++-
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 
 lib/librte_vhost/vhost_rxtx.c |  53 +++
 lib/librte_vhost/vhost_user/virtio-net-user.c | 100 +++--
 lib/librte_vhost/virtio-net.c | 143 +++---
 9 files changed, 475 insertions(+), 230 deletions(-)

-- 
1.8.4.2

[dpdk-dev] [PATCH v7 03/10] eal/linux: add API to set rx interrupt event monitor

2015-05-07 Thread Stephen Hemminger

On Tue,  5 May 2015 13:39:39 +0800
Cunming Liang  wrote:

> + bytes_read = read(fd, , bytes_read);
> + if (bytes_read < 0)
> + RTE_LOG(ERR, EAL, "Error reading from file "
> + "descriptor %d: %s\n", fd,
> + strerror(errno)

The read could be interrupted (EINTR) or there could be a race (EWOULDBLOCK).
In those cases the code should not log anything.

[dpdk-dev] [PATCH v7 02/10] eal/linux: add rte_epoll_wait/ctl support

2015-05-07 Thread Stephen Hemminger

On Tue,  5 May 2015 13:39:38 +0800
Cunming Liang  wrote:

> + else if (rc < 0) {
> + /* epoll_wait fail */
> + RTE_LOG(ERR, EAL, "epoll_wait returns with fail %s\n",
> + strerror(errno));

In a real application there may be other random signals.
Therefore the code should ignore EWOULDBLOCK and EINTR and just return.
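
One possible shape for this, as a Linux-only sketch: the wrapper name
epoll_wait_quietly is hypothetical, fprintf() stands in for RTE_LOG(), and
treating the two errno values as "no events" is an assumption about what
"ignore and return" should mean here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>

/*
 * Hypothetical wrapper: returns the number of ready events, 0 if the wait
 * was interrupted by a signal or would block, and -1 on any other error.
 */
static int
epoll_wait_quietly(int epfd, struct epoll_event *events, int maxevents,
		   int timeout_ms)
{
	int rc = epoll_wait(epfd, events, maxevents, timeout_ms);

	if (rc >= 0)
		return rc;
	if (errno == EINTR || errno == EWOULDBLOCK)
		return 0;	/* not a failure: report no events, no log */

	fprintf(stderr, "epoll_wait failed: %s\n", strerror(errno));
	return -1;
}
```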

[dpdk-dev] [PATCH] bond: initialize backpointer from pci device to driver

2015-05-07 Thread Stephen Hemminger

Add missing initialization of the pci_dev driver pointer.
The link from pci_dev back to the ethernet driver was not being set.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_pmd_bond/rte_eth_bond_api.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c 
b/lib/librte_pmd_bond/rte_eth_bond_api.c
index e91a623..904b59f 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_api.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_api.c
@@ -260,6 +260,7 @@ rte_eth_bond_create(const char *name, uint8_t mode, uint8_t 
socket_id)

pci_dev->numa_node = socket_id;
pci_drv->name = driver_name;
+   pci_dev->driver = pci_drv;

eth_dev->driver = eth_drv;
eth_dev->data->dev_private = internals;
-- 
2.1.4

[dpdk-dev] [PATCH v2 0/5]: Cleanups in the ixgbe PMD

2015-05-07 Thread Thomas Monjalon

> > This series includes:
> >    - Fix the "issue" introduced in 01fa1d6215fa7cd6b5303ac9296381b75b9226de:
> >      files in librte_pmd_ixgbe/ixgbe/ are shared with FreeBSD and AFAIU
> >      should not be changed unless the change is pushed into the FreeBSD
> >      tree first.
> >    - Remove unused rsc_en field in ixgbe_rx_queue struct.
> >      Thanks to Shiweixian for pointing this out.
> >    - Kill the non-vector scattered Rx callback and use an appropriate LRO
> >      callback instead. This is possible because work against HW in both
> >      LRO and scattered RX cases is the same. Note that this patch touches
> >      the ixgbevf PMD as well.
> >    - Use LRO bulk callback when scattered (non-LRO) Rx is requested and
> >      parameters allow bulk allocation.
> >
> > Note that this series is meant to cleanup the PF PMD and is a follow up
> > series for my previous patches. Although VF PMD is slightly modified here
> > too this series doesn't mean to fix/add new functionality to it. VF PMD
> > should be patched in the similar way I've patched PF PMD in my previous
> > series in order to fix the same issues that were fixed in the PF PMD and
> > in order to enable LRO and scattered Rx with bulk allocations.
> > 
> > New in v2:
> >- Rename RSC-specific structures to "Scattered Rx" derivatives.
> >- Always allocate Scattered Rx ring.
> > 
> > Vlad Zolotarov (5):
> >   ixgbe: move rx_bulk_alloc_allowed and rx_vec_allowed to ixgbe_adapter
> >   ixgbe: ixgbe_rx_queue: remove unused rsc_en field
> >   ixgbe: Rename yy_rsc_xx -> yy_sc/scattered_rx_xx
> >   ixgbe: Kill ixgbe_recv_scattered_pkts()
> >   ixgbe: Add support for scattered Rx with bulk allocation.
> 
> Acked-by: Konstantin Ananyev 
> Thanks a lot for doing it.

Applied, thanks

[dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM

2015-05-07 Thread Oleg Strikov

Hi DPDK users and developers,

A few weeks ago I came up with the idea of running openvswitch with the dpdk
backend inside a qemu-kvm virtual machine. I don't have enough supported NICs
yet, and my plan was to start experimenting inside the virtualized
environment, get all the components into a functional state, and then switch
to the real hardware. An additional useful side effect of doing things inside
the vm is that issues can be easily reproduced by someone else in a different
environment.

I (fondly) hoped that running openvswitch/dpdk inside the vm would be
simpler than running the same set of components on the real hardware.
Unfortunately I ran into a bunch of issues on the way. All these issues lie
on the borderline between dpdk and openvswitch, but I think that you might be
interested in my story. Please note that I still don't have
openvswitch/dpdk working inside the vm. I have definitely made some progress,
though.

Q: Does it sound okay from a functional (not performance) standpoint to run
openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
anyone from the dpdk development team do this?

## Issue 1 ##

Openvswitch requires the backend pmd driver to provide N_CORES tx queues,
where N_CORES is the number of cores available on the machine (openvswitch
counts the number of cpu* entries inside the /sys/devices/system/node/node0/
folder). To my understanding, it doesn't take into account the actual number
of cores used by dpdk and just allocates a tx queue for each available core.
You may refer to this chunk of code for details:
https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067

This approach works fine on the real hardware but causes problems when we
run openvswitch/dpdk inside the virtual machine. I tried both the emulated
e1000 NIC and the virtio NIC, and neither of them worked out of the box.
The emulated e1000 NIC doesn't support multiple tx queues at all (see
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
the virtio NIC doesn't support multiple tx queues by default. To enable
multiple tx queues for the virtio NIC I had to add the following line to the
interface section of my libvirt config: ''

## Issue 2 ##

Openvswitch calls rte_eth_tx_queue_setup() twice for the same
port_id/queue_id. The first call takes place during device initialization
(see the call to dpdk_eth_dev_init() inside netdev_dpdk_init():
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
The second call takes place when openvswitch tries to add more tx queues to
the device (see the call to dpdk_eth_dev_init() inside
netdev_dpdk_set_multiq():
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
The second call not only initializes the new queues but also tries to
re-initialize the existing ones.

Unfortunately the virtio driver can't handle a second call to
rte_eth_tx_queue_setup() and returns an error here:
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
This happens because a memzone with the name portN_tvqN already exists when
the second call takes place (the memzone was created during the first
call). To deal with this issue I had to manually add an
rte_memzone_lookup-based check for this situation and avoid allocating a
new memzone if it already exists.
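
The lookup-before-reserve idea can be sketched with a toy zone registry.
This is only an illustration under stated assumptions, not the virtio pmd
patch: struct zone, zone_lookup(), zone_reserve() and queue_zone_get() are
hypothetical stand-ins that mimic the behaviour described above (reserving
a name that already exists fails), and they do not use the DPDK API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ZONES 16
#define ZONE_NAME_LEN 32

/* Toy stand-in for a named memory zone; NOT the DPDK rte_memzone type. */
struct zone {
    char name[ZONE_NAME_LEN];
    void *addr;
    size_t len;
};

static struct zone zones[MAX_ZONES];
static int zone_count;

/* Toy analogue of a lookup by name: returns the zone or NULL. */
static struct zone *zone_lookup(const char *name)
{
    for (int i = 0; i < zone_count; i++) {
        if (strcmp(zones[i].name, name) == 0)
            return &zones[i];
    }
    return NULL;
}

/* Toy analogue of a reserve call: fails (NULL) if the name is taken. */
static struct zone *zone_reserve(const char *name, size_t len)
{
    struct zone *z;

    if (zone_lookup(name) != NULL || zone_count == MAX_ZONES)
        return NULL;
    if (strlen(name) >= ZONE_NAME_LEN)
        return NULL;
    z = &zones[zone_count];
    z->addr = calloc(1, len);
    if (z->addr == NULL)
        return NULL;
    strcpy(z->name, name);
    z->len = len;
    zone_count++;
    return z;
}

/* Queue setup helper that tolerates being called twice for the same
 * queue: reuse the existing zone instead of failing on the name clash. */
static struct zone *queue_zone_get(const char *name, size_t len)
{
    struct zone *z = zone_lookup(name);

    if (z != NULL)
        return z;
    return zone_reserve(name, len);
}
```

With this helper, a second setup call for port0_tvq0 gets back the zone
created by the first call instead of an error. A real fix would probably
also need to check that the existing zone is large enough for the requested
queue size.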

Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
now I can't tell whether it's an issue with the virtio pmd driver or
incorrect API usage by openvswitch. Could someone shed some light on this
so I can move forward and maybe propose a fix?

## Issue 3 ##

This issue is also (somehow) related to the fact that openvswitch calls
rte_eth_tx_queue_setup() twice. With the previous issue worked around by
the method described above, initialization finishes. The whole machinery
starts to work but crashes at the very beginning (possibly while fetching
the first packet from the NIC). The crash happens here:
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
It takes place because the vq_ring structure contains zeros instead of
correct values:
vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
My understanding is that vq_ring gets initialized by the first call to
rte_eth_tx_queue_setup() and is then overwritten by the second call, which
does not initialize it properly the second time. I'm trying to fix this
issue right now.
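
The suspected failure mode can be illustrated with a toy model. This is a
sketch of the hypothesis stated above, not the virtio pmd code: struct
toy_ring, struct toy_queue, toy_setup_buggy() and toy_setup_fixed() are
hypothetical names, and the assumption is that a repeated setup call wipes
the queue state but only fills in the ring fields when the backing memory
is reserved for the first time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_RING_SIZE 8

/* Toy ring descriptor; only loosely modelled on the vq_ring dump above. */
struct toy_ring {
    unsigned int num;
    void *desc;
};

struct toy_queue {
    struct toy_ring ring;
};

static int ring_mem[TOY_RING_SIZE]; /* backing memory, reserved once */
static bool mem_reserved;

/* Buggy variant: every call clears the queue, but the ring fields are
 * only filled in on the call that reserves the memory. A second call
 * therefore leaves ring = {0, NULL}. */
static void toy_setup_buggy(struct toy_queue *q)
{
    memset(q, 0, sizeof(*q));
    if (!mem_reserved) {
        mem_reserved = true;
        q->ring.num = TOY_RING_SIZE;
        q->ring.desc = ring_mem;
    }
}

/* Fixed variant: the ring fields are always derived from the backing
 * memory, whether it was just reserved or is being reused. */
static void toy_setup_fixed(struct toy_queue *q)
{
    memset(q, 0, sizeof(*q));
    mem_reserved = true;
    q->ring.num = TOY_RING_SIZE;
    q->ring.desc = ring_mem;
}
```

If the real driver behaves like the buggy variant, reusing an existing
memzone is not enough on its own: the setup path also has to repopulate
the ring fields on every call.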

Q: Does it sound like a realistic goal to make the virtio driver work in
openvswitch-like scenarios? I'm definitely not an expert in the area of
dpdk and can't estimate the time and resources required. Maybe it's better
to wait until I get proper hardware?

Thanks for helping,
Oleg

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Avi Kivity

On 05/07/2015 06:49 PM, Wiles, Keith wrote:
>
> On 5/7/15, 8:33 AM, "Avi Kivity"  wrote:
>
>> On 05/07/2015 06:27 PM, Wiles, Keith wrote:
>>> On 5/7/15, 7:02 AM, "Avi Kivity"  wrote:
>>>
>>>> On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim
>>>> wrote:
>>>>
>>>>> Does anybody have any input or comments on this?
>>>>>
>>>>>
>>>>>> -Original Message-
>>>>>> From: O'Driscoll, Tim
>>>>>> Sent: Thursday, April 16, 2015 11:39 AM
>>>>>> To: dev at dpdk.org
>>>>>> Subject: Beyond DPDK 2.0
>>>>>>
>>>>>> Following the launch of DPDK by Intel as an internal development
>>>>>> project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK RPM
>>>>>> packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
>>>>>> prepare for future releases after DPDK 2.0 by starting a discussion on
>>>>>> its evolution. Anyone is welcome to join this initiative.
>>>>>>
>>>>>> Since then, the project has grown significantly:
>>>>>> -The number of commits and mailing list posts has increased
>>>>>> steadily.
>>>>>> -Support has been added for a wide range of new NICs (Mellanox
>>>>>> support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
>>>>>> -DPDK is now supported on multiple architectures (IBM Power support
>>>>>> in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
>>>>>> applied).
>>>>>>
>>>>>> While this is great progress, we need to make sure that the project is
>>>>>> structured in a way that enables it to continue to grow. To achieve
>>>>>> this, 6WIND, Red Hat and Intel would like to start a discussion about
>>>>>> the future of the project, so that we can agree and establish processes
>>>>>> that satisfy the needs of the current and future DPDK community.
>>>>>>
>>>>>> We're very interested in hearing the views of everybody in the
>>>>>> community. In addition to debate on the mailing list, we'll also
>>>>>> schedule community calls to discuss this.
>>>>>>
>>>>>>
>>>>>> Project Goals
>>>>>> -
>>>>>>
>>>>>> Some topics to be considered for the DPDK project include:
>>>>>> -Project Charter: The charter of the DPDK project should be clearly
>>>>>> defined, and should explain the limits of DPDK (what it does and does
>>>>>> not cover). This does not mean that we would be stuck with a singular
>>>>>> charter for all time, but the direction and intent of the project should
>>>>>> be well understood.
>>>> One problem we've seen with dpdk is that it is a framework, not a library:
>>>> it wants to create threads, manage memory, and generally take over.  This
>>>> is a problem for us, as we are writing a framework (seastar, [1]) and need
>>>> to create threads, manage memory, and generally take over ourselves.
>>>>
>>>> Perhaps dpdk can be split into two layers, a library layer that only
>>>> provides mechanisms, and a framework layer that glues together those
>>>> mechanisms and applies a policy, trading in generality for ease of use.
>>> The DPDK system is somewhat divided now between the EAL, PMDS and
>>> utility
>>> functions like malloc/rings/?
>>>
>>> The problem I see is the PMDs need a framework to be usable and the EAL
>>> plus the ethdev layers provide that support today. Setting up and
>>> initializing the DPDK system is pretty clean just call the EAL init
>>> routines along with the pool creates and the basic configs for the
>>> PMDs/hardware. Once the system is inited one can create new threads and
>>> not requiring anyone to use DPDK launch routines. Maybe I am not
>>> understanding your needs can you explain more?
>> An initialization routine that accepts argc/argv can hardly be called
>> clean.
> You want a config file or structure initialization design? If that is the
> case you can contribute that support as another way to initialize DPDK.

A config file would be even worse.  But we are discussing why 
dpdk-as-a-framework is detrimental, not new ways for me to contribute.

>> In seastar, we have our own malloc() (since seastar is sharded we can
>> provide a faster thread-unsafe malloc implementation).  We also have our
>> own threading, and since dpdk is an optional component in seastar, dpdk
>> support requires code duplication.
> DPDK replies one the huge page support for allocation to get the
> performance, do you also not require huge page support.

Sorry, is this a question?  Please rephrase.

>   The malloc system
> in DPDK can be used as a replacement for the standard malloc if that works
> for your needs. Also after DPDK inits you can use your own malloc and any
> other tools you want to use.

How is memory partitioned between dpdk and my application?  If I 
underallocate dpdk memory, something bad will happen.  If I overallocate 
dpdk memory, then I am depriving my application of this memory.  A 
common pool means I do not overallocate or underallocate, but since dpdk 
insists on managing its own pools, I can't do this.

>   I do not

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Avi Kivity

On 05/07/2015 06:27 PM, Wiles, Keith wrote:
>
> On 5/7/15, 7:02 AM, "Avi Kivity"  wrote:
>
>> On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim
>> 
>> wrote:
>>
>>> Does anybody have any input or comments on this?
>>>
>>>
>>>> -Original Message-
>>>> From: O'Driscoll, Tim
>>>> Sent: Thursday, April 16, 2015 11:39 AM
>>>> To: dev at dpdk.org
>>>> Subject: Beyond DPDK 2.0
>>>>
>>>> Following the launch of DPDK by Intel as an internal development
>>>> project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK RPM
>>>> packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
>>>> prepare for future releases after DPDK 2.0 by starting a discussion on
>>>> its evolution. Anyone is welcome to join this initiative.
>>>>
>>>> Since then, the project has grown significantly:
>>>> -The number of commits and mailing list posts has increased
>>>> steadily.
>>>> -Support has been added for a wide range of new NICs (Mellanox
>>>> support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
>>>> -DPDK is now supported on multiple architectures (IBM Power support
>>>> in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
>>>> applied).
>>>>
>>>> While this is great progress, we need to make sure that the project is
>>>> structured in a way that enables it to continue to grow. To achieve
>>>> this, 6WIND, Red Hat and Intel would like to start a discussion about
>>>> the future of the project, so that we can agree and establish processes
>>>> that satisfy the needs of the current and future DPDK community.
>>>>
>>>> We're very interested in hearing the views of everybody in the
>>>> community. In addition to debate on the mailing list, we'll also
>>>> schedule community calls to discuss this.
>>>>
>>>>
>>>> Project Goals
>>>> -
>>>>
>>>> Some topics to be considered for the DPDK project include:
>>>> -Project Charter: The charter of the DPDK project should be clearly
>>>> defined, and should explain the limits of DPDK (what it does and does
>>>> not cover). This does not mean that we would be stuck with a singular
>>>> charter for all time, but the direction and intent of the project should
>>>> be well understood.
>>
>> One problem we've seen with dpdk is that it is a framework, not a library:
>> it wants to create threads, manage memory, and generally take over.  This
>> is a problem for us, as we are writing a framework (seastar, [1]) and need
>> to create threads, manage memory, and generally take over ourselves.
>>
>> Perhaps dpdk can be split into two layers, a library layer that only
>> provides mechanisms, and a framework layer that glues together those
>> mechanisms and applies a policy, trading in generality for ease of use.
> The DPDK system is somewhat divided now between the EAL, PMDS and utility
> functions like malloc/rings/?
>
> The problem I see is the PMDs need a framework to be usable and the EAL
> plus the ethdev layers provide that support today. Setting up and
> initializing the DPDK system is pretty clean just call the EAL init
> routines along with the pool creates and the basic configs for the
> PMDs/hardware. Once the system is inited one can create new threads and
> not requiring anyone to use DPDK launch routines. Maybe I am not
> understanding your needs can you explain more?

An initialization routine that accepts argc/argv can hardly be called clean.

In seastar, we have our own malloc() (since seastar is sharded we can 
provide a faster thread-unsafe malloc implementation).  We also have our 
own threading, and since dpdk is an optional component in seastar, dpdk 
support requires code duplication.

I would like to launch my own threads, pin them where I like, and call 
PMD drivers to send and receive packets.  Practically everything else 
that dpdk does gets in my way, including mbuf pools.  I'd much prefer to 
allocate mbufs myself.


>> [1] http://seastar-project.org

[dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM

2015-05-07 Thread Pravin Shelar

On Thu, May 7, 2015 at 9:22 AM, Oleg Strikov  
wrote:
> Hi DPDK users and developers,
>
> Few weeks ago I came up with the idea to run openvswitch with dpdk backend
> inside qemu-kvm virtual machine. I don't have enough supported NICs yet and
> my plan was to start experimenting inside the virtualized environment,
> achieve functional state of all the components and then switch to the real
> hardware. Additional useful side-effect of doing things inside the vm is
> that issues can be easily reproduced by someone else in a different
> environment.
>
> I (fondly) hoped that running openvswitch/dpdk inside the vm would be
> simpler than running the same set of components on the real hardware.
> Unfortunately I met a bunch of issues on the way. All these issues lie on a
> borderline between dpdk and openvswitch but I think that you might be
> interested in my story. Please note that I still don't have
> openvswitch/dpdk working inside the vm. I definitely have some progress
> though.
>
Thanks for summarizing all the issues.
DPDK testing is done on real hardware, and we are planning to test it
in a VM. This will certainly help in fixing issues sooner.

> Q: Does it sound okay from functional (not performance) standpoint to run
> openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
> anyone from the dpdk development team do this?
>
> ## Issue 1 ##
>
> Openvswitch requires backend pmd driver to provide N_CORES tx queues where
> N_CORES is the amount of cores available on the machine (openvswitch counts
> the amount of cpu* entries inside /sys/devices/system/node/node0/ folder).
> To my understanding it doesn't take into account the actual amount of cores
> used by dpdk and just allocates tx queue for each available core. You may
> refer to this chunk of code for details:
> https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067
>
In the case of OVS DPDK, there is no dpdk thread, so all polling
cores are managed by OVS and there is no need to account for cores used
by DPDK. You can assign specific cores to OVS to limit the number of
cores it uses.

> This approach works fine on the real hardware but makes some issues when we
> run openvswitch/dpdk inside the virtual machine. I tried both emulated
> e1000 NIC and virtio NIC and neither of them worked just from the box.
> Emulated e1000 NIC doesn't support multiple tx queues at all (see
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
> virtio NIC doesn't support multiple tx queues by default. To enable
> multiple tx queue for virtio NIC I had to add the following line to the
> interface section of my libvirt config: ''
>
Good point. We should document this. Can you send a patch to update README.DPDK?

> ## Issue 2 ##
>
> Openvswitch calls rte_eth_tx_queue_setup() twice for the same
> port_id/queue_id. First call takes place during device initialization (see
> call to dpdk_eth_dev_init() inside netdev_dpdk_init():
> https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
> Second call takes place when openvswitch tries to add more tx queues to the
> device (see call to dpdk_eth_dev_init() inside netdev_dpdk_set_multiq():
> https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
> Second call not only initialized new queues but tries to re-initialize
> existing ones.
>
> Unfortunately virtio driver can't handle second call of
> rte_eth_tx_queue_setup() and returns error here:
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
> This happens because memzone with the name portN_tvqN already exists when
> second call takes place (memzone has been created during the first call).
> To deal with this issue I had to manually add rte_memzone_lookup-based
> check for this situation and avoid allocation of a new memzone if it
> already exists.
>
This sounds like an issue with the virtio driver. I think we need to fix
DPDK upstream for this to work correctly.

> Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
> now I can't understand if it's the issue with the virtio pmd driver or
> incorrect API usage by openvswitch? Could someone shed some light on this
> so I can move forward and maybe propose a fix.
>
> ## Issue 3 ##
>
> This issue is also (somehow) related to the fact that openvswitch calls
> rte_eth_tx_queue_setup() twice. I fix the previous issue by the method
> described above and initialization finishes. The whole machinery starts to
> work but crashes at the very beginning (while fetching the first packet
> from the NIC maybe). This crash happens here:
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
> It takes place because vq_ring structure contains zeros instead of correct
> values:
> vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
> My understanding is that vq_ring gets initialized after the first call to
> rte_eth_tx_queue_setup(), then overwritten by the second call to
>

[dpdk-dev] [RFC PATCH 0/2] Move PMDs out of lib directory

2015-05-07 Thread Marc Sune



On 07/05/15 17:35, Bruce Richardson wrote:
> The "lib" directory is getting very crowded, with both general libs and
> poll mode drivers in it. This patch set proposes to move the PMDs out of the
> lib folder and to put them in a separate "pmds" folder. This should help
> with code browse-ability as the number of libs, and pmds increases.
>
> Comments or objections?
>
> Bruce Richardson (2):
>pmds: Use relative rather than absolute paths
>pmds: move pmds from lib to separate pmd dir
>
>   GNUmakefile|2 +-
>   lib/Makefile   |   14 -
>   lib/librte_eal/linuxapp/eal/Makefile   |8 +-
>   lib/librte_pmd_af_packet/Makefile  |   64 -
>   lib/librte_pmd_af_packet/rte_eth_af_packet.c   |  847 ---
>   lib/librte_pmd_af_packet/rte_eth_af_packet.h   |   53 -
>   .../rte_pmd_af_packet_version.map  |7 -
>   lib/librte_pmd_bond/Makefile   |   68 -
>   lib/librte_pmd_bond/rte_eth_bond.h |  366 --
>   lib/librte_pmd_bond/rte_eth_bond_8023ad.c  | 1216 -
>   lib/librte_pmd_bond/rte_eth_bond_8023ad.h  |  222 -
>   lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h  |  308 --
>   lib/librte_pmd_bond/rte_eth_bond_alb.c |  287 -
>   lib/librte_pmd_bond/rte_eth_bond_alb.h |  142 -
>   lib/librte_pmd_bond/rte_eth_bond_api.c |  840 ---
>   lib/librte_pmd_bond/rte_eth_bond_args.c|  278 -
>   lib/librte_pmd_bond/rte_eth_bond_pmd.c | 2269 
>   lib/librte_pmd_bond/rte_eth_bond_private.h |  287 -
>   lib/librte_pmd_bond/rte_eth_bond_version.map   |   22 -
>   lib/librte_pmd_e1000/Makefile  |   99 -
>   lib/librte_pmd_e1000/e1000/README  |   39 -
>   lib/librte_pmd_e1000/e1000/e1000_80003es2lan.c | 1514 --
>   lib/librte_pmd_e1000/e1000/e1000_80003es2lan.h |  100 -
>   lib/librte_pmd_e1000/e1000/e1000_82540.c   |  717 ---
>   lib/librte_pmd_e1000/e1000/e1000_82541.c   | 1268 -
>   lib/librte_pmd_e1000/e1000/e1000_82541.h   |   91 -
>   lib/librte_pmd_e1000/e1000/e1000_82542.c   |  588 --
>   lib/librte_pmd_e1000/e1000/e1000_82543.c   | 1553 --
>   lib/librte_pmd_e1000/e1000/e1000_82543.h   |   56 -
>   lib/librte_pmd_e1000/e1000/e1000_82571.c   | 2026 ---
>   lib/librte_pmd_e1000/e1000/e1000_82571.h   |   65 -
>   lib/librte_pmd_e1000/e1000/e1000_82575.c   | 3639 -
>   lib/librte_pmd_e1000/e1000/e1000_82575.h   |  520 --
>   lib/librte_pmd_e1000/e1000/e1000_api.c | 1357 -
>   lib/librte_pmd_e1000/e1000/e1000_api.h |  167 -
>   lib/librte_pmd_e1000/e1000/e1000_defines.h | 1498 -
>   lib/librte_pmd_e1000/e1000/e1000_hw.h  | 1026 
>   lib/librte_pmd_e1000/e1000/e1000_i210.c| 1000 
>   lib/librte_pmd_e1000/e1000/e1000_i210.h|  110 -
>   lib/librte_pmd_e1000/e1000/e1000_ich8lan.c | 5260 --
>   lib/librte_pmd_e1000/e1000/e1000_ich8lan.h |  313 --
>   lib/librte_pmd_e1000/e1000/e1000_mac.c | 2247 
>   lib/librte_pmd_e1000/e1000/e1000_mac.h |   95 -
>   lib/librte_pmd_e1000/e1000/e1000_manage.c  |  573 --
>   lib/librte_pmd_e1000/e1000/e1000_manage.h  |   95 -
>   lib/librte_pmd_e1000/e1000/e1000_mbx.c |  777 ---
>   lib/librte_pmd_e1000/e1000/e1000_mbx.h |  105 -
>   lib/librte_pmd_e1000/e1000/e1000_nvm.c | 1377 -
>   lib/librte_pmd_e1000/e1000/e1000_nvm.h |   98 -
>   lib/librte_pmd_e1000/e1000/e1000_osdep.c   |   83 -
>   lib/librte_pmd_e1000/e1000/e1000_osdep.h   |  183 -
>   lib/librte_pmd_e1000/e1000/e1000_phy.c | 4273 ---
>   lib/librte_pmd_e1000/e1000/e1000_phy.h |  327 --
>   lib/librte_pmd_e1000/e1000/e1000_regs.h|  685 ---
>   lib/librte_pmd_e1000/e1000/e1000_vf.c  |  586 --
>   lib/librte_pmd_e1000/e1000/e1000_vf.h  |  295 -
>   lib/librte_pmd_e1000/e1000_ethdev.h|  340 --
>   lib/librte_pmd_e1000/e1000_logs.h  |   78 -
>   lib/librte_pmd_e1000/em_ethdev.c   | 1530 --
>   lib/librte_pmd_e1000/em_rxtx.c | 1865 ---
>   lib/librte_pmd_e1000/igb_ethdev.c  | 3656 -
>   lib/librte_pmd_e1000/igb_pf.c  |  511 --
>   lib/librte_pmd_e1000/igb_rxtx.c| 2397 
>   lib/librte_pmd_e1000/rte_pmd_e1000_version.map |4 -
>   lib/librte_pmd_enic/LICENSE|   27 -
>   lib/librte_pmd_enic/Makefile   |   71 -
>   lib/librte_pmd_enic/enic.h |  200 -
>   lib/librte_pmd_enic/enic_clsf.c|

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Luke Gorrie

On 7 May 2015 at 16:02, Avi Kivity  wrote:

> One problem we've seen with dpdk is that it is a framework, not a library:
> it wants to create threads, manage memory, and generally take over.  This
> is a problem for us, as we are writing a framework (seastar, [1]) and need
> to create threads, manage memory, and generally take over ourselves.
>

That is also broadly why we don't currently use DPDK in Snabb Switch [1].

There is a bunch of functionality in DPDK that would be tempting for us to
use and contribute back to: device drivers, SIMD routines, data structures,
and so on. I think that we would do this if they were available piecemeal
as stand-alone libi40e, libsimd, liblpn, etc.

The whole DPDK platform/framework is too much for us to adopt though. Some
aspects of it are in conflict with our goals and it is an all-or-nothing
proposition. So for now we are staying self-sufficient even when it means
writing our own ixgbe replacement, etc.

Having said that we are able to share code that doesn't require linking
into our address space e.g. vhost-user and potentially the bifurcated
drivers in the future. That seems like a nice direction for things to be
going in and a way to collaborate even without our directly linking with
DPDK.

[1] https://github.com/lukego/snabbswitch/blob/README/README.md

[dpdk-dev] [RFC PATCH 0/2] Move PMDs out of lib directory

2015-05-07 Thread Bruce Richardson

On Thu, May 07, 2015 at 05:45:20PM +0200, Marc Sune wrote:
> 
> 
> On 07/05/15 17:35, Bruce Richardson wrote:
> >The "lib" directory is getting very crowded, with both general libs and
> >poll mode drivers in it. This patch set proposes to move the PMDs out of the
> >lib folder and to put them in a separate "pmds" folder. This should help
> >with code browse-ability as the number of libs, and pmds increases.
> >
> >Comments or objections?
> >
> >Bruce Richardson (2):
> >   pmds: Use relative rather than absolute paths
> >   pmds: move pmds from lib to separate pmd dir
> >
> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_mempool_gntalloc.c
> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_xen_lib.c
> >  create mode 100644 pmds/librte_pmd_xenvirt/rte_xen_lib.h
> >  create mode 100644 pmds/librte_pmd_xenvirt/virtio_logs.h
> >  create mode 100644 pmds/librte_pmd_xenvirt/virtqueue.h
> >
> 
> But at the end they are also libraries. What about something like:
> 
> * libs/core <= fundamental libraries (eal, mbuf rings...)
> * libs/pmds <= all pmds
> 
> And other feature-group oriented, higher level lib, directories (not sure
> right now how to better classify them right now):
> * libs/processing <= packet processing
> * libs/utils
> ...
> 
Yes, they are all just libs, so we could make "pmds" be a sub-dir of the lib
folder. I prefer the shorter path myself, but if others want a multi-level
hierarchy it's no big deal.

For the other libs, I'm not sure we need to split them up, and I also think
that trying to divide them into categories - and what those categories should
be - could cause endless discussion. However, maybe I'm overly pessimistic...
:-)

/Bruce

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Avi Kivity

On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim 
wrote:

> Does anybody have any input or comments on this?
>
>
> > -Original Message-
> > From: O'Driscoll, Tim
> > Sent: Thursday, April 16, 2015 11:39 AM
> > To: dev at dpdk.org
> > Subject: Beyond DPDK 2.0
> >
> > Following the launch of DPDK by Intel as an internal development
> > project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK RPM
> > packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
> > prepare for future releases after DPDK 2.0 by starting a discussion on
> > its evolution. Anyone is welcome to join this initiative.
> >
> > Since then, the project has grown significantly:
> > -The number of commits and mailing list posts has increased
> > steadily.
> > -Support has been added for a wide range of new NICs (Mellanox
> > support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
> > -DPDK is now supported on multiple architectures (IBM Power support
> > in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
> > applied).
> >
> > While this is great progress, we need to make sure that the project is
> > structured in a way that enables it to continue to grow. To achieve
> > this, 6WIND, Red Hat and Intel would like to start a discussion about
> > the future of the project, so that we can agree and establish processes
> > that satisfy the needs of the current and future DPDK community.
> >
> > We're very interested in hearing the views of everybody in the
> > community. In addition to debate on the mailing list, we'll also
> > schedule community calls to discuss this.
> >
> >
> > Project Goals
> > -
> >
> > Some topics to be considered for the DPDK project include:
> > -Project Charter: The charter of the DPDK project should be clearly
> > defined, and should explain the limits of DPDK (what it does and does
> > not cover). This does not mean that we would be stuck with a singular
> > charter for all time, but the direction and intent of the project should
> > be well understood.
>


One problem we've seen with dpdk is that it is a framework, not a library:
it wants to create threads, manage memory, and generally take over.  This
is a problem for us, as we are writing a framework (seastar, [1]) and need
to create threads, manage memory, and generally take over ourselves.

Perhaps dpdk can be split into two layers, a library layer that only
provides mechanisms, and a framework layer that glues together those
mechanisms and applies a policy, trading in generality for ease of use.

[1] http://seastar-project.org

[dpdk-dev] [RFC PATCH 0/2] Move PMDs out of lib directory

2015-05-07 Thread Bruce Richardson

On Thu, May 07, 2015 at 04:35:49PM +0100, Bruce Richardson wrote:
> The "lib" directory is getting very crowded, with both general libs and 
> poll mode drivers in it. This patch set proposes to move the PMDs out of the
> lib folder and to put them in a separate "pmds" folder. This should help
> with code browse-ability as the number of libs, and pmds increases.
> 
> Comments or objections?
> 
> Bruce Richardson (2):
>   pmds: Use relative rather than absolute paths
>   pmds: move pmds from lib to separate pmd dir

Apologies, but the second patch of the proposed set failed because it was too
big. However, I'm confident that people can make a pretty good guess as to what
it actually contained based on the summary below. :-)

/Bruce

> 
>  GNUmakefile|2 +-
>  lib/Makefile   |   14 -
>  lib/librte_eal/linuxapp/eal/Makefile   |8 +-
>  lib/librte_pmd_af_packet/Makefile  |   64 -
>  lib/librte_pmd_af_packet/rte_eth_af_packet.c   |  847 ---
>  lib/librte_pmd_af_packet/rte_eth_af_packet.h   |   53 -
>  .../rte_pmd_af_packet_version.map  |7 -
>  lib/librte_pmd_bond/Makefile   |   68 -
>  lib/librte_pmd_bond/rte_eth_bond.h |  366 --
>  lib/librte_pmd_bond/rte_eth_bond_8023ad.c  | 1216 -
>  lib/librte_pmd_bond/rte_eth_bond_8023ad.h  |  222 -
>  lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h  |  308 --
>  lib/librte_pmd_bond/rte_eth_bond_alb.c |  287 -
>  lib/librte_pmd_bond/rte_eth_bond_alb.h |  142 -
>  lib/librte_pmd_bond/rte_eth_bond_api.c |  840 ---
>  lib/librte_pmd_bond/rte_eth_bond_args.c|  278 -
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c | 2269 
>  lib/librte_pmd_bond/rte_eth_bond_private.h |  287 -
>  lib/librte_pmd_bond/rte_eth_bond_version.map   |   22 -
>  lib/librte_pmd_e1000/Makefile  |   99 -
>  lib/librte_pmd_e1000/e1000/README  |   39 -
>  lib/librte_pmd_e1000/e1000/e1000_80003es2lan.c | 1514 --
>  lib/librte_pmd_e1000/e1000/e1000_80003es2lan.h |  100 -
>  lib/librte_pmd_e1000/e1000/e1000_82540.c   |  717 ---
>  lib/librte_pmd_e1000/e1000/e1000_82541.c   | 1268 -
>  lib/librte_pmd_e1000/e1000/e1000_82541.h   |   91 -
>  lib/librte_pmd_e1000/e1000/e1000_82542.c   |  588 --
>  lib/librte_pmd_e1000/e1000/e1000_82543.c   | 1553 --
>  lib/librte_pmd_e1000/e1000/e1000_82543.h   |   56 -
>  lib/librte_pmd_e1000/e1000/e1000_82571.c   | 2026 ---
>  lib/librte_pmd_e1000/e1000/e1000_82571.h   |   65 -
>  lib/librte_pmd_e1000/e1000/e1000_82575.c   | 3639 -
>  lib/librte_pmd_e1000/e1000/e1000_82575.h   |  520 --
>  lib/librte_pmd_e1000/e1000/e1000_api.c | 1357 -
>  lib/librte_pmd_e1000/e1000/e1000_api.h |  167 -
>  lib/librte_pmd_e1000/e1000/e1000_defines.h | 1498 -
>  lib/librte_pmd_e1000/e1000/e1000_hw.h  | 1026 
>  lib/librte_pmd_e1000/e1000/e1000_i210.c| 1000 
>  lib/librte_pmd_e1000/e1000/e1000_i210.h|  110 -
>  lib/librte_pmd_e1000/e1000/e1000_ich8lan.c | 5260 --
>  lib/librte_pmd_e1000/e1000/e1000_ich8lan.h |  313 --
>  lib/librte_pmd_e1000/e1000/e1000_mac.c | 2247 
>  lib/librte_pmd_e1000/e1000/e1000_mac.h |   95 -
>  lib/librte_pmd_e1000/e1000/e1000_manage.c  |  573 --
>  lib/librte_pmd_e1000/e1000/e1000_manage.h  |   95 -
>  lib/librte_pmd_e1000/e1000/e1000_mbx.c |  777 ---
>  lib/librte_pmd_e1000/e1000/e1000_mbx.h |  105 -
>  lib/librte_pmd_e1000/e1000/e1000_nvm.c | 1377 -
>  lib/librte_pmd_e1000/e1000/e1000_nvm.h |   98 -
>  lib/librte_pmd_e1000/e1000/e1000_osdep.c   |   83 -
>  lib/librte_pmd_e1000/e1000/e1000_osdep.h   |  183 -
>  lib/librte_pmd_e1000/e1000/e1000_phy.c | 4273 ---
>  lib/librte_pmd_e1000/e1000/e1000_phy.h |  327 --
>  lib/librte_pmd_e1000/e1000/e1000_regs.h|  685 ---
>  lib/librte_pmd_e1000/e1000/e1000_vf.c  |  586 --
>  lib/librte_pmd_e1000/e1000/e1000_vf.h  |  295 -
>  lib/librte_pmd_e1000/e1000_ethdev.h|  340 --
>  lib/librte_pmd_e1000/e1000_logs.h  |   78 -
>  lib/librte_pmd_e1000/em_ethdev.c   | 1530 --
>  lib/librte_pmd_e1000/em_rxtx.c | 1865 ---
>  lib/librte_pmd_e1000/igb_ethdev.c  | 3656 -
>  lib/librte_pmd_e1000/igb_pf.c  |  511 --
>  lib/librte_pmd_e1000/igb_rxtx.c| 2397 
>  lib/librte_pmd_e1000/rte_pmd_e1000_version.map |4 -
>  lib/librte_pmd_enic/LICENSE|   27 -
>

[dpdk-dev] Issue for storing the payload on vHost-testpmd

2015-05-07 Thread Cheng Kevin

Hi all,

 Recently, I modified the vHost app, testpmd.
 The modification includes the following steps:
 1. Obtain the payload of each packet.
 2. Collect the information that I need, e.g. the URL.
 3. Store the URL in a file on disk, e.g. payload.txt.

 The first two steps were easy to achieve.
 However, step 3 is now a problem. I use fwrite() to write the file,
 and fwrite() ends up occupying all of the free memory in the VM.
 In the end, testpmd is terminated by the OS.

 Is there any way to avoid this situation?
 Any advice will be appreciated.

Best Regards,
kevin

[dpdk-dev] [RFC PATCH 1/2] pmds: Use relative rather than absolute paths

2015-05-07 Thread Bruce Richardson

In the Makefiles for the PMDs, the paths to the files are often
specified using the full path from the $(RTE_SDK) variable. These paths
can be shortened, and made more flexible in case of a future path
change, by specifying them relative to $(SRCDIR) instead.

Signed-off-by: Bruce Richardson 
---
 lib/librte_pmd_e1000/Makefile   | 4 ++--
 lib/librte_pmd_enic/Makefile| 6 +++---
 lib/librte_pmd_fm10k/Makefile   | 4 ++--
 lib/librte_pmd_i40e/Makefile| 4 ++--
 lib/librte_pmd_ixgbe/Makefile   | 4 ++--
 lib/librte_pmd_vmxnet3/Makefile | 2 +-
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/librte_pmd_e1000/Makefile b/lib/librte_pmd_e1000/Makefile
index 8c8fed8..158bc81 100644
--- a/lib/librte_pmd_e1000/Makefile
+++ b/lib/librte_pmd_e1000/Makefile
@@ -60,10 +60,10 @@ endif
 # Add extra flags for base driver files (also known as shared code)
 # to disable warnings in them
 #
-BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_e1000/e1000/*.c)))
+BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/e1000/*.c)))
 $(foreach obj, $(BASE_DRIVER_OBJS), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))

-VPATH += $(RTE_SDK)/lib/librte_pmd_e1000/e1000
+VPATH += $(SRCDIR)/e1000

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pmd_enic/Makefile b/lib/librte_pmd_enic/Makefile
index 251a898..bfc0994 100644
--- a/lib/librte_pmd_enic/Makefile
+++ b/lib/librte_pmd_enic/Makefile
@@ -41,12 +41,12 @@ EXPORT_MAP := rte_pmd_enic_version.map

 LIBABIVER := 1

-CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_enic/vnic/
-CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_enic/
+CFLAGS += -I$(SRCDIR)/vnic/
+CFLAGS += -I$(SRCDIR)
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS) -Wno-strict-aliasing

-VPATH += $(RTE_SDK)/lib/librte_pmd_enic/src
+VPATH += $(SRCDIR)/src

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pmd_fm10k/Makefile b/lib/librte_pmd_fm10k/Makefile
index 7516d37..7395933 100644
--- a/lib/librte_pmd_fm10k/Makefile
+++ b/lib/librte_pmd_fm10k/Makefile
@@ -76,10 +76,10 @@ endif
 #
 # Add extra flags for base driver source files to disable warnings in them
 #
-BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_fm10k/base/*.c)))
+BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c)))
 $(foreach obj, $(BASE_DRIVER_OBJS), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))

-VPATH += $(RTE_SDK)/lib/librte_pmd_fm10k/base
+VPATH += $(SRCDIR)/base

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 64bab16..050dd44 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -75,10 +75,10 @@ endif

 CFLAGS_i40e_lan_hmc.o += -Wno-error
 endif
-OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_i40e/i40e/*.c)))
+OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/i40e/*.c)))
 $(foreach obj, $(OBJS_BASE_DRIVER), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))

-VPATH += $(RTE_SDK)/lib/librte_pmd_i40e/i40e
+VPATH += $(SRCDIR)/i40e

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pmd_ixgbe/Makefile b/lib/librte_pmd_ixgbe/Makefile
index fbf6966..e0f8916 100644
--- a/lib/librte_pmd_ixgbe/Makefile
+++ b/lib/librte_pmd_ixgbe/Makefile
@@ -82,10 +82,10 @@ endif
 # Add extra flags for base driver files (also known as shared code)
 # to disable warnings in them
 #
-BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_ixgbe/ixgbe/*.c)))
+BASE_DRIVER_OBJS=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/ixgbe/*.c)))
 $(foreach obj, $(BASE_DRIVER_OBJS), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))

-VPATH += $(RTE_SDK)/lib/librte_pmd_ixgbe/ixgbe
+VPATH += $(SRCDIR)/ixgbe

 #
 # all source are stored in SRCS-y
diff --git a/lib/librte_pmd_vmxnet3/Makefile b/lib/librte_pmd_vmxnet3/Makefile
index fc616c4..48177a3 100644
--- a/lib/librte_pmd_vmxnet3/Makefile
+++ b/lib/librte_pmd_vmxnet3/Makefile
@@ -64,7 +64,7 @@ CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -Wno-format-extra-args

 endif

-VPATH += $(RTE_SDK)/lib/librte_pmd_vmxnet3/vmxnet3
+VPATH += $(SRCDIR)/vmxnet3

 EXPORT_MAP := rte_pmd_vmxnet3_version.map

-- 
2.1.0

[dpdk-dev] [RFC PATCH 0/2] Move PMDs out of lib directory

2015-05-07 Thread Bruce Richardson

The "lib" directory is getting very crowded, with both general libs and 
poll mode drivers in it. This patch set proposes to move the PMDs out of the
lib folder and to put them in a separate "pmds" folder. This should help
with code browse-ability as the number of libs and PMDs increases.

Comments or objections?

Bruce Richardson (2):
  pmds: Use relative rather than absolute paths
  pmds: move pmds from lib to separate pmd dir

 GNUmakefile|2 +-
 lib/Makefile   |   14 -
 lib/librte_eal/linuxapp/eal/Makefile   |8 +-
 lib/librte_pmd_af_packet/Makefile  |   64 -
 lib/librte_pmd_af_packet/rte_eth_af_packet.c   |  847 ---
 lib/librte_pmd_af_packet/rte_eth_af_packet.h   |   53 -
 .../rte_pmd_af_packet_version.map  |7 -
 lib/librte_pmd_bond/Makefile   |   68 -
 lib/librte_pmd_bond/rte_eth_bond.h |  366 --
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c  | 1216 -
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h  |  222 -
 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h  |  308 --
 lib/librte_pmd_bond/rte_eth_bond_alb.c |  287 -
 lib/librte_pmd_bond/rte_eth_bond_alb.h |  142 -
 lib/librte_pmd_bond/rte_eth_bond_api.c |  840 ---
 lib/librte_pmd_bond/rte_eth_bond_args.c|  278 -
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 2269 
 lib/librte_pmd_bond/rte_eth_bond_private.h |  287 -
 lib/librte_pmd_bond/rte_eth_bond_version.map   |   22 -
 lib/librte_pmd_e1000/Makefile  |   99 -
 lib/librte_pmd_e1000/e1000/README  |   39 -
 lib/librte_pmd_e1000/e1000/e1000_80003es2lan.c | 1514 --
 lib/librte_pmd_e1000/e1000/e1000_80003es2lan.h |  100 -
 lib/librte_pmd_e1000/e1000/e1000_82540.c   |  717 ---
 lib/librte_pmd_e1000/e1000/e1000_82541.c   | 1268 -
 lib/librte_pmd_e1000/e1000/e1000_82541.h   |   91 -
 lib/librte_pmd_e1000/e1000/e1000_82542.c   |  588 --
 lib/librte_pmd_e1000/e1000/e1000_82543.c   | 1553 --
 lib/librte_pmd_e1000/e1000/e1000_82543.h   |   56 -
 lib/librte_pmd_e1000/e1000/e1000_82571.c   | 2026 ---
 lib/librte_pmd_e1000/e1000/e1000_82571.h   |   65 -
 lib/librte_pmd_e1000/e1000/e1000_82575.c   | 3639 -
 lib/librte_pmd_e1000/e1000/e1000_82575.h   |  520 --
 lib/librte_pmd_e1000/e1000/e1000_api.c | 1357 -
 lib/librte_pmd_e1000/e1000/e1000_api.h |  167 -
 lib/librte_pmd_e1000/e1000/e1000_defines.h | 1498 -
 lib/librte_pmd_e1000/e1000/e1000_hw.h  | 1026 
 lib/librte_pmd_e1000/e1000/e1000_i210.c| 1000 
 lib/librte_pmd_e1000/e1000/e1000_i210.h|  110 -
 lib/librte_pmd_e1000/e1000/e1000_ich8lan.c | 5260 --
 lib/librte_pmd_e1000/e1000/e1000_ich8lan.h |  313 --
 lib/librte_pmd_e1000/e1000/e1000_mac.c | 2247 
 lib/librte_pmd_e1000/e1000/e1000_mac.h |   95 -
 lib/librte_pmd_e1000/e1000/e1000_manage.c  |  573 --
 lib/librte_pmd_e1000/e1000/e1000_manage.h  |   95 -
 lib/librte_pmd_e1000/e1000/e1000_mbx.c |  777 ---
 lib/librte_pmd_e1000/e1000/e1000_mbx.h |  105 -
 lib/librte_pmd_e1000/e1000/e1000_nvm.c | 1377 -
 lib/librte_pmd_e1000/e1000/e1000_nvm.h |   98 -
 lib/librte_pmd_e1000/e1000/e1000_osdep.c   |   83 -
 lib/librte_pmd_e1000/e1000/e1000_osdep.h   |  183 -
 lib/librte_pmd_e1000/e1000/e1000_phy.c | 4273 ---
 lib/librte_pmd_e1000/e1000/e1000_phy.h |  327 --
 lib/librte_pmd_e1000/e1000/e1000_regs.h|  685 ---
 lib/librte_pmd_e1000/e1000/e1000_vf.c  |  586 --
 lib/librte_pmd_e1000/e1000/e1000_vf.h  |  295 -
 lib/librte_pmd_e1000/e1000_ethdev.h|  340 --
 lib/librte_pmd_e1000/e1000_logs.h  |   78 -
 lib/librte_pmd_e1000/em_ethdev.c   | 1530 --
 lib/librte_pmd_e1000/em_rxtx.c | 1865 ---
 lib/librte_pmd_e1000/igb_ethdev.c  | 3656 -
 lib/librte_pmd_e1000/igb_pf.c  |  511 --
 lib/librte_pmd_e1000/igb_rxtx.c| 2397 
 lib/librte_pmd_e1000/rte_pmd_e1000_version.map |4 -
 lib/librte_pmd_enic/LICENSE|   27 -
 lib/librte_pmd_enic/Makefile   |   71 -
 lib/librte_pmd_enic/enic.h |  200 -
 lib/librte_pmd_enic/enic_clsf.c|  259 -
 lib/librte_pmd_enic/enic_compat.h  |  147 -
 lib/librte_pmd_enic/enic_ethdev.c  |  640 ---
 lib/librte_pmd_enic/enic_main.c| 1117 
 lib/librte_pmd_enic/enic_res.c |  219 -

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Ivan Boule

Hi Avi,

On 05/07/2015 04:02 PM, Avi Kivity wrote:
> On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim 
> wrote:
>
>> Does anybody have any input or comments on this?
>>
>>
>>> -----Original Message-----
>>> From: O'Driscoll, Tim
>>> Sent: Thursday, April 16, 2015 11:39 AM
>>> To: dev at dpdk.org
>>> Subject: Beyond DPDK 2.0
>>>
>>> Following the launch of DPDK by Intel as an internal development
>>> project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK RPM
>>> packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
>>> prepare for future releases after DPDK 2.0 by starting a discussion on
>>> its evolution. Anyone is welcome to join this initiative.
>>>
>>> Since then, the project has grown significantly:
>>> -The number of commits and mailing list posts has increased
>>> steadily.
>>> -Support has been added for a wide range of new NICs (Mellanox
>>> support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
>>> -DPDK is now supported on multiple architectures (IBM Power support
>>> in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
>>> applied).
>>>
>>> While this is great progress, we need to make sure that the project is
>>> structured in a way that enables it to continue to grow. To achieve
>>> this, 6WIND, Red Hat and Intel would like to start a discussion about
>>> the future of the project, so that we can agree and establish processes
>>> that satisfy the needs of the current and future DPDK community.
>>>
>>> We're very interested in hearing the views of everybody in the
>>> community. In addition to debate on the mailing list, we'll also
>>> schedule community calls to discuss this.
>>>
>>>
>>> Project Goals
>>> -------------
>>>
>>> Some topics to be considered for the DPDK project include:
>>> -Project Charter: The charter of the DPDK project should be clearly
>>> defined, and should explain the limits of DPDK (what it does and does
>>> not cover). This does not mean that we would be stuck with a singular
>>> charter for all time, but the direction and intent of the project should
>>> be well understood.
>>
>
>
> One problem we've seen with dpdk is that it is a framework, not a library:
> it wants to create threads, manage memory, and generally take over.  This
> is a problem for us, as we are writing a framework (seastar, [1]) and need
> to create threads, manage memory, and generally take over ourselves.
>
> Perhaps dpdk can be split into two layers, a library layer that only
> provides mechanisms, and a framework layer that glues together those
> mechanisms and applies a policy, trading in generality for ease of use.
>
> [1] http://seastar-project.org
>
I fully agree with this analysis/proposal.
And I think that:
- the associated modifications should be done ASAP,
- the underlying design rules that this proposal refers to should drive 
the design and review of new DPDK features.

Regards,
Ivan

[dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.

2015-05-07 Thread Ananyev, Konstantin


Hi Dong,

> -----Original Message-----
> From: Wang Dong [mailto:dong.wang.pro at hotmail.com]
> Sent: Thursday, May 07, 2015 4:28 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for 
> IA processor's rte_wmb/rte_rmb.
> 
> Hi Konstantin,
> 
> > Hi Dong,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of WangDong
> >> Sent: Tuesday, May 05, 2015 4:38 PM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for 
> >> IA processor's rte_wmb/rte_rmb.
> >>
> >> The current implementation of rte_wmb/rte_rmb for x86 uses a processor
> >> memory barrier. That is unnecessary on IA processors; a compiler
> >> memory barrier is enough.
> >
> > I wouldn't say they are 'unnecessary'.
> > There are situations, even on IA, when you need _fence_ instructions.
> > So, please leave rte_*mb() macros unmodified.
> OK, leave them unmodified, but I really can't find a situation that needs
> the sfence and lfence instructions.

For example:
http://bartoszmilewski.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
http://dpdk.org/ml/archives/dev/2014-May/002613.html

> 
> 
> > I still think that we need to create a new set of architecture-dependent
> > macros, as discussed before.
> > Probably, by analogy with the Linux kernel, rte_smp_*mb() is a good name for them.
> > Though if you have a better name in mind, I am open to suggestions here.
> What about rte_dma_*mb()? I found dma_*mb() in linux-4.0.1, and it looks good.

Hmm, but why _dma_?
We need the same thing for multi-core communication too.
If rte_smp_ is not good enough, maybe rte_arch_?

> 
> >
> >> But if DPDK is running on an AMD processor, maybe we should use a processor
> >> memory barrier.
> >
> > As far as I remember, AMD has the same memory ordering model.
> It's too hard to find AMD's software developer manual.

This one, for example?
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/24593_APM_v21.pdf

Konstantin

> 
> Dong
> 
> > So, I don't think we need  #ifdef RTE_ARCH_X86_IA here.
> >
> > Konstantin
> >
> >> I added a macro to distinguish them: if we compile DPDK for an IA
> >> processor, defining the macro (RTE_ARCH_X86_IA) can improve performance
> >> by using a compiler memory barrier. Alternatively, we could add
> >> RTE_ARCH_X86_AMD to select the processor memory barrier; in that case,
> >> if the macro is not defined, memory ordering will not be guaranteed.
> >> Which macro is better?
> >> If this patch is applied, the PMDs' old implementation of a compiler
> >> memory barrier (some volatile variables) can be fixed by using
> >> rte_rmb() and rte_wmb() on any architecture.
> >>
> >> ---
> >>   lib/librte_eal/common/include/arch/x86/rte_atomic.h | 10 ++
> >>   1 file changed, 10 insertions(+)
> >>
> >> diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> index e93e8ee..52b1e81 100644
> >> --- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
> >> @@ -49,10 +49,20 @@ extern "C" {
> >>
> >>   #define  rte_mb() _mm_mfence()
> >>
> >> +#ifdef RTE_ARCH_X86_IA
> >> +
> >> +#define rte_wmb() rte_compiler_barrier()
> >> +
> >> +#define rte_rmb() rte_compiler_barrier()
> >> +
> >> +#else
> >> +
> >>   #define  rte_wmb() _mm_sfence()
> >>
> >>   #define  rte_rmb() _mm_lfence()
> >>
> >> +#endif
> >> +
> >>   /*------------------------- 16 bit atomic operations -------------------------*/
> >>
> >>   #ifndef RTE_FORCE_INTRINSICS
> >> --
> >> 1.9.1
> >

[dpdk-dev] [PATCH 3/4] bnx2x: new poll mode driver (part2)

2015-05-07 Thread Stephen Hemminger

From: Sergey Kreys 

Add driver for the Broadcom NetXtremeII 10 gigabit devices.

Signed-off-by: Stephen Hemminger 

---
v3 - this is the same as the previous bcm driver, just renamed
v4 - split into two pieces for review.
---
 lib/librte_pmd_bnx2x/debug.c  |   113 +
 lib/librte_pmd_bnx2x/ecore_fw_defs.h  |   422 +
 lib/librte_pmd_bnx2x/ecore_hsi.h  |  6348 +++
 lib/librte_pmd_bnx2x/ecore_init.h |   841 ++
 lib/librte_pmd_bnx2x/ecore_init_ops.h |   886 +++
 lib/librte_pmd_bnx2x/ecore_mfw_req.h  |   206 +
 lib/librte_pmd_bnx2x/ecore_reg.h  |  3663 +
 lib/librte_pmd_bnx2x/ecore_sp.c   |  5455 +
 lib/librte_pmd_bnx2x/ecore_sp.h   |  1795 +
 lib/librte_pmd_bnx2x/elink.c  | 13378 
 lib/librte_pmd_bnx2x/elink.h  |   609 ++
 11 files changed, 33716 insertions(+)
 create mode 100644 lib/librte_pmd_bnx2x/debug.c
 create mode 100644 lib/librte_pmd_bnx2x/ecore_fw_defs.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_hsi.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_init.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_init_ops.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_mfw_req.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_reg.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_sp.c
 create mode 100644 lib/librte_pmd_bnx2x/ecore_sp.h
 create mode 100644 lib/librte_pmd_bnx2x/elink.c
 create mode 100644 lib/librte_pmd_bnx2x/elink.h

diff --git a/lib/librte_pmd_bnx2x/debug.c b/lib/librte_pmd_bnx2x/debug.c
new file mode 100644
index 0000000..ba51f1a
--- /dev/null
+++ b/lib/librte_pmd_bnx2x/debug.c
@@ -0,0 +1,113 @@
+/*-
+ * Copyright (c) 2007-2013 Broadcom Corporation. All rights reserved.
+ *
+ * Eric Davis
+ * David Christensen 
+ * Gary Zambrano 
+ *
+ * Copyright (c) 2013-2015 Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of Broadcom Corporation nor the name of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written consent.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS'
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "bnx2x.h"
+
+
+/*
+ * Debug versions of the 8/16/32 bit OS register read/write functions to
+ * capture/display values read/written from/to the controller.
+ */
+void
+bnx2x_reg_write8(struct bnx2x_softc *sc, size_t offset, uint8_t val)
+{
+   PMD_DRV_LOG(DEBUG, "offset=0x%08lx val=0x%02x", offset, val);
+   *((volatile uint8_t*)((uint64_t)sc->bar[BAR0].base_addr + offset)) = val;
+}
+
+void
+bnx2x_reg_write16(struct bnx2x_softc *sc, size_t offset, uint16_t val)
+{
+   if ((offset % 2) != 0) {
+   PMD_DRV_LOG(DEBUG, "Unaligned 16-bit write to 0x%08lx", offset);
+   }
+
+   PMD_DRV_LOG(DEBUG, "offset=0x%08lx val=0x%04x", offset, val);
+   *((volatile uint16_t*)((uint64_t)sc->bar[BAR0].base_addr + offset)) = val;
+}
+
+void
+bnx2x_reg_write32(struct bnx2x_softc *sc, size_t offset, uint32_t val)
+{
+   if ((offset % 4) != 0) {
+   PMD_DRV_LOG(DEBUG, "Unaligned 32-bit write to 0x%08lx", offset);
+   }
+
+   PMD_DRV_LOG(DEBUG, "offset=0x%08lx val=0x%08x", offset, val);
+   *((volatile uint32_t*)((uint64_t)sc->bar[BAR0].base_addr + offset)) = val;
+}
+
+uint8_t
+bnx2x_reg_read8(struct bnx2x_softc *sc, size_t offset)
+{
+   uint8_t val;
+
+   val = (uint8_t)(*((volatile uint8_t*)((uint64_t)sc->bar[BAR0].base_addr + offset)));
+   PMD_DRV_LOG(DEBUG, "offset=0x%08lx val=0x%02x", offset, val);
+
+   return (val);
+}
+
+uint16_t
+bnx2x_reg_read16(struct bnx2x_softc *sc, size_t offset)
+{
+   uint16_t val;
+
+   if ((offset % 2) != 0) {
+
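
The register accessors in this file follow a common memory-mapped I/O pattern: add an offset to the mapped BAR base and access the result through a volatile pointer. A minimal standalone sketch of that pattern follows. It makes some assumptions that differ from the patch: `sketch_bar` and the `sketch_reg_*` functions are hypothetical names, the base is converted through `uintptr_t` rather than `uint64_t`, and unaligned or out-of-range offsets are rejected instead of only being logged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapped BAR: base address plus mapped length. */
struct sketch_bar {
    void  *base;
    size_t len;
};

/* Returns 1 if a 32-bit access at offset is aligned and inside the BAR. */
static int sketch_offset_ok32(const struct sketch_bar *bar, size_t offset)
{
    if ((offset % sizeof(uint32_t)) != 0)
        return 0;
    if (bar->len < sizeof(uint32_t) || offset > bar->len - sizeof(uint32_t))
        return 0;
    return 1;
}

/* 32-bit register write; returns 0 on success, -1 on a bad offset. */
static int sketch_reg_write32(struct sketch_bar *bar, size_t offset, uint32_t val)
{
    assert(bar != NULL && bar->base != NULL);
    if (!sketch_offset_ok32(bar, offset))
        return -1;
    /* volatile keeps the compiler from eliding or merging the access */
    *(volatile uint32_t *)((uintptr_t)bar->base + offset) = val;
    return 0;
}

/* 32-bit register read; returns 0 on success, -1 on a bad offset. */
static int sketch_reg_read32(const struct sketch_bar *bar, size_t offset, uint32_t *val)
{
    assert(bar != NULL && bar->base != NULL && val != NULL);
    if (!sketch_offset_ok32(bar, offset))
        return -1;
    *val = *(volatile uint32_t *)((uintptr_t)bar->base + offset);
    return 0;
}
```

Because the helpers only need a base pointer and a length, they can be exercised against an ordinary array in a unit test before being pointed at real device memory.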

[dpdk-dev] [PATCH 2/4] bnx2x: new poll mode driver (part1)

2015-05-07 Thread Stephen Hemminger

From: Sergey Kreys 

Add driver for the Broadcom NetXtremeII 10 gigabit devices.

Signed-off-by: Stephen Hemminger 

---
v3 - this is the same as the previous bcm driver, just renamed
v4 - split into two pieces for review.
---
 lib/librte_pmd_bnx2x/Makefile   |28 +
 lib/librte_pmd_bnx2x/bnx2x.c| 11816 ++
 lib/librte_pmd_bnx2x/bnx2x.h|  1998 ++
 lib/librte_pmd_bnx2x/bnx2x_ethdev.c |   542 ++
 lib/librte_pmd_bnx2x/bnx2x_ethdev.h |79 +
 lib/librte_pmd_bnx2x/bnx2x_logs.h   |51 +
 lib/librte_pmd_bnx2x/bnx2x_rxtx.c   |   487 ++
 lib/librte_pmd_bnx2x/bnx2x_rxtx.h   |85 +
 lib/librte_pmd_bnx2x/bnx2x_stats.c  |  1619 +
 lib/librte_pmd_bnx2x/bnx2x_stats.h  |   632 ++
 lib/librte_pmd_bnx2x/bnx2x_vfpf.c   |   597 ++
 lib/librte_pmd_bnx2x/bnx2x_vfpf.h   |   315 +
 12 files changed, 18249 insertions(+)
 create mode 100644 lib/librte_pmd_bnx2x/Makefile
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_ethdev.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_ethdev.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_logs.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_rxtx.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_rxtx.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_stats.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_stats.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_vfpf.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_vfpf.h

diff --git a/lib/librte_pmd_bnx2x/Makefile b/lib/librte_pmd_bnx2x/Makefile
new file mode 100644
index 0000000..0de5db9
--- /dev/null
+++ b/lib/librte_pmd_bnx2x/Makefile
@@ -0,0 +1,28 @@
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_bnx2x.a
+
+CFLAGS += -O3 -g
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DZLIB_CONST
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x_rxtx.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x_stats.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += ecore_sp.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += elink.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += bnx2x_vfpf.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNX2X_DEBUG) += debug.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += lib/librte_eal lib/librte_ether lib/librte_hash
+DEPDIRS-$(CONFIG_RTE_LIBRTE_BNX2X_PMD) += lib/librte_mempool lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_pmd_bnx2x/bnx2x.c b/lib/librte_pmd_bnx2x/bnx2x.c
new file mode 100644
index 0000000..c7f898e
--- /dev/null
+++ b/lib/librte_pmd_bnx2x/bnx2x.c
@@ -0,0 +1,11816 @@
+/*-
+ * Copyright (c) 2007-2013 Broadcom Corporation. All rights reserved.
+ *
+ * Eric Davis
+ * David Christensen 
+ * Gary Zambrano 
+ *
+ * Copyright (c) 2013-2015 Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of Broadcom Corporation nor the name of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written consent.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS'
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define BNX2X_DRIVER_VERSION "1.78.18"
+
+#include "bnx2x.h"
+#include "bnx2x_vfpf.h"
+#include "ecore_sp.h"
+#include "ecore_init.h"
+#include "ecore_init_ops.h"
+
+#include "rte_pci_dev_ids.h"
+
+#include 
+#include 
+#include 
+#include 
+
+static z_stream zlib_stream;
+
+#define EVL_VLID_MASK 0x0FFF
+
+#define BNX2X_DEF_SB_ATT_IDX 0x0001
+#define BNX2X_DEF_SB_IDX 0x0002
+
+/*
+ * FLR Support - bnx2x_pf_flr_clnup() is called during nic_load in the per
+ *

[dpdk-dev] [PATCH 1/4] pci: allow access to PCI config space

2015-05-07 Thread Stephen Hemminger

From: Stephen Hemminger 

Some drivers need the ability to access PCI config space (for example,
for power management). This adds an abstraction to do this; it is only
implemented on Linux, but should be possible on BSD.

Signed-off-by: Stephen Hemminger 
---
 lib/librte_eal/common/include/rte_pci.h | 28 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 48 +
 lib/librte_eal/linuxapp/eal/eal_pci_init.h  | 11 ++
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c   | 14 
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c  | 16 +
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  2 ++
 6 files changed, 119 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 223d3cd..cea982a 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -393,6 +393,34 @@ void rte_eal_pci_register(struct rte_pci_driver *driver);
  */
 void rte_eal_pci_unregister(struct rte_pci_driver *driver);

+/**
+ * Read PCI config space.
+ *
+ * @param device
+ *   A pointer to a rte_pci_device structure describing the device
+ *   to use
+ * @param buf
+ *   A data buffer where the bytes should be read into
+ * @param len
+ *   The length of the data buffer.
+ */
+int rte_eal_pci_read_config(const struct rte_pci_device *device,
+   void *buf, size_t len, off_t offset);
+
+/**
+ * Write PCI config space.
+ *
+ * @param device
+ *   A pointer to a rte_pci_device structure describing the device
+ *   to use
+ * @param buf
+ *   A data buffer containing the bytes that should be written
+ * @param len
+ *   The length of the data buffer.
+ */
+int rte_eal_pci_write_config(const struct rte_pci_device *device,
+const void *buf, size_t len, off_t offset);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index d2adc66..6d79a08 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -756,6 +756,54 @@ rte_eal_pci_close_one_driver(struct rte_pci_driver *dr __rte_unused,
 }
 #endif /* RTE_LIBRTE_EAL_HOTPLUG */

+/* Read PCI config space. */
+int rte_eal_pci_read_config(const struct rte_pci_device *device,
+   void *buf, size_t len, off_t offset)
+{
+   const struct rte_intr_handle *intr_handle = &device->intr_handle;
+
+   switch (intr_handle->type) {
+   case RTE_INTR_HANDLE_UIO:
+   return pci_uio_read_config(intr_handle, buf, len, offset);
+
+#ifdef VFIO_PRESENT
+   case RTE_INTR_HANDLE_VFIO_MSIX:
+   case RTE_INTR_HANDLE_VFIO_MSI:
+   case RTE_INTR_HANDLE_VFIO_LEGACY:
+   return pci_vfio_read_config(intr_handle, buf, len, offset);
+#endif
+   default:
+   RTE_LOG(ERR, EAL,
+   "Unknown handle type of fd %d\n",
+   intr_handle->fd);
+   return -1;
+   }
+}
+
+/* Write PCI config space. */
+int rte_eal_pci_write_config(const struct rte_pci_device *device,
+const void *buf, size_t len, off_t offset)
+{
+   const struct rte_intr_handle *intr_handle = &device->intr_handle;
+
+   switch (intr_handle->type) {
+   case RTE_INTR_HANDLE_UIO:
+   return pci_uio_write_config(intr_handle, buf, len, offset);
+
+#ifdef VFIO_PRESENT
+   case RTE_INTR_HANDLE_VFIO_MSIX:
+   case RTE_INTR_HANDLE_VFIO_MSI:
+   case RTE_INTR_HANDLE_VFIO_LEGACY:
+   return pci_vfio_write_config(intr_handle, buf, len, offset);
+#endif
+   default:
+   RTE_LOG(ERR, EAL,
+   "Unknown handle type of fd %d\n",
+   intr_handle->fd);
+   return -1;
+   }
+}
+
 /* Init the PCI EAL subsystem */
 int
 rte_eal_pci_init(void)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index aa7b755..c28e5b0 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -68,6 +68,11 @@ void *pci_find_max_end_va(void);
 void *pci_map_resource(void *requested_addr, int fd, off_t offset,
   size_t size, int additional_flags);

+int pci_uio_read_config(const struct rte_intr_handle *intr_handle,
+   void *buf, size_t len, off_t offs);
+int pci_uio_write_config(const struct rte_intr_handle *intr_handle,
+const void *buf, size_t len, off_t offs);
+
 /* map IGB_UIO resource prototype */
 int pci_uio_map_resource(struct rte_pci_device *dev);

@@ -86,6 +91,12 @@ int pci_vfio_enable(void);
 int pci_vfio_is_enabled(void);
 int pci_vfio_mp_sync_setup(void);

+/* access config space */
+int pci_vfio_read_config(const struct rte_intr_handle *intr_handle,
+void *buf, size_t len, off_t offs);
+int
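
The read path added by this patch takes a buffer, a length and an offset into config space. As a rough standalone illustration of that shape, the sketch below reads the 16-bit vendor ID at offset 0x00. It assumes config space is reachable as a seekable file, for example the `config` attribute that Linux exposes under /sys/bus/pci/devices/ for each device. All `sketch_*` names are hypothetical and are not part of the patch, whose UIO and VFIO back ends are not shown in this excerpt.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a config-space file read-only; -1 on error. */
static int sketch_open_config(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY);
}

/* Read exactly len bytes at offset; returns 0 on success, -1 otherwise. */
static int sketch_read_config(int cfg_fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    ssize_t n = pread(cfg_fd, buf, len, offset);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* The vendor ID is the little-endian 16-bit field at offset 0x00. */
static int sketch_read_vendor_id(int cfg_fd, uint16_t *vendor)
{
    uint8_t raw[2];

    assert(vendor != NULL);
    if (sketch_read_config(cfg_fd, raw, sizeof(raw), 0x00) != 0)
        return -1;
    *vendor = (uint16_t)(raw[0] | (raw[1] << 8));
    return 0;
}
```

Treating a short read as an error keeps the caller from acting on a partially filled buffer. A driver using the API from this patch would presumably want the same check on the return value of rte_eal_pci_read_config().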

[dpdk-dev] [PATCH 0/4 v4] bnx2x: new poll mode driver

2015-05-07 Thread Stephen Hemminger

Update to previous submission.
 - Split main source of driver from one monster patch into two large patches
 - Add map files for shared library support

Stephen Hemminger (4):
  pci: allow access to PCI config space
  bnx2x: new poll mode driver (part1)
  bnx2x: new poll mode driver (part2)
  bnx2x: enable BNX2X poll mode driver

 MAINTAINERS | 3 +
 config/common_linuxapp  |10 +
 lib/Makefile| 1 +
 lib/librte_eal/common/include/rte_pci.h |28 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h |30 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |48 +
 lib/librte_eal/linuxapp/eal/eal_pci_init.h  |11 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c   |14 +
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c  |16 +
 lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 +
 lib/librte_pmd_bnx2x/Makefile   |28 +
 lib/librte_pmd_bnx2x/bnx2x.c| 11816 +++
 lib/librte_pmd_bnx2x/bnx2x.h|  1998 
 lib/librte_pmd_bnx2x/bnx2x_ethdev.c |   542 +
 lib/librte_pmd_bnx2x/bnx2x_ethdev.h |79 +
 lib/librte_pmd_bnx2x/bnx2x_logs.h   |51 +
 lib/librte_pmd_bnx2x/bnx2x_rxtx.c   |   487 +
 lib/librte_pmd_bnx2x/bnx2x_rxtx.h   |85 +
 lib/librte_pmd_bnx2x/bnx2x_stats.c  |  1619 +++
 lib/librte_pmd_bnx2x/bnx2x_stats.h  |   632 +
 lib/librte_pmd_bnx2x/bnx2x_vfpf.c   |   597 +
 lib/librte_pmd_bnx2x/bnx2x_vfpf.h   |   315 +
 lib/librte_pmd_bnx2x/debug.c|   113 +
 lib/librte_pmd_bnx2x/ecore_fw_defs.h|   422 +
 lib/librte_pmd_bnx2x/ecore_hsi.h|  6348 ++
 lib/librte_pmd_bnx2x/ecore_init.h   |   841 ++
 lib/librte_pmd_bnx2x/ecore_init_ops.h   |   886 ++
 lib/librte_pmd_bnx2x/ecore_mfw_req.h|   206 +
 lib/librte_pmd_bnx2x/ecore_reg.h|  3663 ++
 lib/librte_pmd_bnx2x/ecore_sp.c |  5455 +
 lib/librte_pmd_bnx2x/ecore_sp.h |  1795 +++
 lib/librte_pmd_bnx2x/elink.c| 13378 ++
 lib/librte_pmd_bnx2x/elink.h|   609 +
 lib/librte_pmd_bnx2x/rte_pmd_bnx2x_version.map  | 4 +
 mk/rte.app.mk   | 8 +
 35 files changed, 52140 insertions(+)
 create mode 100644 lib/librte_pmd_bnx2x/Makefile
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_ethdev.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_ethdev.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_logs.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_rxtx.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_rxtx.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_stats.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_stats.h
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_vfpf.c
 create mode 100644 lib/librte_pmd_bnx2x/bnx2x_vfpf.h
 create mode 100644 lib/librte_pmd_bnx2x/debug.c
 create mode 100644 lib/librte_pmd_bnx2x/ecore_fw_defs.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_hsi.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_init.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_init_ops.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_mfw_req.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_reg.h
 create mode 100644 lib/librte_pmd_bnx2x/ecore_sp.c
 create mode 100644 lib/librte_pmd_bnx2x/ecore_sp.h
 create mode 100644 lib/librte_pmd_bnx2x/elink.c
 create mode 100644 lib/librte_pmd_bnx2x/elink.h
 create mode 100644 lib/librte_pmd_bnx2x/rte_pmd_bnx2x_version.map

-- 
2.1.4
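
For readers following the "pci: allow access to PCI config space" patch: the UIO and VFIO helpers it dispatches to are not shown in this excerpt. The sketch below only illustrates the general idea of an offset-based read of a config-space file with pread(). It assumes the config space is reachable as a plain file (on Linux, for example, the sysfs `config` attribute of a device); the function name `cfg_file_read` is hypothetical and is not part of the patch or of DPDK.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): read `len` bytes at `offset`
 * from a file that exposes PCI config space.
 * Returns 0 on a complete read, -1 on any error or short read.
 */
int
cfg_file_read(const char *path, void *buf, size_t len, off_t offset)
{
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* pread() takes an explicit offset and leaves the file position alone */
	n = pread(fd, buf, len, offset);
	close(fd);

	return (n == (ssize_t)len) ? 0 : -1;
}
```

Under these assumptions, reading the 16-bit vendor ID would mean passing the device's sysfs `config` path (the bus address depends on the system), a length of 2 and an offset of 0, since the vendor ID occupies the first two bytes of config space in little-endian order.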

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Wiles, Keith



On 5/7/15, 8:33 AM, "Avi Kivity"  wrote:

>On 05/07/2015 06:27 PM, Wiles, Keith wrote:
>>
>> On 5/7/15, 7:02 AM, "Avi Kivity"  wrote:
>>
>>> On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim
>>> 
>>> wrote:
>>>
 Does anybody have any input or comments on this?


> -Original Message-
> From: O'Driscoll, Tim
> Sent: Thursday, April 16, 2015 11:39 AM
> To: dev at dpdk.org
> Subject: Beyond DPDK 2.0
>
> Following the launch of DPDK by Intel as an internal development
> project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK
 RPM
> packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
> prepare for future releases after DPDK 2.0 by starting a discussion
>on
> its evolution. Anyone is welcome to join this initiative.
>
> Since then, the project has grown significantly:
> -The number of commits and mailing list posts has increased
> steadily.
> -Support has been added for a wide range of new NICs (Mellanox
> support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
> -DPDK is now supported on multiple architectures (IBM Power
 support
> in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
> applied).
>
> While this is great progress, we need to make sure that the project
>is
> structured in a way that enables it to continue to grow. To achieve
> this, 6WIND, Red Hat and Intel would like to start a discussion about
> the future of the project, so that we can agree and establish
 processes
> that satisfy the needs of the current and future DPDK community.
>
> We're very interested in hearing the views of everybody in the
> community. In addition to debate on the mailing list, we'll also
> schedule community calls to discuss this.
>
>
> Project Goals
> -
>
> Some topics to be considered for the DPDK project include:
> -Project Charter: The charter of the DPDK project should be
 clearly
> defined, and should explain the limits of DPDK (what it does and does
> not cover). This does not mean that we would be stuck with a singular
> charter for all time, but the direction and intent of the project
 should
> be well understood.
>>>
>>> One problem we've seen with dpdk is that it is a framework, not a
>>>library:
>>> it wants to create threads, manage memory, and generally take over.
>>>This
>>> is a problem for us, as we are writing a framework (seastar, [1]) and
>>>need
>>> to create threads, manage memory, and generally take over ourselves.
>>>
>>> Perhaps dpdk can be split into two layers, a library layer that only
>>> provides mechanisms, and a framework layer that glues together those
>>> mechanisms and applies a policy, trading in generality for ease of use.
>> The DPDK system is somewhat divided now between the EAL, PMDS and
>>utility
>> functions like malloc/rings/?
>>
>> The problem I see is that the PMDs need a framework to be usable, and the
>> EAL plus the ethdev layers provide that support today. Setting up and
>> initializing the DPDK system is pretty clean: just call the EAL init
>> routines along with the pool creates and the basic configs for the
>> PMDs/hardware. Once the system is inited, one can create new threads
>> without being required to use the DPDK launch routines. Maybe I am not
>> understanding your needs; can you explain more?
>
>An initialization routine that accepts argc/argv can hardly be called
>clean.

You want a config file or structure initialization design? If that is the
case you can contribute that support as another way to initialize DPDK.
>
>In seastar, we have our own malloc() (since seastar is sharded we can
>provide a faster thread-unsafe malloc implementation).  We also have our
>own threading, and since dpdk is an optional component in seastar, dpdk
>support requires code duplication.

DPDK relies on huge page support for allocation to get its performance; do
you not also require huge page support? The malloc system in DPDK can be
used as a replacement for the standard malloc if that works for your needs.
Also, after DPDK inits you can use your own malloc and any other tools you
want to use. I do not see a lot of duplicate code here, IMO. I guess if you
are installing into a very small memory system then yes, it could be a
problem, but DPDK was not designed to run in a system with limited memory.

>
>I would like to launch my own threads, pin them where I like, and call
>PMD drivers to send and receive packets.  Practically everything else
>that dpdk does gets in my way, including mbuf pools.  I'd much prefer to
>allocate mbufs myself.

You do not need to use the EAL's thread launching and can supply
your own, right?

Regards,
++Keith
>
>
>>> [1] http://seastar-project.org
>

[dpdk-dev] [PATCH 04/18] fm10k: add fm10k device id

2015-05-07 Thread David Marchand

On Thu, May 7, 2015 at 3:36 PM, Neil Horman  wrote:

> > I tried to reuse modinfo, but the problem is that kmod implementation is
> > checking the filename extension against .ko and .ko.gz.
> >
> Well, you can alter modinfo so that it looks at .so files if you like, but
> thats
> not the only tool you can use.  Truthfully you can just use objdump if you
> like.
>
> > I find it a bit too bad to have to rewrite this kind of tool just for
> dpdk
> > ... but on the other hand we would need something for bsd as well or we
> > give a shell script that relies on readelf to retrieve this section.
> >
> See above, try objdump -j=.modinfo -S /path/to/kernel/module.  objdump
> doesn't
> care about file extensions, as long as its ELF.  With that you can:
>
> 1) Dump out any section contents you like
> 2) strip away the application top end, and just use libbfd to get at the
> elf
> contents if you like.
>

Yes, I reached the same conclusion.
Ok, I will see what I can do.

Thanks.


-- 
David Marchand

[dpdk-dev] Beyond DPDK 2.0

2015-05-07 Thread Wiles, Keith



On 5/7/15, 7:02 AM, "Avi Kivity"  wrote:

>On Wed, Apr 22, 2015 at 6:11 PM, O'Driscoll, Tim
>
>wrote:
>
>> Does anybody have any input or comments on this?
>>
>>
>> > -Original Message-
>> > From: O'Driscoll, Tim
>> > Sent: Thursday, April 16, 2015 11:39 AM
>> > To: dev at dpdk.org
>> > Subject: Beyond DPDK 2.0
>> >
>> > Following the launch of DPDK by Intel as an internal development
>> > project, the launch of dpdk.org by 6WIND in 2013, and the first DPDK
>>RPM
>> > packages for Fedora in 2014, 6WIND, Red Hat and Intel would like to
>> > prepare for future releases after DPDK 2.0 by starting a discussion on
>> > its evolution. Anyone is welcome to join this initiative.
>> >
>> > Since then, the project has grown significantly:
>> > -The number of commits and mailing list posts has increased
>> > steadily.
>> > -Support has been added for a wide range of new NICs (Mellanox
>> > support submitted by 6WIND, Cisco VIC, Intel i40e and fm10k etc.).
>> > -DPDK is now supported on multiple architectures (IBM Power
>>support
>> > in DPDK 1.8, Tile support submitted by EZchip but not yet reviewed or
>> > applied).
>> >
>> > While this is great progress, we need to make sure that the project is
>> > structured in a way that enables it to continue to grow. To achieve
>> > this, 6WIND, Red Hat and Intel would like to start a discussion about
>> > the future of the project, so that we can agree and establish
>>processes
>> > that satisfy the needs of the current and future DPDK community.
>> >
>> > We're very interested in hearing the views of everybody in the
>> > community. In addition to debate on the mailing list, we'll also
>> > schedule community calls to discuss this.
>> >
>> >
>> > Project Goals
>> > -
>> >
>> > Some topics to be considered for the DPDK project include:
>> > -Project Charter: The charter of the DPDK project should be
>>clearly
>> > defined, and should explain the limits of DPDK (what it does and does
>> > not cover). This does not mean that we would be stuck with a singular
>> > charter for all time, but the direction and intent of the project
>>should
>> > be well understood.
>>
>
>
>One problem we've seen with dpdk is that it is a framework, not a library:
>it wants to create threads, manage memory, and generally take over.  This
>is a problem for us, as we are writing a framework (seastar, [1]) and need
>to create threads, manage memory, and generally take over ourselves.
>
>Perhaps dpdk can be split into two layers, a library layer that only
>provides mechanisms, and a framework layer that glues together those
>mechanisms and applies a policy, trading in generality for ease of use.

The DPDK system is somewhat divided now between the EAL, PMDS and utility
functions like malloc/rings/?

The problem I see is that the PMDs need a framework to be usable, and the EAL
plus the ethdev layers provide that support today. Setting up and
initializing the DPDK system is pretty clean: just call the EAL init
routines along with the pool creates and the basic configs for the
PMDs/hardware. Once the system is inited, one can create new threads
without being required to use the DPDK launch routines. Maybe I am not
understanding your needs; can you explain more?
>
>[1] http://seastar-project.org

[dpdk-dev] [PATCH / RFC] kni: Add set_rx_mode callback to handle multicast groups

2015-05-07 Thread Simon Kagstrom

This is needed to add / remove interfaces in multicast groups via the
ip tool.

The callback does nothing - the same as the kernel tun.c.

Signed-off-by: Simon Kagstrom 
---
Marked RFC since I'm by no means an expert on this. We noticed this
when playing with KNI and IGMP handling.

 lib/librte_eal/linuxapp/kni/kni_net.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c
index dd95db5..cf93c4b 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -495,6 +495,11 @@ kni_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 	return 0;
 }

+static void
+kni_net_set_rx_mode(struct net_device *dev)
+{
+}
+
 static int
 kni_net_change_mtu(struct net_device *dev, int new_mtu)
 {
@@ -645,6 +650,7 @@ static const struct net_device_ops kni_net_netdev_ops = {
 	.ndo_start_xmit = kni_net_tx,
 	.ndo_change_mtu = kni_net_change_mtu,
 	.ndo_do_ioctl = kni_net_ioctl,
+	.ndo_set_rx_mode = kni_net_set_rx_mode,
 	.ndo_get_stats = kni_net_stats,
 	.ndo_tx_timeout = kni_net_tx_timeout,
 	.ndo_set_mac_address = kni_net_set_mac,
-- 
1.9.1

[dpdk-dev] Intel fortville not working with multi-segment

2015-05-07 Thread Nissim Nisimov

Hi,



I am trying to work with Intel Fortville (XL710) NICs in Passthrough mode from 
a VM running dpdk app.


At first I didn't have any TX traffic from the VM; I got a dpdk patch for this 
issue and it fixed it. (http://www.dpdk.org/dev/patchwork/patch/4588/)

But now I see that when trying to run multi-segment traffic, not all the packets 
reach the VM (I tested it on bare metal as well and saw the same issue).

I don't have support for TSO in my application. Do I need to turn the TSO for 
the NIC?

Is it a known issue? any workaround for it?

Thanks,
Nissim

[dpdk-dev] [PATCH v7 06/10] eal/linux: add interrupt vectors handling on VFIO

2015-05-07 Thread Liang, Cunming



On 5/6/2015 2:38 AM, Stephen Hemminger wrote:
> On Tue,  5 May 2015 13:39:42 +0800
> Cunming Liang  wrote:
>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c 
>> b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
>> index aea1fb1..387f54c 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
>> @@ -308,6 +308,18 @@ pci_vfio_setup_interrupts(struct rte_pci_device *dev, 
>> int vfio_dev_fd)
>>  case VFIO_PCI_MSIX_IRQ_INDEX:
>>  internal_config.vfio_intr_mode = RTE_INTR_MODE_MSIX;
>>  dev->intr_handle.type = RTE_INTR_HANDLE_VFIO_MSIX;
>> +for (i = 0; i < RTE_MAX_RXTX_INTR_VEC_ID; i++) {
>> +fd = eventfd(0, 0);
>> +if (fd < 0) {
>> +
> You should pass EFD_NONBLOCK and EFD_CLOEXEC as flags to any eventfd's created
> internally.
[LCM] Agree, make sense.
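
A minimal sketch of the review suggestion above, for illustration only: it creates an eventfd with both flags set and checks that they took effect. The helper names (`intr_eventfd_create`, `intr_eventfd_flags_ok`, `intr_eventfd_poll`) are hypothetical and do not come from the patch; only eventfd(), fcntl() and read() are real calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: eventfd that is non-blocking and closed on exec(). */
int
intr_eventfd_create(void)
{
	return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Returns 1 if both FD_CLOEXEC and O_NONBLOCK are set on fd, 0 otherwise. */
int
intr_eventfd_flags_ok(int fd)
{
	int fd_flags = fcntl(fd, F_GETFD);
	int fl_flags = fcntl(fd, F_GETFL);

	if (fd_flags < 0 || fl_flags < 0)
		return 0;
	return (fd_flags & FD_CLOEXEC) && (fl_flags & O_NONBLOCK);
}

/*
 * Non-blocking poll of the counter.
 * Returns 1 and stores the counter if an event was pending,
 * 0 if nothing was pending, -1 on any other error.
 */
int
intr_eventfd_poll(int fd, uint64_t *count)
{
	ssize_t n = read(fd, count, sizeof(*count));

	if (n == (ssize_t)sizeof(*count))
		return 1;
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return -1;
}
```

With EFD_NONBLOCK a read on an idle eventfd returns immediately with EAGAIN instead of stalling the caller, and EFD_CLOEXEC keeps the descriptor from leaking into child processes started with exec().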

[dpdk-dev] [PATCH v7 03/10] eal/linux: add API to set rx interrupt event monitor

2015-05-07 Thread Liang, Cunming



On 5/6/2015 2:34 AM, Stephen Hemminger wrote:
> On Tue,  5 May 2015 13:39:39 +0800
> Cunming Liang  wrote:
>
>>   static void
>> +eal_intr_proc_rxtx_intr(int fd, struct rte_intr_handle *intr_handle)
>> +{
> Should be const intr_handle is not modified
[LCM] accept.

[dpdk-dev] How to use dpdk ovs

2015-05-07 Thread Gray, Mark D

> Subject: [dpdk-dev] How to use dpdk ovs
[...]
> Who can tell me, thanks a lot.
> 
you should post to discuss at openvswitch.org

[dpdk-dev] [PATCH v2] Add toeplitz hash algorithm used by RSS

2015-05-07 Thread Vladimir Medvedkin

Hi Andrey,

The main goal of these new functions is to calculate a hash equal to the one
computed by the NIC.
According to XL710 datasheet table 7-5, the input set for SCTP consists of
IP4-S, IP4-D, SCTP-Verification-Tag. I don't see any NIC that uses QinQ or a
single vlan tag, ip proto number, tunnel id, vxlan, etc for calculating the RSS
hash. If one appears we can always update union rte_thash_tuple.
I think it should be like:
struct rte_ports {
	uint16_t dport;
	uint16_t sport;
};

union rte_thash_l4 {
	struct rte_ports ports;
	uint32_t sctp_tag;
};

struct rte_ipv4_tuple {
	uint32_t src_addr;
	uint32_t dst_addr;
	union rte_thash_l4 l4;
};
If it is necessary to distribute packets according to non-standard tuples, I
think it's more appropriate to use crc32 or jhash because of speed.
rte_softrss_be consumes 400-500 clocks for each 4-byte input on an E3
1230v1 at 3.2GHz. This means for ipv4+tcp it consumes ~1500 clocks.
If you or someone else still thinks a general Toeplitz hash is needed, I'll add it.

Regards,
Vladimir


2015-05-05 19:03 GMT+03:00 Chilikin, Andrey :

> Hi Vladimir,
>
> Why limit Toeplitz hash calculation to predefined tuples and length?
> Should it be more general, something like
> rte_softrss_be(void *input, uint32_t input_len, const uint8_t *rss_key) to
> enable hash calculation for an input of any size? It would be useful for
> distributing packets using some non-standard tuples, like hashing on QinQ
> or adding IP protocol to hash calculation to separate UDP and TCP flows or
> even some other fields from a packet, for example, tunnel ID from VXLAN
> headers. By the way, i40e already supports RSS for SCTP in addition to TCP
> and UDP and includes Verification Tag as well as SCTP source and
> destination ports for RSS hash.
>
> Regards,
> Andrey
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vladimir
> > Medvedkin
> > Sent: Tuesday, May 5, 2015 2:20 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v2] Add toeplitz hash algorithm used by RSS
> >
> > Software implementation of the Toeplitz hash function used by RSS.
> > Can be used either for packet distribution on single queue NIC or for
> > simulating of RSS computation on specific NIC (for example after GRE
> header
> > decapsulating).
> >
> > v2 changes
> > - Add ipv6 support
> > - Various style fixes
> >
> > Signed-off-by: Vladimir Medvedkin 
> > ---
> >  lib/librte_hash/Makefile|   1 +
> >  lib/librte_hash/rte_thash.h | 209
> > 
> >  2 files changed, 210 insertions(+)
> >  create mode 100644 lib/librte_hash/rte_thash.h
> >
> > diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile index
> > 3696cb1..981230b 100644
> > --- a/lib/librte_hash/Makefile
> > +++ b/lib/librte_hash/Makefile
> > @@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> > SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h  SYMLINK-
> > $(CONFIG_RTE_LIBRTE_HASH)-include += rte_hash_crc.h  SYMLINK-
> > $(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> > +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
> >  SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
> >
> >  # this lib needs eal
> > diff --git a/lib/librte_hash/rte_thash.h b/lib/librte_hash/rte_thash.h
> new file
> > mode 100644 index 000..42c7bf6
> > --- /dev/null
> > +++ b/lib/librte_hash/rte_thash.h
> > @@ -0,0 +1,209 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > + *   All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + * * Redistributions of source code must retain the above copyright
> > + *   notice, this list of conditions and the following disclaimer.
> > + * * Redistributions in binary form must reproduce the above
> copyright
> > + *   notice, this list of conditions and the following disclaimer in
> > + *   the documentation and/or other materials provided with the
> > + *   distribution.
> > + * * Neither the name of Intel Corporation nor the names of its
> > + *   contributors may be used to endorse or promote products derived
> > + *   from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> > CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> > NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> > FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> > COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> > INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> > NOT
> > + *   LIMITED TO,

[dpdk-dev] [PATCH 04/18] fm10k: add fm10k device id

2015-05-07 Thread David Marchand

Hello Neil,

Reviving this old thread.

On Sat, Jan 31, 2015 at 7:35 PM, Neil Horman  wrote:

> On Sat, Jan 31, 2015 at 05:55:07PM +0100, David Marchand wrote:
> > On Sat, Jan 31, 2015 at 5:32 PM, Neil Horman 
> wrote:
> >
> > > On Sat, Jan 31, 2015 at 05:07:28PM +0100, David Marchand wrote:
> > > > In the end, we miss something to have dpdk work automatically like it
> > > used
> > > > to be, before the pci devices ids were stripped out of igb_uio.
> > > >
> > > > I can see two solutions :
> > > > - all pmds export the pci device ids they support (this sounds like
> > > > modalias :-)) or they register into the eal that exports this
> information
> > > > for use by application, but to me the application should not bother
> with
> > > > this ...
> > > > - the pmd handles this automatically (like binding/unbinding on a
> kernel
> > > > driver), with a _runtime_ option to enable this behavior (default
> being
> > > "no
> > > > automatic bind")
> > > >
> > > > Comments ? Ideas ?
> > > >
> > > I like the modalias idea, as it transports a table for uio/vfio to
> > > identify with
> > > the binary that needs it, preventing a possible discrepancy in what the
> > > core
> > > dpdk library identifies as supported devices and what the pmd DSO's
> > > actually do
> > > support.
> > >
> >
> > Yes, but if a pmd can do this itself alone, there is no discrepancy
> either.
> >
> Yes, absolutely, but it needs to be done in a way that an external binary
> can
> inspect the object independently of its being loaded, and preferably in a
> non-programatic context (since the uio bind/unbind utilities are separate
> scripts).
>
> The modinfo method involves putting info into a special data section that
> gets
> processed as part of the kernel modpost build stage.  There, the additional
> section gets translated into a C file, and built into its own object that's
> attached to the final binary module.  We can do the same thing here if you
> like
>
> We could do something simpler, too.  For instance we could add a field to
> the
> struct that gets registered as part of the RTE_REGISTER_PMD macro, and
> export it
> via a new ethdev library call.  That would be very straightforward, but the
> implication there is that you would need a programatic method to
> interrogate
> that information (a binary to call into the dpdk), which is not part of
> how the
> bind/unbind scripts work today.
>
> > Thinking aloud ...
> > As long as the pmd does all the magic bindings, the dpdk core does not
> need
> > to know about it.
> Yes, I'm not suggesting anything other than the pmd be responsible for
> codifying
> its binding information.  Its how that information is exported to other
> utilities that potentially increases the complexity of this operation.
>
> > And if the dpdk core really needs to know about this (I can see no use
> case
> > here, just want to avoid being blocked later) a dynamic register system
> > would be fine too.
> Sure, I don't see a problem with that. If we properly macro-wrap this, we
> can
> likely add a dynamic registration mechanism without having to change the
> pmds
> later
>

Did you work on this ?


I have been playing with this, I will most likely propose patches soon.

My preferred approach is to dedicate an elf section for this ("à la" kernel
.modinfo).
I tried to reuse modinfo, but the problem is that kmod implementation is
checking the filename extension against .ko and .ko.gz.

I find it a bit too bad to have to rewrite this kind of tool just for dpdk
... but on the other hand we would need something for bsd as well or we
give a shell script that relies on readelf to retrieve this section.


-- 
David Marchand
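
To make the "dedicated ELF section" idea concrete, here is a small sketch. It is not the patch announced above: the section name `pmd_pci_ids`, the struct layout, the ID values and the function names are all hypothetical placeholders. It assumes GCC or Clang on an ELF platform with a GNU-compatible linker, which provides `__start_<name>` and `__stop_<name>` symbols when the section name is a valid C identifier.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record describing one supported PCI device. */
struct pmd_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

/*
 * Place the table in its own ELF section so that external tools
 * (objdump, readelf) can find it without loading the object.
 * The IDs are made-up placeholders, not real devices.
 */
static const struct pmd_pci_id example_pmd_ids[]
	__attribute__((used, section("pmd_pci_ids"))) = {
	{ 0x1234, 0x0001 },
	{ 0x1234, 0x0002 },
};

/* Section bounds provided by the linker (GNU ld compatible behaviour). */
extern const struct pmd_pci_id __start_pmd_pci_ids[];
extern const struct pmd_pci_id __stop_pmd_pci_ids[];

/* Number of records currently in the section. */
size_t
pmd_pci_id_count(void)
{
	return (size_t)(__stop_pmd_pci_ids - __start_pmd_pci_ids);
}

/* Returns 1 if the vendor/device pair is listed in the section, else 0. */
int
pmd_pci_id_supported(uint16_t vendor, uint16_t device)
{
	const struct pmd_pci_id *id;

	for (id = __start_pmd_pci_ids; id < __stop_pmd_pci_ids; id++) {
		if (id->vendor_id == vendor && id->device_id == device)
			return 1;
	}
	return 0;
}
```

The same table can be inspected in process through the start/stop symbols or from outside by dumping the section contents, which is what makes this layout usable from bind/unbind scripts.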

[dpdk-dev] [PATCH] enic: add support for enic in nic_uio driver for FreeBSD

2015-05-07 Thread David Marchand

On Thu, May 7, 2015 at 11:23 AM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Thu, May 07, 2015 at 09:19:09AM +0530, Sujith Sankar wrote:
> > This patch adds support for enic in the nic_uio driver so that enic
> could be used on FreeBSD.
> >
> > Signed-off-by: Sujith Sankar 
>
> Acked-by: Bruce Richardson 
>
>
Well this is not really bsd specific, as people who rely on
rte_pci_dev_ids.h header to find devices that must be bound to igb_uio and
consort, will also benefit from this fix.
By the way, I am working on removing these device ids from the eal, since
the pmds should be the only one that maintain their devices ids list.
Will send some patches soon.

Acked-by: David Marchand 


-- 
David Marchand

[dpdk-dev] [PATCH v2] Add toeplitz hash algorithm used by RSS

2015-05-07 Thread Chilikin, Andrey

Hi Vladimir,

Yes, at the moment NICs support limited input sets for hash calculation, but 
why limit SW to the same sets if it can be done in a more general way and be 
easily scalable for HW updates? Using a limited input set for RSS is not a 
feature of the Toeplitz hash, but a limitation of HW. I believe that a general 
Toeplitz function will be more appropriate: it will cover the input sets 
currently supported by HW and will also be easily scalable for future HW. Also, 
talking about different NICs: Niantic and Fortville, for example, have hash keys 
of different lengths, so the rte_softrss() function should take the hash key's 
length into account as well.
Regards,
Andrey
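
As an illustration of the general form argued for above, here is a sketch of a Toeplitz hash over an input of any length with a key of any length. It is not the code from the patch, and `toeplitz_hash` and `toeplitz_example` are hypothetical names. The key and the sample tuple are the ones published in Microsoft's RSS hash verification data (source 66.9.149.187:2794, destination 161.142.100.80:1766); key bits beyond the end of the key are treated as zero, which is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Generic Toeplitz hash (hypothetical helper, not a DPDK API).
 * For every set bit of the input, MSB first, XOR in the 32-bit window
 * of the key that starts at the same bit position.
 */
uint32_t
toeplitz_hash(const uint8_t *input, size_t input_len,
	      const uint8_t *key, size_t key_len)
{
	uint32_t hash = 0;
	uint32_t window = 0;
	size_t next_bit = 32;	/* index of the next key bit to shift in */
	size_t i;
	int b;

	/* Load the first 32 key bits (zero-padded if the key is shorter). */
	for (i = 0; i < 4; i++)
		window = (window << 8) | (i < key_len ? key[i] : 0u);

	for (i = 0; i < input_len; i++) {
		for (b = 7; b >= 0; b--) {
			uint32_t in = 0;

			if (input[i] & (1u << b))
				hash ^= window;

			/* Slide the window one bit along the key. */
			if (next_bit / 8 < key_len)
				in = (key[next_bit / 8] >> (7 - next_bit % 8)) & 1u;
			window = (window << 1) | in;
			next_bit++;
		}
	}
	return hash;
}

/* 40-byte key from Microsoft's RSS verification data. */
const uint8_t toeplitz_example_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/*
 * Hash the sample tuple in network byte order:
 * source address, destination address, then (optionally) both ports.
 */
uint32_t
toeplitz_example(int with_ports)
{
	static const uint8_t tuple[12] = {
		66, 9, 149, 187,	/* source address */
		161, 142, 100, 80,	/* destination address */
		0x0a, 0xea,		/* source port 2794 */
		0x06, 0xe6,		/* destination port 1766 */
	};

	return toeplitz_hash(tuple, with_ports ? 12 : 8,
			     toeplitz_example_key,
			     sizeof(toeplitz_example_key));
}
```

For this tuple the published values are 0x323e8fc2 for the addresses alone and 0x51ccc178 with the TCP ports included. Because the key length is a parameter, the same routine works with keys of different sizes; the bit-at-a-time loop favours clarity over speed.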


From: Vladimir Medvedkin [mailto:medvedk...@gmail.com]
Sent: Thursday, May 7, 2015 11:28 AM
To: Chilikin, Andrey
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2] Add toeplitz hash algorithm used by RSS

Hi Andrey,

The main goal of these new functions is to calculate a hash equal to the one 
computed by the NIC.
According to XL710 datasheet table 7-5, the input set for SCTP consists of IP4-S, 
IP4-D, SCTP-Verification-Tag. I don't see any NIC that uses QinQ or a single vlan 
tag, ip proto number, tunnel id, vxlan, etc for calculating the RSS hash. If one 
appears we can always update union rte_thash_tuple.
I think it should be like:
struct rte_ports {
	uint16_t dport;
	uint16_t sport;
};

union rte_thash_l4 {
	struct rte_ports ports;
	uint32_t sctp_tag;
};

struct rte_ipv4_tuple {
	uint32_t src_addr;
	uint32_t dst_addr;
	union rte_thash_l4 l4;
};
If it is necessary to distribute packets according to non-standard tuples, I 
think it's more appropriate to use crc32 or jhash because of speed. 
rte_softrss_be consumes 400-500 clocks for each 4-byte input on an E3 1230v1 at 
3.2GHz. This means for ipv4+tcp it consumes ~1500 clocks.
If you or someone else still thinks a general Toeplitz hash is needed, I'll add it.
Regards,
Vladimir


2015-05-05 19:03 GMT+03:00 Chilikin, Andrey <andrey.chilikin at intel.com>:
Hi Vladimir,

Why limit Toeplitz hash calculation to predefined tuples and length? Should it 
be more general, something like
rte_softrss_be(void *input, uint32_t input_len, const uint8_t *rss_key) to 
enable hash calculation for an input of any size? It would be useful for 
distributing packets using some non-standard tuples, like hashing on QinQ or 
adding IP protocol to hash calculation to separate UDP and TCP flows or even 
some other fields from a packet, for example, tunnel ID from VXLAN headers. By 
the way, i40e already supports RSS for SCTP in addition to TCP and UDP and 
includes Verification Tag as well as SCTP source and destination ports for RSS 
hash.

Regards,
Andrey

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On 
> Behalf Of Vladimir
> Medvedkin
> Sent: Tuesday, May 5, 2015 2:20 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] Add toeplitz hash algorithm used by RSS
>
> Software implementation of the Toeplitz hash function used by RSS.
> Can be used either for packet distribution on single queue NIC or for
> simulating of RSS computation on specific NIC (for example after GRE header
> decapsulating).
>
> v2 changes
> - Add ipv6 support
> - Various style fixes
>
> Signed-off-by: Vladimir Medvedkin <medvedkinv at gmail.com>
> ---
>  lib/librte_hash/Makefile|   1 +
>  lib/librte_hash/rte_thash.h | 209
> 
>  2 files changed, 210 insertions(+)
>  create mode 100644 lib/librte_hash/rte_thash.h
>
> diff --git a/lib/librte_hash/Makefile b/lib/librte_hash/Makefile index
> 3696cb1..981230b 100644
> --- a/lib/librte_hash/Makefile
> +++ b/lib/librte_hash/Makefile
> @@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_HASH) += rte_fbk_hash.c
> SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include := rte_hash.h  SYMLINK-
> $(CONFIG_RTE_LIBRTE_HASH)-include += rte_hash_crc.h  SYMLINK-
> $(CONFIG_RTE_LIBRTE_HASH)-include += rte_jhash.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_thash.h
>  SYMLINK-$(CONFIG_RTE_LIBRTE_HASH)-include += rte_fbk_hash.h
>
>  # this lib needs eal
> diff --git a/lib/librte_hash/rte_thash.h b/lib/librte_hash/rte_thash.h new 
> file
> mode 100644 index 000..42c7bf6
> --- /dev/null
> +++ b/lib/librte_hash/rte_thash.h
> @@ -0,0 +1,209 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list

[dpdk-dev] GitHub sandbox for the DPDK community

2015-05-07 Thread John W. Linville

On Wed, May 06, 2015 at 11:12:28AM +0300, Panu Matilainen wrote:

> Forcing a change of tools and workflows on everybody WILL create ill-will if
> nothing else.
> 
> Also please realize that not everybody sees GitHub as the greatest thing
> since sliced bread. It has quite some "Hotel California" aspects to it, and
> actually the imago of an average GH project is not that great: there are so
> many badly run and abandoned projects there that the first thought when I
> hear the word GitHub is "oh no, not one of those again" rather than "cool".
> I know I'm not alone in that thinking.

GitHub -- the SourceForge of the 21st century!

-- 
John W. Linville                Someday the world will need a hero, and you
linville at tuxdriver.com       might be all we have.  Be ready.

[dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the latest available

2015-05-07 Thread Ananyev, Konstantin



> -Original Message-
> From: Ananyev, Konstantin
> Sent: Wednesday, May 06, 2015 5:11 PM
> To: De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the 
> latest available
> 
> Hi Pablo,
> 
> > -Original Message-
> > From: De Lara Guarch, Pablo
> > Sent: Wednesday, May 06, 2015 10:36 AM
> > To: Ananyev, Konstantin; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the 
> > latest available
> >
> > Hi Konstantin,
> >
> > > -Original Message-
> > > From: Ananyev, Konstantin
> > > Sent: Wednesday, May 06, 2015 1:36 AM
> > > To: De Lara Guarch, Pablo; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with 
> > > the
> > > latest available
> > >
> > >
> > > Hi Pablo,
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pablo de Lara
> > > > Sent: Tuesday, May 05, 2015 3:44 PM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v3 3/6] hash: update jhash function with the
> > > latest available
> > > >
> > > > Jenkins hash function was developed originally in 1996,
> > > > and was integrated in first versions of DPDK.
> > > > The function has been improved in 2006,
> > > > achieving up to 60% better performance, compared to the original one.
> > > >
> > > > This patch integrates that code into the rte_jhash library.
> > > >
> > > > Signed-off-by: Pablo de Lara 
> > > > ---
> > > >  lib/librte_hash/rte_jhash.h |  261
> > > +++
> > > >  1 files changed, 188 insertions(+), 73 deletions(-)
> > > >
> > > > diff --git a/lib/librte_hash/rte_jhash.h b/lib/librte_hash/rte_jhash.h
> > > > index a4bf5a1..0e96b7c 100644
> > > > --- a/lib/librte_hash/rte_jhash.h
> > > > +++ b/lib/librte_hash/rte_jhash.h
> > > > @@ -1,7 +1,7 @@
> > > >  /*-
> > > >   *   BSD LICENSE
> > > >   *
> > > > - *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> > > > + *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> > > >   *   All rights reserved.
> > > >   *
> > > >   *   Redistribution and use in source and binary forms, with or without
> > > > @@ -45,38 +45,68 @@ extern "C" {
> > > >  #endif
> > > >
> > > >  #include 
> > > > +#include 
> > > > +#include 
> > > >
> > > >  /* jhash.h: Jenkins hash support.
> > > >   *
> > > > - * Copyright (C) 1996 Bob Jenkins (bob_jenkins at burtleburtle.net)
> > > > + * Copyright (C) 2006 Bob Jenkins (bob_jenkins at burtleburtle.net)
> > > >   *
> > > >   * http://burtleburtle.net/bob/hash/
> > > >   *
> > > >   * These are the credits from Bob's sources:
> > > >   *
> > > > - * lookup2.c, by Bob Jenkins, December 1996, Public Domain.
> > > > - * hash(), hash2(), hash3, and mix() are externally useful functions.
> > > > - * Routines to test the hash are included if SELF_TEST is defined.
> > > > - * You can use this free for any purpose.  It has no warranty.
> > > > + * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
> > > > + *
> > > > + * These are functions for producing 32-bit hashes for hash table 
> > > > lookup.
> > > > + * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and 
> > > > final()
> > > > + * are externally useful functions.  Routines to test the hash are 
> > > > included
> > > > + * if SELF_TEST is defined.  You can use this free for any purpose.  
> > > > It's in
> > > > + * the public domain.  It has no warranty.
> > > >   *
> > > >   * $FreeBSD$
> > > >   */
> > > >
> > > > > +#define rot(x, k) (((x) << (k)) | ((x) >> (32-(k))))
> > > > +
> > > >  /** @internal Internal function. NOTE: Arguments are modified. */
> > > >  #define __rte_jhash_mix(a, b, c) do { \
> > > > -   a -= b; a -= c; a ^= (c>>13); \
> > > > -   b -= c; b -= a; b ^= (a<<8); \
> > > > -   c -= a; c -= b; c ^= (b>>13); \
> > > > -   a -= b; a -= c; a ^= (c>>12); \
> > > > -   b -= c; b -= a; b ^= (a<<16); \
> > > > -   c -= a; c -= b; c ^= (b>>5); \
> > > > -   a -= b; a -= c; a ^= (c>>3); \
> > > > -   b -= c; b -= a; b ^= (a<<10); \
> > > > -   c -= a; c -= b; c ^= (b>>15); \
> > > > +   a -= c; a ^= rot(c, 4); c += b; \
> > > > +   b -= a; b ^= rot(a, 6); a += c; \
> > > > +   c -= b; c ^= rot(b, 8); b += a; \
> > > > +   a -= c; a ^= rot(c, 16); c += b; \
> > > > +   b -= a; b ^= rot(a, 19); a += c; \
> > > > +   c -= b; c ^= rot(b, 4); b += a; \
> > > > +} while (0)
> > > > +
> > > > +#define __rte_jhash_final(a, b, c) do { \
> > > > +   c ^= b; c -= rot(b, 14); \
> > > > +   a ^= c; a -= rot(c, 11); \
> > > > +   b ^= a; b -= rot(a, 25); \
> > > > +   c ^= b; c -= rot(b, 16); \
> > > > +   a ^= c; a -= rot(c, 4);  \
> > > > +   b ^= a; b -= rot(a, 14); \
> > > > +   c ^= b; c -= rot(b, 24); \
> > > >  } while (0)
> > > >
> > > >  /** The golden ratio: an arbitrary value. */
> > > > -#define RTE_JHASH_GOLDEN_RATIO
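
For readers following the quoted diff: the rotate, mix and final steps it adds can be tried outside DPDK with a small standalone sketch. The JH_* macro names and jh_hashword() below are hypothetical names for this illustration, not the rte_jhash API. The driver loop is assumed to follow the hashword() routine from Bob Jenkins' public-domain lookup3.c that the patch credits, and it assumes a 32-bit unsigned int.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by k bits (0 < k < 32). */
#define JH_ROT(x, k) (((x) << (k)) | ((x) >> (32 - (k))))

/* Reversible mixing of three 32-bit words; arguments are modified. */
#define JH_MIX(a, b, c) do { \
	a -= c; a ^= JH_ROT(c, 4);  c += b; \
	b -= a; b ^= JH_ROT(a, 6);  a += c; \
	c -= b; c ^= JH_ROT(b, 8);  b += a; \
	a -= c; a ^= JH_ROT(c, 16); c += b; \
	b -= a; b ^= JH_ROT(a, 19); a += c; \
	c -= b; c ^= JH_ROT(b, 4);  b += a; \
} while (0)

/* Final mixing of the three words into c; arguments are modified. */
#define JH_FINAL(a, b, c) do { \
	c ^= b; c -= JH_ROT(b, 14); \
	a ^= c; a -= JH_ROT(c, 11); \
	b ^= a; b -= JH_ROT(a, 25); \
	c ^= b; c -= JH_ROT(b, 16); \
	a ^= c; a -= JH_ROT(c, 4);  \
	b ^= a; b -= JH_ROT(a, 14); \
	c ^= b; c -= JH_ROT(b, 24); \
} while (0)

/*
 * Hash an array of 32-bit words (hypothetical helper, modelled on
 * lookup3's hashword()). 'length' counts words, not bytes.
 */
static uint32_t
jh_hashword(const uint32_t *k, size_t length, uint32_t initval)
{
	uint32_t a, b, c;

	/* Internal state depends on the key length and the seed. */
	a = b = c = 0xdeadbeef + (((uint32_t)length) << 2) + initval;

	/* Consume the key three words at a time. */
	while (length > 3) {
		a += k[0];
		b += k[1];
		c += k[2];
		JH_MIX(a, b, c);
		length -= 3;
		k += 3;
	}

	/* Handle the last one to three words; the cases fall through. */
	switch (length) {
	case 3: c += k[2]; /* fall through */
	case 2: b += k[1]; /* fall through */
	case 1: a += k[0];
		JH_FINAL(a, b, c);
		break;
	case 0: /* nothing left to add */
		break;
	}
	return c;
}
```

With an empty key and a zero seed the function returns its initial state unchanged, which gives a quick sanity check on the plumbing.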

[dpdk-dev] [PATCH] enic: add support for enic in nic_uio driver for FreeBSD

2015-05-07 Thread Bruce Richardson

On Thu, May 07, 2015 at 09:19:09AM +0530, Sujith Sankar wrote:
> This patch adds support for enic in the nic_uio driver so that enic could be 
> used on FreeBSD.
> 
> Signed-off-by: Sujith Sankar 

Acked-by: Bruce Richardson 

> ---
>  lib/librte_eal/bsdapp/nic_uio/nic_uio.c |  1 +
>  lib/librte_eal/common/include/rte_pci_dev_ids.h | 17 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c 
> b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
> index 5ae8560..e649e32 100644
> --- a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
> +++ b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
> @@ -113,6 +113,7 @@ struct pci_bdf {
>  #define RTE_PCI_DEV_ID_DECL_I40EVF(vend, dev)  {vend, dev},
>  #define RTE_PCI_DEV_ID_DECL_VIRTIO(vend, dev)  {vend, dev},
>  #define RTE_PCI_DEV_ID_DECL_VMXNET3(vend, dev) {vend, dev},
> +#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)    {vend, dev},
>  
>  const struct device devices[] = {
>  #include 
> diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h 
> b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> index 21d2eed..5d1b285 100644
> --- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
> +++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
> @@ -140,6 +140,10 @@
>  #define RTE_PCI_DEV_ID_DECL_FM10KVF(vend, dev)
>  #endif
>  
> +#ifndef RTE_PCI_DEV_ID_DECL_ENIC
> +#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)
> +#endif
> +
>  #ifndef PCI_VENDOR_ID_INTEL
>  /** Vendor ID used by Intel devices */
>  #define PCI_VENDOR_ID_INTEL 0x8086
> @@ -155,6 +159,11 @@
>  #define PCI_VENDOR_ID_VMWARE 0x15AD
>  #endif
>  
> +#ifndef PCI_VENDOR_ID_CISCO
> +/** Vendor ID used by Cisco VIC devices */
> +#define PCI_VENDOR_ID_CISCO 0x1137
> +#endif
> +
>  / Physical EM devices from e1000_hw.h 
> /
>  
>  #define E1000_DEV_ID_82542             0x1000
> @@ -548,6 +557,14 @@ RTE_PCI_DEV_ID_DECL_VMXNET3(PCI_VENDOR_ID_VMWARE, 
> VMWARE_DEV_ID_VMXNET3)
>  
>  RTE_PCI_DEV_ID_DECL_FM10KVF(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_VF)
>  
> +/** Cisco VIC devices **/
> +
> +#define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic */
> +#define PCI_DEVICE_ID_CISCO_VIC_ENET_VF  0x0071  /* enet SRIOV VF */
> +
> +RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET)
> +RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, 
> PCI_DEVICE_ID_CISCO_VIC_ENET_VF)
> +
>  /*
>   * Undef all RTE_PCI_DEV_ID_DECL_* here.
>   */
> -- 
> 1.9.1
>

[dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones

2015-05-07 Thread Ananyev, Konstantin



> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Thursday, May 07, 2015 8:32 AM
> To: Xu, HuilongX; Zoltan Kiss; Ananyev, Konstantin; dev at dpdk.org
> Cc: Cao, Waterman; Cao, Min; Zhang, Helin
> Subject: Re: [dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones
> 
> Hi Huilong,
> 
> On 05/07/2015 03:57 AM, Xu, HuilongX wrote:
> > Hi Olivier,
> > Today I find a compile error, when I test ip fragment on dpdk.org. would 
> > you check this?  thanks a lot.
> > My dpdk.org commit: a6d71fa7146cc04320c2485d6dde44c1d888d652
> > The compile error as below:
> > CC main.o
> > /root/dpdk/examples/ip_fragmentation/main.c: In function 'init_mem':
> > /root/dpdk/examples/ip_fragmentation/main.c:748:8: error: 'MBUF_DATA_SIZE' 
> > undeclared (first use in this function)
> >   0, MBUF_DATA_SIZE, socket);
> >  ^
> > /root/dpdk/examples/ip_fragmentation/main.c:748:8: note: each undeclared 
> > identifier is reported only once for each function it
> appears in
> 
> Sure, I'll have a look. Thanks for reporting.

Looks like a typo here.
Should be fixed by http://dpdk.org/ml/archives/dev/2015-April/017119.html.
Konstantin

> 
> Regards,
> Olivier
> 
> 
> >
> >
> > Best regards
> >
> > huilong
> >
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> > Sent: Friday, April 24, 2015 6:39 PM
> > To: Ananyev, Konstantin; Olivier Matz; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones
> >
> > Hi,
> >
> > On 22/04/15 12:59, Ananyev, Konstantin wrote:
> >>
> >>
> >>> -Original Message-
> >>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> >>> Sent: Wednesday, April 22, 2015 10:57 AM
> >>> To: dev at dpdk.org
> >>> Cc: Ananyev, Konstantin; zoltan.kiss at linaro.org; Richardson, Bruce; 
> >>> nhorman at tuxdriver.com; olivier.matz at 6wind.com
> >>> Subject: [PATCH v6 00/13] mbuf: enhancements of mbuf clones
> >>>
> >>> The first objective of this series is to fix the support of indirect
> >>> mbufs when the application reserves a private area in mbufs. It also
> >>> removes the limitation that rte_pktmbuf_clone() is only allowed on
> >>> direct (non-cloned) mbufs. The series also contains some enhancements
> >>> and fixes in the mbuf area that makes the implementation of the
> >>> last patches easier.
> >>>
> >>
> >> Acked-by: Konstantin Ananyev 
> >
> > When does this series get merged?
> >
> > Regards,
> >
> > Zoltan
> >

[dpdk-dev] [PATCH 04/18] fm10k: add fm10k device id

2015-05-07 Thread Neil Horman

On Thu, May 07, 2015 at 01:06:02PM +0200, David Marchand wrote:
> Hello Neil,
> 
> Reviving this old thread.
> 
> On Sat, Jan 31, 2015 at 7:35 PM, Neil Horman  wrote:
> 
> > On Sat, Jan 31, 2015 at 05:55:07PM +0100, David Marchand wrote:
> > > On Sat, Jan 31, 2015 at 5:32 PM, Neil Horman 
> > wrote:
> > >
> > > > On Sat, Jan 31, 2015 at 05:07:28PM +0100, David Marchand wrote:
> > > > > In the end, we miss something to have dpdk work automatically like it
> > > > used
> > > > > to be, before the pci devices ids were stripped out of igb_uio.
> > > > >
> > > > > I can see two solutions :
> > > > > - all pmds export the pci device ids they support (this sounds like
> > > > > modalias :-)) or they register into the eal that exports this
> > information
> > > > > for use by application, but to me the application should not bother
> > with
> > > > > this ...
> > > > > - the pmd handles this automatically (like binding/unbinding on a
> > kernel
> > > > > driver), with a _runtime_ option to enable this behavior (default
> > being
> > > > "no
> > > > > automatic bind")
> > > > >
> > > > > Comments ? Ideas ?
> > > > >
> > > > I like the modalias idea, as it transports a table for uio/vfio to
> > > > identify with
> > > > the binary that needs it, preventing a possible discrepancy in what the
> > > > core
> > > > dpdk library identifies as supported devices and what the pmd DSO's
> > > > actually do
> > > > support.
> > > >
> > >
> > > Yes, but if a pmd can do this itself alone, there is no discrepancy
> > either.
> > >
> > Yes, absolutely, but it needs to be done in a way that an external binary
> > can
> > inspect the object independently of its being loaded, and preferably in a
> > non-programmatic context (since the uio bind/unbind utilities are separate
> > scripts).
> >
> > The modinfo method involves putting info into a special data section that
> > gets
> > processed as part of the kernel modpost build stage.  There, the additional
> > section gets translated into a C file, and built into its own object that's
> > attached to the final binary module.  We can do the same thing here if you
> > like
> >
> > We could do something simpler, too.  For instance we could add a field to
> > the
> > struct that gets registered as part of the RTE_REGISTER_PMD macro, and
> > export it
> > via a new ethdev library call.  That would be very straightforward, but the
> > implication there is that you would need a programmatic method to
> > interrogate
> > that information (a binary to call into the dpdk), which is not part of
> > how the
> > bind/unbind scripts work today.
> >
> > > Thinking aloud ...
> > > As long as the pmd does all the magic bindings, the dpdk core does not
> > need
> > > to know about it.
> > Yes, I'm not suggesting anything other than the pmd be responsible for
> > codifying
> > its binding information.  Its how that information is exported to other
> > utilities that potentially increases the complexity of this operation.
> >
> > > And if the dpdk core really needs to know about this (I can see no use
> > case
> > > here, just want to avoid being blocked later) a dynamic register system
> > > would be fine too.
> > Sure, I don't see a problem with that. If we properly macro-wrap this, we
> > can
> > likely add a dynamic registration mechanism without having to change the
> > pmds
> > later
> >
> 
> Did you work on this ?
> 
No, I had assumed from our previous discussion that you were.

> 
> I have been playing with this, I will most likely propose patches soon.
> 
> My preferred approach is to dedicate an elf section for this ("à la" kernel
> .modinfo).
Yes, this makes sense.

> I tried to reuse modinfo, but the problem is that kmod implementation is
> checking the filename extension against .ko and .ko.gz.
> 
Well, you can alter modinfo so that it looks at .so files if you like, but that's
not the only tool you can use.  Truthfully you can just use objdump if you
like.

> I find it a bit too bad to have to rewrite this kind of tool just for dpdk
> ... but on the other hand we would need something for bsd as well or we
> give a shell script that relies on readelf to retrieve this section.
> 
See above, try objdump -s -j .modinfo /path/to/kernel/module.  objdump doesn't
care about file extensions, as long as it's ELF.  With that you can:

1) Dump out any section contents you like
2) strip away the application top end, and just use libbfd to get at the elf
contents if you like.

Neil

> 
> -- 
> David Marchand
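
To make the "dedicated ELF section" idea in this exchange concrete, here is a minimal sketch of how a PMD could place its PCI ids in a named section of its own object file. Everything in it is an assumption for illustration: the section name pmd_pci_ids, the struct pmd_pci_id layout and the PMD_EXPORT_PCI_ID macro are hypothetical, not anything DPDK defines. The two Cisco ids are the ones from the enic patch elsewhere in this digest. The sketch relies on GCC or Clang attributes and on a GNU-style ELF linker, which provides __start_/__stop_ symbols for sections whose names are valid C identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout: one vendor/device pair per entry. */
struct pmd_pci_id {
	uint16_t vendor;
	uint16_t device;
};

/*
 * Hypothetical export macro: emits one record into the "pmd_pci_ids"
 * section. 'used' keeps the compiler from discarding the otherwise
 * unreferenced static object.
 */
#define PMD_EXPORT_PCI_ID(name, vend, dev) \
	static const struct pmd_pci_id name \
	__attribute__((used, section("pmd_pci_ids"))) = { (vend), (dev) }

PMD_EXPORT_PCI_ID(demo_enic_vnic, 0x1137, 0x0043);
PMD_EXPORT_PCI_ID(demo_enic_vf,   0x1137, 0x0071);

/* Section bounds, provided by the GNU-style linker on ELF targets. */
extern const struct pmd_pci_id __start_pmd_pci_ids[];
extern const struct pmd_pci_id __stop_pmd_pci_ids[];

/* Number of records the linker collected into the section. */
static size_t
pmd_pci_id_count(void)
{
	return (size_t)(__stop_pmd_pci_ids - __start_pmd_pci_ids);
}
```

Because the records live in their own section, an external tool can dump them without loading the object, for example with objdump -s -j pmd_pci_ids or readelf -x pmd_pci_ids on the built file, which is the non-programmatic access discussed above.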

[dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones

2015-05-07 Thread Olivier MATZ

Hi Huilong,

On 05/07/2015 03:57 AM, Xu, HuilongX wrote:
> Hi Olivier,
> Today I find a compile error, when I test ip fragment on dpdk.org. would you 
> check this?  thanks a lot.
> My dpdk.org commit: a6d71fa7146cc04320c2485d6dde44c1d888d652
> The compile error as below:
> CC main.o
> /root/dpdk/examples/ip_fragmentation/main.c: In function 'init_mem':
> /root/dpdk/examples/ip_fragmentation/main.c:748:8: error: 'MBUF_DATA_SIZE' 
> undeclared (first use in this function)
>   0, MBUF_DATA_SIZE, socket);
>  ^
> /root/dpdk/examples/ip_fragmentation/main.c:748:8: note: each undeclared 
> identifier is reported only once for each function it appears in

Sure, I'll have a look. Thanks for reporting.

Regards,
Olivier


>
>
> Best regards
>
> huilong
>
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> Sent: Friday, April 24, 2015 6:39 PM
> To: Ananyev, Konstantin; Olivier Matz; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones
>
> Hi,
>
> On 22/04/15 12:59, Ananyev, Konstantin wrote:
>>
>>
>>> -Original Message-
>>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>>> Sent: Wednesday, April 22, 2015 10:57 AM
>>> To: dev at dpdk.org
>>> Cc: Ananyev, Konstantin; zoltan.kiss at linaro.org; Richardson, Bruce; 
>>> nhorman at tuxdriver.com; olivier.matz at 6wind.com
>>> Subject: [PATCH v6 00/13] mbuf: enhancements of mbuf clones
>>>
>>> The first objective of this series is to fix the support of indirect
>>> mbufs when the application reserves a private area in mbufs. It also
>>> removes the limitation that rte_pktmbuf_clone() is only allowed on
>>> direct (non-cloned) mbufs. The series also contains some enhancements
>>> and fixes in the mbuf area that makes the implementation of the
>>> last patches easier.
>>>
>>
>> Acked-by: Konstantin Ananyev 
>
> When does this series get merged?
>
> Regards,
>
> Zoltan
>

[dpdk-dev] [PATCH] enic: add support for enic in nic_uio driver for FreeBSD

2015-05-07 Thread Sujith Sankar

This patch adds support for enic in the nic_uio driver so that enic could be 
used on FreeBSD.

Signed-off-by: Sujith Sankar 
---
 lib/librte_eal/bsdapp/nic_uio/nic_uio.c |  1 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h | 17 +
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c 
b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
index 5ae8560..e649e32 100644
--- a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
+++ b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
@@ -113,6 +113,7 @@ struct pci_bdf {
 #define RTE_PCI_DEV_ID_DECL_I40EVF(vend, dev)  {vend, dev},
 #define RTE_PCI_DEV_ID_DECL_VIRTIO(vend, dev)  {vend, dev},
 #define RTE_PCI_DEV_ID_DECL_VMXNET3(vend, dev) {vend, dev},
+#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)    {vend, dev},

 const struct device devices[] = {
 #include 
diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h 
b/lib/librte_eal/common/include/rte_pci_dev_ids.h
index 21d2eed..5d1b285 100644
--- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
+++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
@@ -140,6 +140,10 @@
 #define RTE_PCI_DEV_ID_DECL_FM10KVF(vend, dev)
 #endif

+#ifndef RTE_PCI_DEV_ID_DECL_ENIC
+#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)
+#endif
+
 #ifndef PCI_VENDOR_ID_INTEL
 /** Vendor ID used by Intel devices */
 #define PCI_VENDOR_ID_INTEL 0x8086
@@ -155,6 +159,11 @@
 #define PCI_VENDOR_ID_VMWARE 0x15AD
 #endif

+#ifndef PCI_VENDOR_ID_CISCO
+/** Vendor ID used by Cisco VIC devices */
+#define PCI_VENDOR_ID_CISCO 0x1137
+#endif
+
 / Physical EM devices from e1000_hw.h /

 #define E1000_DEV_ID_82542             0x1000
@@ -548,6 +557,14 @@ RTE_PCI_DEV_ID_DECL_VMXNET3(PCI_VENDOR_ID_VMWARE, 
VMWARE_DEV_ID_VMXNET3)

 RTE_PCI_DEV_ID_DECL_FM10KVF(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_VF)

+/** Cisco VIC devices **/
+
+#define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic */
+#define PCI_DEVICE_ID_CISCO_VIC_ENET_VF  0x0071  /* enet SRIOV VF */
+
+RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET)
+RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET_VF)
+
 /*
  * Undef all RTE_PCI_DEV_ID_DECL_* here.
  */
-- 
1.9.1
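
The patch above relies on the declaration-macro pattern used by rte_pci_dev_ids.h: the id list calls one macro per device family, each macro defaults to nothing, and a consumer such as nic_uio defines the ones it cares about before pulling in the list. A self-contained sketch of that pattern follows. It is only an illustration: the DEMO_* names are hypothetical, the list is a macro here rather than a separately included header, and the "OTHER" family ids are made up. The Cisco ids are the ones from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Table entry, in the spirit of the driver's vendor/device pair. */
struct demo_device {
	uint16_t vend;
	uint16_t dev;
};

/*
 * Stand-in for the shared id list. In the real tree this is a header
 * that is included; a macro is used here to keep the sketch in one file.
 */
#define DEMO_PCI_DEV_IDS \
	DEMO_DECL_OTHER(0xFFFF, 0x0001) /* made-up family and ids */ \
	DEMO_DECL_ENIC(0x1137, 0x0043)  /* ethernet vnic */ \
	DEMO_DECL_ENIC(0x1137, 0x0071)  /* enet SRIOV VF */

/* The consumer opts in to the ENIC family by defining its macro... */
#define DEMO_DECL_ENIC(vend, dev) {vend, dev},

/* ...and any family it did not define falls back to an empty expansion. */
#ifndef DEMO_DECL_OTHER
#define DEMO_DECL_OTHER(vend, dev)
#endif

/* Only the families the consumer opted in to end up in the table. */
static const struct demo_device demo_devices[] = {
	DEMO_PCI_DEV_IDS
};

static const size_t demo_device_count =
	sizeof(demo_devices) / sizeof(demo_devices[0]);
```

This is why the patch touches both files: the header gains the default empty definition plus the two id lines, and nic_uio.c gains the one-line definition that turns those lines into table entries.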

[dpdk-dev] How to use dpdk ovs

2015-05-07 Thread Ravi Rao

Hi,
Below are the steps that I followed to get DPDK working with OVS. Some of the 
paths need to be changed for your environment.
#12 Test Switching using OpenVSwitch with DPDK **
#--- Setup the ifaces as dpdk..
cd /home/vnspteam01/dpdk-1.7.1
cd /home/vnspteam01/dpdk-1.8.0
sudo modprobe uio
sudo modprobe cuse
sudo rmmod igb_uio
sudo rmmod rte_kni
sudo insmod x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
# Assign the dpdk capable interfaces to igb_uio driver
tools/dpdk_nic_bind.py --status
sudo /home/vnspteam01/dpdk-1.7.1/tools/dpdk_nic_bind.py -b igb_uio 
:02:00.0
sudo /home/vnspteam01/dpdk-1.7.1/tools/dpdk_nic_bind.py -b igb_uio 
:02:00.1
/home/vnspteam01/dpdk-1.7.1/tools/dpdk_nic_bind.py --status
#--- Setup the openVswitch
su -l
cd /home/vnspteam01/openvswitch-2.3.1
pkill -9 ovs
mkdir -p /usr/local/etc/openvswitch
mkdir -p /usr/local/var/run/openvswitch
rm -rf /usr/local/etc/openvswitch/conf.db
ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db 
vswitchd/vswitch.ovsschema
#Start ovsdb-server
ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock 
--remote=db:Open_vSwitch,Open_vSwitch,manager_options --pidfile --detach
utilities/ovs-vsctl --no-wait init
#Start vswitchd:
export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
vswitchd/ovs-vswitchd --dpdk -c 0x3 -n 4 --syslog syslog -- 
unix:$DB_SOCK --pidfile --detach
#Add bridge & ports
utilities/ovs-vsctl add-br ovs-br0 -- set bridge ovs-br0 
datapath_type=netdev
utilities/ovs-vsctl add-port ovs-br0 dpdk0 -- set Interface dpdk0 type=dpdk
utilities/ovs-vsctl add-port ovs-br0 dpdk1 -- set Interface dpdk1 type=dpdk
#display
utilities/ovs-vsctl show
utilities/ovs-vsctl list interface dpdk0
utilities/ovs-vsctl list interface dpdk1
#Add test flows
utilities/ovs-ofctl del-flows ovs-br0
# Add flows between port 1 (dpdk0) to port 2 (dpdk1)
utilities/ovs-ofctl add-flow ovs-br0 in_port=1,action=output:2
utilities/ovs-ofctl add-flow ovs-br0 in_port=2,action=output:1

Regards,
Ravi
On 05/07/15 08:12, topperxin wrote:
> Hi all
> I'm freshman of dpdk.
> And , I want to use dpdk ovs. I compiled successfully.
> like:
> Bridge "br0"
>Port "br0"
> Interface "br0"
> type: internal
>Port "dpdk0"
> Interface "dpdk0"
> type: dpdk
>  But ,I don't know how to use the dpdk port, how to let the data flow 
> go through the dpdk0?
>  Who can tell me, thanks a lot.
>
>  lx
>

[dpdk-dev] [PATCH v4 1/5] vhost: eventfd_link: moving ioctl to a function

2015-05-07 Thread Xie, Huawei

On 4/3/2015 1:02 AM, Pavel Boldin wrote:
> Move ioctl `EVENTFD_COPY' handler code to an inline function.
Pavel:
There is no necessity to inline this function.
/huawei

[dpdk-dev] [PATCH v5 4/5] vhost: eventfd_link: replace copy-pasted sys_close

2015-05-07 Thread Xie, Huawei

On 4/16/2015 7:48 PM, Pavel Boldin wrote:
> Replace copy-pasted `fget_from_files' -> `filp_close' with
> a `sys_close' call.
>
> Signed-off-by: Pavel Boldin 
> ---
>  lib/librte_vhost/eventfd_link/eventfd_link.c | 49 
> +++-
>  1 file changed, 12 insertions(+), 37 deletions(-)
>
> diff --git a/lib/librte_vhost/eventfd_link/eventfd_link.c 
> b/lib/librte_vhost/eventfd_link/eventfd_link.c
> index 0a06594..9bc52a3 100644
> --- a/lib/librte_vhost/eventfd_link/eventfd_link.c
> +++ b/lib/librte_vhost/eventfd_link/eventfd_link.c
> @@ -88,9 +88,8 @@ eventfd_link_ioctl_copy(unsigned long arg)
>  {
>  
> + /* Closing the source_fd */
> + ret = sys_close(eventfd_copy.source_fd);
Pavel:
Here we close the fd and later re-install a new file on this fd.
sys_close does all the cleanup.
But if, for instance, we allocate a new fd later, it will normally reuse
the fd just freed by sys_close. Is there an issue here?

> + if (ret)
>   goto out_task;
> - }
> -
> - /*
> -  * Release the existing eventfd in the source process
> -  */
> - spin_lock(&files->file_lock);
> - fput(file);
> - filp_close(file, files);
> - fdt = files_fdtable(files);
> - fdt->fd[eventfd_copy.source_fd] = NULL;
> - spin_unlock(&files->file_lock);
> -
> - put_files_struct(files);
> + ret = -ESTALE;
>  
>   /*
>* Find the file struct associated with the target fd.
>*/
>  
> - ret = -ESTALE;
> - files = get_files_struct(task_target);
> - if (files == NULL) {
> + target_files = get_files_struct(task_target);
> + if (target_files == NULL) {
>   pr_info("Failed to get target files struct\n");
>   goto out_task;
>   }
>  
>   ret = -EBADF;
> - file = fget_from_files(files, eventfd_copy.target_fd);
> - put_files_struct(files);
> + target_file = fget_from_files(target_files, eventfd_copy.target_fd);
> + put_files_struct(target_files);
>  
> - if (file == NULL) {
> + if (target_file == NULL) {
>   pr_info("Failed to get fd %d from target\n",
>   eventfd_copy.target_fd);
>   goto out_task;
> @@ -164,7 +139,7 @@ eventfd_link_ioctl_copy(unsigned long arg)
>* file desciptor of the source process,
>*/
>  
> - fd_install(eventfd_copy.source_fd, file);
> + fd_install(eventfd_copy.source_fd, target_file);
>   ret = 0;
>  
>  out_task:
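
The fd reuse question raised in this review can be seen from user space with a short sketch. This is only an illustration of the POSIX rule that open() returns the lowest-numbered descriptor not currently open; it does not exercise the kernel-side sys_close()/fd_install() path in the patch, and whether the same reuse can happen between those two calls in the module is exactly the open question. The helper name is hypothetical, and the sketch assumes a single-threaded process with /dev/null available.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a descriptor, close it, open another one and report whether the
 * second open got the same number back.
 * Returns 1 if the number was reused, 0 if not, -1 on error.
 */
static int
fd_number_is_reused(void)
{
	int first, second, reused;

	first = open("/dev/null", O_RDONLY);
	if (first < 0)
		return -1;

	/* Free the slot, as the close step in the patch does. */
	if (close(first) != 0)
		return -1;

	/* Any allocation that happens now may take the freed slot. */
	second = open("/dev/null", O_RDONLY);
	if (second < 0)
		return -1;

	reused = (second == first);
	close(second);
	return reused;
}
```

If another allocation in the same process slips in between the close and the later install, the number in source_fd may already refer to a different file, which appears to be the scenario the question is about.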

[dpdk-dev] [PATCH 1/3] pci: allow access to PCI config space

2015-05-07 Thread Neil Horman

On Wed, May 06, 2015 at 02:37:06PM -0700, Stephen Hemminger wrote:
> From: Stephen Hemminger 
> 
> Some drivers need ability to access PCI config (for example for power
> management). This adds an abstraction to do this; only implemented
> on Linux, but should be possible on BSD.
> 
You didn't test this with shared libraries.  Not having added these new symbols
to the version map will cause a build break.  Also I think you need to implement
this on BSD, it's not ok to just have bnx2x break on non-linux platforms.

Thanks
Neil

[dpdk-dev] [PATCH] enicpmd: build changes for FreeBSD

2015-05-07 Thread Sujith Sankar (ssujith)



On 06/05/15 9:19 pm, "Bruce Richardson"  wrote:

>On Wed, May 06, 2015 at 02:41:00PM +0530, Sujith Sankar wrote:
>> This patch adds the changes required to build enic for FreeBSD
>> 
>
>Hi,
>
>I see no issues with this patch, but I suggest the description for it
>should
>be changed. There is no actual build problem or error with the enic (at
>least
>not that I can see), it builds fine with gcc and clang currently. This
>patch
>instead adds support for the enic to the nic_uio driver so the enic can be
>"used", not just "built" on FreeBSD. Correct?

Bruce, I fully agree with you.
I shall change the description and re-submit the patch.

Thanks !
-Sujith

>
>Other than that description reworking:
>
>Acked-by: Bruce Richardson 
>
>> Signed-off-by: Sujith Sankar 
>> ---
>>  lib/librte_eal/bsdapp/nic_uio/nic_uio.c |  1 +
>>  lib/librte_eal/common/include/rte_pci_dev_ids.h | 17 +
>>  2 files changed, 18 insertions(+)
>> 
>> diff --git a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
>>b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
>> index 5ae8560..e649e32 100644
>> --- a/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
>> +++ b/lib/librte_eal/bsdapp/nic_uio/nic_uio.c
>> @@ -113,6 +113,7 @@ struct pci_bdf {
>>  #define RTE_PCI_DEV_ID_DECL_I40EVF(vend, dev)  {vend, dev},
>>  #define RTE_PCI_DEV_ID_DECL_VIRTIO(vend, dev)  {vend, dev},
>>  #define RTE_PCI_DEV_ID_DECL_VMXNET3(vend, dev) {vend, dev},
>> +#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)    {vend, dev},
>>  
>>  const struct device devices[] = {
>>  #include 
>> diff --git a/lib/librte_eal/common/include/rte_pci_dev_ids.h
>>b/lib/librte_eal/common/include/rte_pci_dev_ids.h
>> index 21d2eed..5d1b285 100644
>> --- a/lib/librte_eal/common/include/rte_pci_dev_ids.h
>> +++ b/lib/librte_eal/common/include/rte_pci_dev_ids.h
>> @@ -140,6 +140,10 @@
>>  #define RTE_PCI_DEV_ID_DECL_FM10KVF(vend, dev)
>>  #endif
>>  
>> +#ifndef RTE_PCI_DEV_ID_DECL_ENIC
>> +#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev)
>> +#endif
>> +
>>  #ifndef PCI_VENDOR_ID_INTEL
>>  /** Vendor ID used by Intel devices */
>>  #define PCI_VENDOR_ID_INTEL 0x8086
>> @@ -155,6 +159,11 @@
>>  #define PCI_VENDOR_ID_VMWARE 0x15AD
>>  #endif
>>  
>> +#ifndef PCI_VENDOR_ID_CISCO
>> +/** Vendor ID used by Cisco VIC devices */
>> +#define PCI_VENDOR_ID_CISCO 0x1137
>> +#endif
>> +
>>  / Physical EM devices from e1000_hw.h
>>/
>>  
>>  #define E1000_DEV_ID_82542             0x1000
>> @@ -548,6 +557,14 @@ RTE_PCI_DEV_ID_DECL_VMXNET3(PCI_VENDOR_ID_VMWARE,
>>VMWARE_DEV_ID_VMXNET3)
>>  
>>  RTE_PCI_DEV_ID_DECL_FM10KVF(PCI_VENDOR_ID_INTEL, FM10K_DEV_ID_VF)
>>  
>> +/** Cisco VIC devices **/
>> +
>> +#define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic
>>*/
>> +#define PCI_DEVICE_ID_CISCO_VIC_ENET_VF  0x0071  /* enet SRIOV VF
>>*/
>> +
>> +RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO,
>>PCI_DEVICE_ID_CISCO_VIC_ENET)
>> +RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO,
>>PCI_DEVICE_ID_CISCO_VIC_ENET_VF)
>> +
>>  /*
>>   * Undef all RTE_PCI_DEV_ID_DECL_* here.
>>   */
>> -- 
>> 1.9.1
>>

[dpdk-dev] GitHub sandbox for the DPDK community

2015-05-07 Thread Wiles, Keith

I did not finish a thought for some reason.

On 5/6/15, 4:49 PM, "Wiles, Keith"  wrote:

>Hi Thomas, (sorry about the length)
>
>On 5/6/15, 2:37 PM, "Marc Sune"  wrote:
>
>>
>>
>>On 06/05/15 23:09, Thomas Monjalon wrote:
>>> Hello everyone,
>>>
>>> I'm back from mini-holidays and it's good to see that there are
>>> a lot of proposals trying to improve our workflow.
>>> Most of the discussions are focused on process and tools, however
>>> we must keep in mind that submitting clean patches and doing more
>>> reviews can greatly improve the life of the project.
>>> The debate for/against GitHub raises several interesting questions
>>> about different parts of the workflow which deserves some detailed
>>> explanations (and context reminders).
>>>
>>> Previously, there was a discussion about the contribution rules and
>>>tools:
>>> http://dpdk.org/ml/archives/dev/2015-March/015499.html
>>> Then a coding rules discussion was started:
>>> http://dpdk.org/ml/archives/dev/2015-April/016243.html
>>> And a more general thread brought some interesting opinions:
>>> http://dpdk.org/ml/archives/dev/2015-April/016551.html
>>> As a consequence, we are now discussing the workflow and especially
>>> how GitHub could help us.
>
>The emails above show one thing: we cannot make a decision on how to
>proceed. We have no method to decide on a topic. Look at coding style: we
>have yet to make any headway and it is unclear how we can decide on a
>path. We cannot vote and we do not have a king of the repo to make those
>decisions, so the topic just dies without being resolved.
>
>I was hoping that moving to GitHub would give multiple
>persons/companies equal access to the repos/web pages and other functions
>on a third-party site. With this move we would put processes in place to
>start fixing these problems. I know we can do this now, but the move IMO
>was how we get it started. We should start now anyway.
>
>We are all over the world and it would be good to have a neutral worldwide
>site to give everyone an equal foothold in DPDK. I was hoping it would
>reduce some cost and time for 6Wind, but maybe it is considered just the
>cost of doing business for 6Wind.
>
>>> Please note that the follow-up of some of these discussions may be done
>>> by submitting & reviewing patches (e.g. guidelines documents,
>>> tools integration, etc).
>>> Now let's talk about the workflow.
>>>
>>> When the dpdk.org project was started in 2013, it has been decided to
>>>adopt
>>> an email workflow. It is the most common model in projects which are
>>> technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it is a promise
>>>to
>>> attract contributors from these projects. Moreover, the number of
>>>comments
>>> to this thread tends to prove that emails are not dead ;)
>>> See also the number of contributors of previous versions:
>>> 1.6: 25 (2014, April)
>>> 1.7: 46 (2014, September)
>>> 1.8: 54 (2014, December)
>>> 2.0: 60 (2015, April)
>>>
>>> Another choice was done about the number of mailing lists: most of the
>>>traffic
>>> is in only one list (dev@) in order to avoid separation between patches
>>>and
>>> discussions/reports leading to patches. It also allows user questions
>>>to be
>>> read by skilled developers.
>>>
>>> The portal to doc, git and mailing list is the website which is managed
>>>with
>>> git in order to open it when needed and mature enough.
>>> Please find web traffic evolution in the attached file.
>>> There is also a patchwork web interface to ease browsing patches
>>>submitted
>>> to the mailing list. It provides a view on patches status and aggregates
>>> discussions on specific patches. Some improvements are in progress:
>>> http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
>>> https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html
>
>The patchwork site would not be required for GitHub, as you can review and
>see all of the pull requests there. The pull requests are also quick to
>access, sort and manage, IMO better than in patchwork. The feature is
>built into GitHub and we do not need to maintain that site or tool. The
>pull requests can also be placed into given states just like patchwork.
>The patchwork interface is clunky to me, as it seems odd to manage
>patches with it; maybe they can fix the usability issues. The filter button
>is not very visible and when you need to change a set of patches you have to
>do a lot of clicks and back pages to change them all. Maybe I do not know how
>to use the site, but I do not think that is the problem IMO. The GitHub
>one works today without having to fix anything.
>
>>>
>>> There are 3 types of git repositories (http://dpdk.org/browse):
>>>- the main DPDK tree
>>>- subtrees, created on request or external, may help to scale by
>>>providing
>>>  patches ready for merge in the main tree
>>>- side trees, created on request, e.g. dts or pktgen
>
>I like the idea of going to the GitHub page and being

[dpdk-dev] GitHub sandbox for the DPDK community

2015-05-07 Thread Wiles, Keith

Hi Thomas, (sorry about the length)

On 5/6/15, 2:37 PM, "Marc Sune"  wrote:

>
>
>On 06/05/15 23:09, Thomas Monjalon wrote:
>> Hello everyone,
>>
>> I'm back from mini-holidays and it's good to see that there are
>> a lot of proposals trying to improve our workflow.
>> Most of the discussions are focused on process and tools, however
>> we must keep in mind that submitting clean patches and doing more
>> reviews can greatly improve the life of the project.
>> The debate for/against GitHub raises several interesting questions
>> about different parts of the workflow which deserves some detailed
>> explanations (and context reminders).
>>
>> Previously, there was a discussion about the contribution rules and tools:
>>  http://dpdk.org/ml/archives/dev/2015-March/015499.html
>> Then a coding rules discussion was started:
>>  http://dpdk.org/ml/archives/dev/2015-April/016243.html
>> And a more general thread brought some interesting opinions:
>>  http://dpdk.org/ml/archives/dev/2015-April/016551.html
>> As a consequence, we are now discussing the workflow and especially
>> how GitHub could help us.

The emails above show one thing: we cannot make a decision on how to
proceed. We have no method to decide on a topic. Look at coding style: we
have yet to make any headway, and it is unclear how we can decide on a
path. We cannot vote and we do not have a king of the repo to make those
decisions, so a topic just dies without being resolved.

I was hoping that moving to GitHub would give multiple persons/companies
equal access to the repos, web pages and other functions on a third-party
site. With this move we would put processes in place to start fixing these
problems. I know we can do this now, but the move IMO was how we would get
it started. We should start now anyway.

We are all over the world and it would be good to have a neutral worldwide
site to give everyone an equal foothold in DPDK. I was hoping it would
reduce some cost and time for 6WIND, but maybe it is considered just the
cost of doing business for 6WIND.

>> Please note that the follow-up of some of these discussions may be done
>> by submitting & reviewing patches (e.g. guidelines documents,
>> tools integration, etc).
>> Now let's talk about the workflow.
>>
>> When the dpdk.org project was started in 2013, it was decided to adopt
>> an email workflow. It is the most common model in projects which are
>> technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it promises to
>> attract contributors from these projects. Moreover, the number of comments
>> in this thread tends to prove that emails are not dead ;)
>> See also the number of contributors of previous versions:
>>  1.6: 25 (2014, April)
>>  1.7: 46 (2014, September)
>>  1.8: 54 (2014, December)
>>  2.0: 60 (2015, April)
>>
>> Another choice was made about the number of mailing lists: most of the
>> traffic is in only one list (dev@) in order to avoid separation between
>> patches and discussions/reports leading to patches. It also allows user
>> questions to be read by skilled developers.
>>
>> The portal to doc, git and mailing list is the website which is managed with
>> git in order to open it when needed and mature enough.
>> Please find web traffic evolution in the attached file.
>> There is also a patchwork web interface to ease browsing patches submitted
>> to the mailing list. It provides a view of patch status and aggregates
>> discussions on specific patches. Some improvements are in progress:
>>  http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
>>  https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html

The patchwork site would not be required with GitHub, as you can review and
see all of the pull requests there. Pull requests are also quick to access,
sort and manage, IMO better than in patchwork. The feature is
built into GitHub and we do not need to maintain that site or tool. The
pull requests can also be placed into given states just like in patchwork.
The patchwork interface is clunky to me; it is an odd way to manage
patches, though maybe they can fix the usability issues. The filter button is
not very visible, and when you need to change a set of patches you have to do
a lot of clicks and back pages to change them all. Maybe I do not know how
to use the site, but I do not think that is the problem. The GitHub
one works today without having to fix anything.

>>
>> There are 3 types of git repositories (http://dpdk.org/browse):
>>- the main DPDK tree
>>- subtrees, created on request or external, may help to scale by providing
>>  patches ready for merge in the main tree
>>- side trees, created on request, e.g. dts or pktgen

I like the idea of going to the GitHub page and being able to scroll down
the page to see all of the repos at the same time. This way people notice
the other tools and subtrees quickly. I know you can modify the web page
to make it easier

[dpdk-dev] GitHub sandbox for the DPDK community

2015-05-07 Thread Marc Sune



On 06/05/15 23:09, Thomas Monjalon wrote:
> Hello everyone,
>
> I'm back from mini-holidays and it's good to see that there are
> a lot of proposals trying to improve our workflow.
> Most of the discussions are focused on process and tools; however,
> we must keep in mind that submitting clean patches and doing more
> reviews can greatly improve the life of the project.
> The debate for/against GitHub raises several interesting questions
> about different parts of the workflow which deserve some detailed
> explanations (and context reminders).
>
> Previously, there was a discussion about the contribution rules and tools:
>   http://dpdk.org/ml/archives/dev/2015-March/015499.html
> Then a coding rules discussion was started:
>   http://dpdk.org/ml/archives/dev/2015-April/016243.html
> And a more general thread brought some interesting opinions:
>   http://dpdk.org/ml/archives/dev/2015-April/016551.html
> As a consequence, we are now discussing the workflow and especially
> how GitHub could help us.
> Please note that the follow-up of some of these discussions may be done
> by submitting & reviewing patches (e.g. guidelines documents,
> tools integration, etc).
> Now let's talk about the workflow.
>
> When the dpdk.org project was started in 2013, it was decided to adopt
> an email workflow. It is the most common model in projects which are
> technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it promises to
> attract contributors from these projects. Moreover, the number of comments
> in this thread tends to prove that emails are not dead ;)
> See also the number of contributors of previous versions:
>   1.6: 25 (2014, April)
>   1.7: 46 (2014, September)
>   1.8: 54 (2014, December)
>   2.0: 60 (2015, April)
>
> Another choice was made about the number of mailing lists: most of the traffic
> is in only one list (dev@) in order to avoid separation between patches and
> discussions/reports leading to patches. It also allows user questions to be
> read by skilled developers.
>
> The portal to doc, git and mailing list is the website which is managed with
> git in order to open it when needed and mature enough.
> Please find web traffic evolution in the attached file.
> There is also a patchwork web interface to ease browsing patches submitted
> to the mailing list. It provides a view of patch status and aggregates
> discussions on specific patches. Some improvements are in progress:
>   http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
>   https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html
>
> There are 3 types of git repositories (http://dpdk.org/browse):
>   - the main DPDK tree
>   - subtrees, created on request or external, may help to scale by providing
>     patches ready for merge in the main tree
>   - side trees, created on request, e.g. dts or pktgen
> Do not hesitate to request creation of a new tree, it's open.
> Intel has requested some small subtrees which do not seem very useful. We may
> try to organize some new subtrees for bigger areas, which would take care
> of many sections of the MAINTAINERS file. Maybe some dedicated mailing
> lists should be created. These mailing lists and subtrees may be hosted on
> dpdk.org or elsewhere if everybody agrees.
>
> There was no bug tracker initially installed to avoid fragmentation with
> mailing-list discussions. Now that traffic is becoming huge, it appears to be
> a new priority.
>
> Last point in the workflow status: tests and continuous integration.
> It's a complicated topic, especially because DPDK requires some expensive
> infrastructure for the tests. Some people are working on it at Intel and
> 6WIND, so I guess we will have a public discussion in the coming weeks.
>
> After carefully reading previous comments about GitHub hosting, I would like
> to sort pros/cons below.
> Invalidated Pros:
> - web pages system: already possible without GitHub
> - popularity: why would being hosted on GitHub improve visibility?
> Pros:
> - less complicated command lines
> - same view for everyone (independent of MUA features)
> - more code context when reading patches
> - integrated bug tracker
> Cons:
> - full feature usage implies everybody is forced to use it
> - fragmentation between online data and mailing list
> - discussions are not threaded, long discussions not clear
> - editing in browser may be limited
> - no offline access
> - difficult to follow history as we rely on user repositories which may change
> - GitHub (commercial service) is watching us
> - how to leave and migrate data from GitHub?
> - administration issues out of control (see snapshot of today's downtime)
>
> I filed an abuse report for https://github.com/dpdk in case we want to use this
> GitHub account.
> My opinion is that GitHub offers some nice tools and toys but some people
> won't be comfortable with it.
> It may be reasonable to try some features without forcing everyone to migrate,
>

[dpdk-dev] GitHub sandbox for the DPDK community

2015-05-07 Thread Thomas Monjalon

Hello everyone,

I'm back from mini-holidays and it's good to see that there are
a lot of proposals trying to improve our workflow.
Most of the discussions are focused on process and tools; however,
we must keep in mind that submitting clean patches and doing more
reviews can greatly improve the life of the project.
The debate for/against GitHub raises several interesting questions
about different parts of the workflow which deserve some detailed
explanations (and context reminders).

Previously, there was a discussion about the contribution rules and tools:
http://dpdk.org/ml/archives/dev/2015-March/015499.html
Then a coding rules discussion was started:
http://dpdk.org/ml/archives/dev/2015-April/016243.html
And a more general thread brought some interesting opinions:
http://dpdk.org/ml/archives/dev/2015-April/016551.html
As a consequence, we are now discussing the workflow and especially
how GitHub could help us.
Please note that the follow-up of some of these discussions may be done
by submitting & reviewing patches (e.g. guidelines documents,
tools integration, etc).
Now let's talk about the workflow.

When the dpdk.org project was started in 2013, it was decided to adopt
an email workflow. It is the most common model in projects which are
technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it promises to
attract contributors from these projects. Moreover, the number of comments
in this thread tends to prove that emails are not dead ;)
See also the number of contributors of previous versions:
1.6: 25 (2014, April)
1.7: 46 (2014, September)
1.8: 54 (2014, December)
2.0: 60 (2015, April)

Another choice was made about the number of mailing lists: most of the traffic
is in only one list (dev@) in order to avoid separation between patches and
discussions/reports leading to patches. It also allows user questions to be
read by skilled developers.

The portal to doc, git and mailing list is the website which is managed with
git in order to open it when needed and mature enough.
Please find web traffic evolution in the attached file.
There is also a patchwork web interface to ease browsing patches submitted
to the mailing list. It provides a view of patch status and aggregates
discussions on specific patches. Some improvements are in progress:
http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html

There are 3 types of git repositories (http://dpdk.org/browse):
  - the main DPDK tree
  - subtrees, created on request or external, may help to scale by providing
    patches ready for merge in the main tree
  - side trees, created on request, e.g. dts or pktgen
Do not hesitate to request creation of a new tree, it's open.
Intel has requested some small subtrees which do not seem very useful. We may
try to organize some new subtrees for bigger areas, which would take care
of many sections of the MAINTAINERS file. Maybe some dedicated mailing
lists should be created. These mailing lists and subtrees may be hosted on
dpdk.org or elsewhere if everybody agrees.

There was no bug tracker initially installed to avoid fragmentation with
mailing-list discussions. Now that traffic is becoming huge, it appears to be
a new priority.

Last point in the workflow status: tests and continuous integration.
It's a complicated topic, especially because DPDK requires some expensive
infrastructure for the tests. Some people are working on it at Intel and
6WIND, so I guess we will have a public discussion in the coming weeks.

After carefully reading previous comments about GitHub hosting, I would like
to sort pros/cons below.
Invalidated Pros:
- web pages system: already possible without GitHub
- popularity: why would being hosted on GitHub improve visibility?
Pros:
- less complicated command lines
- same view for everyone (independent of MUA features)
- more code context when reading patches
- integrated bug tracker
Cons:
- full feature usage implies everybody is forced to use it
- fragmentation between online data and mailing list
- discussions are not threaded, long discussions not clear
- editing in browser may be limited
- no offline access
- difficult to follow history as we rely on user repositories which may change
- GitHub (commercial service) is watching us
- how to leave and migrate data from GitHub?
- administration issues out of control (see snapshot of today's downtime)

I filed an abuse report for https://github.com/dpdk in case we want to use this
GitHub account.
My opinion is that GitHub offers some nice tools and toys but some people
won't be comfortable with it.
It may be reasonable to try some features without forcing everyone to migrate,
while keeping consistency among all contributors.
Running some tests in a sandbox seems to be a good approach.

Thanks for reading
-- next part --
A non-text attachment was scrubbed...
Name:
