[dpdk-dev] [PATCH v3 0/2] support ethertype filter on fortville

2014-11-26 Thread Thomas Monjalon
> > The patch set supports ethertype filter on fortville.
> > 
> > v3 changes:
> >  - redefine the control packet filter to ethertype filter
> > 
> > v2 changes:
> >  - strip the filter APIs definitions from this patch set
> > 
> > jingjing.wu (2):
> >   ethdev: new structure of Ethertype Filter for filter_ctrl api
> >   i40e: implement operation to add/delete an ethertype filter
> 
> Acked-by: Jijiang Liu 

Applied

As with the flow director API, this new API is expected to be applied to other
drivers as a top priority.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v6 0/6] enicpmd: Cisco Systems Inc. VIC Ethernet PMD

2014-11-26 Thread Thomas Monjalon
> > ENIC PMD is the poll-mode driver for the Cisco Systems Inc. VIC to be
> > used with DPDK suite.
> >
> > Sujith Sankar (6):
> >   enicpmd: License text
> >   enicpmd: Makefile
> >   enicpmd: VNIC common code partially shared with ENIC kernel mode driver
> >   enicpmd: pmd specific code
> >   enicpmd: DPDK-ENIC PMD interface
> >   enicpmd: DPDK changes for accommodating ENIC PMD
> >
> 
> Acked-by: David Marchand 
> 
> Thanks Sujith.

Applied and enabled in BSD configuration (not tested).

It would be nice to have some documentation for this driver now.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH] kni: create KNI interface in current network namespace

2014-11-26 Thread Thomas Monjalon
Could anyone review this KNI patch?

2014-11-21 12:10, Takayuki Usui:
> With this patch, KNI interface (e.g. vEth0) is created in the
> network namespace where the DPDK application is running.
> Otherwise, all interfaces are created in the default namespace
> in the host.
> 
> Signed-off-by: Takayuki Usui 
> ---
>  lib/librte_eal/linuxapp/kni/kni_misc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
> index ba6..f4a9965 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_misc.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
> @@ -354,6 +354,8 @@ kni_ioctl_create(unsigned int ioctl_num, unsigned long ioctl_param)
>   return -EBUSY;
>   }
>  
> + dev_net_set(net_dev, get_net_ns_by_pid(current->pid));
> +
>   kni = netdev_priv(net_dev);
>  
>   kni->net_dev = net_dev;



[dpdk-dev] [PATCH] kni: optimizing the rte_kni_rx_burst

2014-11-26 Thread Thomas Monjalon
Ping

2014-11-11 23:58, Thomas Monjalon:
> Is there anyone interested in KNI to review this patch please?
> 
> 
> 2014-07-23 12:15, Hemant Agrawal:
> > The current implementation of rte_kni_rx_burst polls the fifo for buffers.
> > Irrespective of success or failure, it allocates mbufs and tries to put
> > them into the alloc_q. If the buffers are not added to alloc_q, it frees
> > them. This wastes a lot of CPU cycles allocating and freeing buffers when
> > alloc_q is full.
> > 
> > The logic has been changed to:
> > 1. Initially allocate and add buffers (burst size) to alloc_q.
> > 2. Add buffers to alloc_q only when buffers are being pulled out.
> > 
> > Signed-off-by: Hemant Agrawal 
> > ---
> >  lib/librte_kni/rte_kni.c |8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> > index 76feef4..01e85f8 100644
> > --- a/lib/librte_kni/rte_kni.c
> > +++ b/lib/librte_kni/rte_kni.c
> > @@ -263,6 +263,9 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
> >  
> > ctx->in_use = 1;
> >  
> > +   /* Allocate mbufs and then put them into alloc_q */
> > +   kni_allocate_mbufs(ctx);
> > +
> > return ctx;
> >  
> >  fail:
> > @@ -369,8 +372,9 @@ rte_kni_rx_burst(struct rte_kni *kni, struct rte_mbuf **mbufs, unsigned num)
> >  {
> > unsigned ret = kni_fifo_get(kni->tx_q, (void **)mbufs, num);
> >  
> > -   /* Allocate mbufs and then put them into alloc_q */
> > -   kni_allocate_mbufs(kni);
> > +   /* If buffers removed, allocate mbufs and then put them into alloc_q */
> > +   if(ret)
> > +   kni_allocate_mbufs(kni);
> >  
> > return ret;
> >  }
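
The effect of the change described above can be illustrated with a toy model
that only tracks queue fill levels. This is a sketch, not the librte_kni code:
struct toy_kni, toy_allocate() and toy_rx_burst() are hypothetical names, and
the refill burst of 32 is an arbitrary assumption.

```c
/* Toy model of the KNI queues: only fill levels are tracked, not real mbufs.
 * All names are hypothetical and are not part of the librte_kni API. */
struct toy_kni {
    unsigned tx_q;        /* packets waiting to be read by the application */
    unsigned alloc_q;     /* free buffers currently queued in alloc_q */
    unsigned alloc_q_cap; /* capacity of alloc_q */
    unsigned allocs;      /* buffers allocated so far, to make the cost visible */
};

/* Allocate one burst and queue what fits; the rest would be freed again. */
static void toy_allocate(struct toy_kni *k, unsigned burst)
{
    unsigned room = k->alloc_q_cap - k->alloc_q;

    k->allocs += burst;
    k->alloc_q += (burst < room) ? burst : room;
}

/* Patched behaviour: refill alloc_q only when something was dequeued. */
static unsigned toy_rx_burst(struct toy_kni *k, unsigned num)
{
    unsigned ret = (k->tx_q < num) ? k->tx_q : num;

    k->tx_q -= ret;
    if (ret)
        toy_allocate(k, 32); /* assumed burst size */
    return ret;
}
```

In this model, polling an empty tx_q leaves the allocs counter untouched, which
is the saving the patch is after; before the patch every poll would have paid
for a full allocate-and-free cycle.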



[dpdk-dev] [PATCH] table: hash: fix entry size of configurable key size ext and lru

2014-11-26 Thread Thomas Monjalon
Hi,

2014-08-11 12:43, Takayuki Usui:
> Signed-off-by: Takayuki Usui 
> ---
>  lib/librte_table/rte_table_hash_ext.c | 2 +-
>  lib/librte_table/rte_table_hash_lru.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_table/rte_table_hash_ext.c b/lib/librte_table/rte_table_hash_ext.c
> index 6e26d98..8b86fab 100644
> --- a/lib/librte_table/rte_table_hash_ext.c
> +++ b/lib/librte_table/rte_table_hash_ext.c
> @@ -221,7 +221,7 @@ rte_table_hash_ext_create(void *params, int socket_id, uint32_t entry_size)
>   /* Internal */
>   t->bucket_mask = t->n_buckets - 1;
>   t->key_size_shl = __builtin_ctzl(p->key_size);
> - t->data_size_shl = __builtin_ctzl(p->key_size);
> + t->data_size_shl = __builtin_ctzl(entry_size);
>  
>   /* Tables */
>   table_meta_offset = 0;
> diff --git a/lib/librte_table/rte_table_hash_lru.c b/lib/librte_table/rte_table_hash_lru.c
> index d1a4984..bf92e81 100644
> --- a/lib/librte_table/rte_table_hash_lru.c
> +++ b/lib/librte_table/rte_table_hash_lru.c
> @@ -192,7 +192,7 @@ rte_table_hash_lru_create(void *params, int socket_id, uint32_t entry_size)
>   /* Internal */
>   t->bucket_mask = t->n_buckets - 1;
>   t->key_size_shl = __builtin_ctzl(p->key_size);
> - t->data_size_shl = __builtin_ctzl(p->key_size);
> + t->data_size_shl = __builtin_ctzl(entry_size);
>  
>   /* Tables */
>   table_meta_offset = 0;

A similar patch has been recently applied:
http://dpdk.org/browse/dpdk/commit/?id=8595428e50

Cristian, as the author of this library, it would be appreciated if you
reviewed and acked such patches. It's important to accept contributions and give
credit to the first author of a patch.

Thanks
-- 
Thomas
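
For readers wondering why the one-line fix in the patch above matters: a
"_shl" value derived with __builtin_ctzl() is the log2 of a power-of-two size,
presumably used as a shift to turn an index into a byte offset. The sketch
below assumes exactly that; size_to_shift() and entry_offset() are
hypothetical helpers, not functions from librte_table.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power-of-two size, computed the same way as in the patch.
 * __builtin_ctzl is a GCC/Clang builtin and is undefined for 0. */
static uint32_t size_to_shift(uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (uint32_t)__builtin_ctzl(size);
}

/* Byte offset of entry `index` in a flat array of fixed-size entries. */
static uint64_t entry_offset(uint32_t index, uint32_t shift)
{
    return (uint64_t)index << shift;
}
```

With an 8-byte key and a 32-byte entry, a shift derived from the key size
places entry 3 at byte 24 instead of byte 96, so entries would overlap
whenever the two sizes differ.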


[dpdk-dev] [PATCH v5 00/14] Patches for DPDK to support Power architecture

2014-11-26 Thread Thomas Monjalon
2014-11-26 10:32, David Marchand:
> On Tue, Nov 25, 2014 at 11:17 PM, Chao Zhu wrote:
> 
> > The set of patches adds IBM Power architecture to the DPDK. It adds the
> > required support to the EAL library. This set of patches doesn't support
> > full DPDK function on Power processors. To compile on PPC64 architecture,
> > GCC version >= 4.8 must be used. According to Bruce and Neil's comments,
> > this v5 patch removed the common configuration files of Powerpc in v4.
> > Also, it fixed the checkpatch issues in v3.
> >
> > The only unsolved checkpatch issue is :
> > ERROR: space prohibited before open square bracket '['
> >
> > This issue refers to the asm code input/output naming. But I think the
> > error is invalid.
> >
> >
> I must admit that the architecture abstraction in dpdk is not fully done,
> but it is more a "core" problem than a problem of this port itself.
> I am pretty sure that there are still a lot of places in dpdk that rely on
> the fact that they were written for x86 architecture.
> 
> Neil's concerns on cpuflags (
> http://dpdk.org/ml/archives/dev/2014-November/008769.html) are valid but I
> think we could go with an incremental approach.
> Since Chao is committed to add power support to dpdk, we can have this fixed
> in subsequent patches with this patchset in 1.8.
> 
> 
> So, this patchset looks good enough to me.
> Acked-by: David Marchand 

Applied.

Nice to see a new architecture.
We'll have to make things more generic and do more tests but this is a good
first step.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v4 12/13] testpmd: support TSO in csum forward engine

2014-11-26 Thread Ananyev, Konstantin
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 3:05 PM
> To: dev at dpdk.org
> Cc: olivier.matz at 6wind.com; Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong; jigsaw at gmail.com; Richardson, Bruce; Ananyev, Konstantin
> Subject: [PATCH v4 12/13] testpmd: support TSO in csum forward engine
> 
> Add two new commands in testpmd:
> 
> - tso set (segsize) (portid)
> - tso show (portid)
> 
> These commands can be used to enable TSO when transmitting TCP packets in
> the csum forward engine. Ex:
> 
>   set fwd csum
>   tx_checksum set ip hw 0
>   tso set 800 0
>   start
> 
> Signed-off-by: Olivier Matz 

Acked-by: Konstantin Ananyev 

> ---
>  app/test-pmd/cmdline.c  | 92 +
>  app/test-pmd/csumonly.c | 64 --
>  app/test-pmd/testpmd.h  |  1 +
>  3 files changed, 139 insertions(+), 18 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 722cd76..2a8c260 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -329,6 +329,14 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "tx_checksum show (port_id)\n"
>   "Display tx checksum offload configuration\n\n"
> 
> + "tso set (segsize) (portid)\n"
> + "Enable TCP Segmentation Offload in csum forward"
> + " engine.\n"
> + "Please check the NIC datasheet for HW limits.\n\n"
> +
> + "tso show (portid)"
> + "Display the status of TCP Segmentation Offload.\n\n"
> +
>   "set fwd (%s)\n"
>   "Set packet forwarding mode.\n\n"
> 
> @@ -2984,6 +2992,88 @@ cmdline_parse_inst_t cmd_tx_cksum_show = {
>   },
>  };
> 
> +/* *** ENABLE HARDWARE SEGMENTATION IN TX PACKETS *** */
> +struct cmd_tso_set_result {
> + cmdline_fixed_string_t tso;
> + cmdline_fixed_string_t mode;
> + uint16_t tso_segsz;
> + uint8_t port_id;
> +};
> +
> +static void
> +cmd_tso_set_parsed(void *parsed_result,
> +__attribute__((unused)) struct cmdline *cl,
> +__attribute__((unused)) void *data)
> +{
> + struct cmd_tso_set_result *res = parsed_result;
> + struct rte_eth_dev_info dev_info;
> +
> + if (port_id_is_invalid(res->port_id))
> + return;
> +
> + if (!strcmp(res->mode, "set"))
> + ports[res->port_id].tso_segsz = res->tso_segsz;
> +
> + if (ports[res->port_id].tso_segsz == 0)
> + printf("TSO is disabled\n");
> + else
> + printf("TSO segment size is %d\n",
> + ports[res->port_id].tso_segsz);
> +
> + /* display warnings if configuration is not supported by the NIC */
> + rte_eth_dev_info_get(res->port_id, &dev_info);
> + if ((ports[res->port_id].tso_segsz != 0) &&
> + (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) == 0) {
> + printf("Warning: TSO enabled but not "
> + "supported by port %d\n", res->port_id);
> + }
> +}
> +
> +cmdline_parse_token_string_t cmd_tso_set_tso =
> + TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
> + tso, "tso");
> +cmdline_parse_token_string_t cmd_tso_set_mode =
> + TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
> + mode, "set");
> +cmdline_parse_token_num_t cmd_tso_set_tso_segsz =
> + TOKEN_NUM_INITIALIZER(struct cmd_tso_set_result,
> + tso_segsz, UINT16);
> +cmdline_parse_token_num_t cmd_tso_set_portid =
> + TOKEN_NUM_INITIALIZER(struct cmd_tso_set_result,
> + port_id, UINT8);
> +
> +cmdline_parse_inst_t cmd_tso_set = {
> + .f = cmd_tso_set_parsed,
> + .data = NULL,
> + .help_str = "Set TSO segment size for csum engine (0 to disable): "
> + "tso set  ",
> + .tokens = {
> + (void *)&cmd_tso_set_tso,
> + (void *)&cmd_tso_set_mode,
> + (void *)&cmd_tso_set_tso_segsz,
> + (void *)&cmd_tso_set_portid,
> + NULL,
> + },
> +};
> +
> +cmdline_parse_token_string_t cmd_tso_show_mode =
> + TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
> + mode, "show");
> +
> +
> +cmdline_parse_inst_t cmd_tso_show = {
> + .f = cmd_tso_set_parsed,
> + .data = NULL,
> + .help_str = "Show TSO segment size for csum engine: "
> + "tso show ",
> + .tokens = {
> + (void *)&cmd_tso_set_tso,
> + (void *)&cmd_tso_show_mode,
> + (void *)&cmd_tso_set_portid,
> + NULL,
> + },
> +};
> +
>  /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
>  struct cmd_set_flush_rx {
>   cmdline_fixed_string_t set;
> @@ -8660,6 +8750,8 @@ cmdline_parse_ctx_t main_ctx[] = {
>   (cmdline_parse_inst_t *)&cmd_tx_vlan_s

[dpdk-dev] [PATCH v4 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Ananyev, Konstantin
Hi Olivier,

> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 3:05 PM
> To: dev at dpdk.org
> Cc: olivier.matz at 6wind.com; Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong; jigsaw at gmail.com; Richardson, Bruce; Ananyev, Konstantin
> Subject: [PATCH v4 08/13] testpmd: rework csum forward engine
> 
> The csum forward engine was becoming too complex to be used and
> extended (the next commits want to add the support of TSO):
> 
> - no explanation about what the code does
> - code is not factorized, lots of code duplicated, especially between
>   ipv4/ipv6
> - user command line api: use of bitmasks that need to be calculated by
>   the user
> - the user flags don't have the same semantic:
>   - for legacy IP/UDP/TCP/SCTP, it selects software or hardware checksum
>   - for other (vxlan), it selects between hardware checksum or no
> checksum
> - the code relies too much on flags set by the driver without software
>   alternative (ex: PKT_RX_TUNNEL_IPV4_HDR). It is nice to be able to
>   compare a software implementation with the hardware offload.
> 
> This commit tries to fix these issues, and provide a simple definition
> of what is done by the forward engine:
> 
>  * Receive a burst of packets, and for supported packet types:
>  *  - modify the IPs
>  *  - reprocess the checksum in SW or HW, depending on testpmd command line
>  *    configuration
>  * Then packets are transmitted on the output port.
>  *
>  * Supported packets are:
>  *   Ether / (vlan) / IP|IP6 / UDP|TCP|SCTP .
>  *   Ether / (vlan) / IP|IP6 / UDP / VxLAN / Ether / IP|IP6 / UDP|TCP|SCTP
>  *
>  * The network parser supposes that the packet is contiguous, which may
>  * not be the case in real life.

2 small things, see below.
Sorry, I probably was not very clear about those new flags.
Konstantin

> 
> Signed-off-by: Olivier Matz 
> ---
>  app/test-pmd/cmdline.c  | 156 ---
>  app/test-pmd/config.c   |  13 +-
>  app/test-pmd/csumonly.c | 679 ++--
>  app/test-pmd/testpmd.h  |  17 +-
>  4 files changed, 440 insertions(+), 425 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index bb4e75c..722cd76 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -316,19 +316,19 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "Disable hardware insertion of a VLAN header in"
>   " packets sent on a port.\n\n"
> 
> - "tx_checksum set (mask) (port_id)\n"
> - "Enable hardware insertion of checksum offload with"
> - " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
> - "bit 0 - insert ip   checksum offload if set\n"
> - "bit 1 - insert udp  checksum offload if set\n"
> - "bit 2 - insert tcp  checksum offload if set\n"
> - "bit 3 - insert sctp checksum offload if set\n"
> - "bit 4 - insert inner ip  checksum offload if set\n"
> - "bit 5 - insert inner udp checksum offload if set\n"
> - "bit 6 - insert inner tcp checksum offload if set\n"
> - "bit 7 - insert inner sctp checksum offload if set\n"
> + "tx_cksum set (ip|udp|tcp|sctp|vxlan) (hw|sw) (port_id)\n"
> + "Select hardware or software calculation of the"
> + " checksum with when transmitting a packet using the"
> + " csum forward engine.\n"
> + "ip|udp|tcp|sctp always concern the inner layer.\n"
> + "vxlan concerns the outer IP and UDP layer (in"
> + " case the packet is recognized as a vxlan packet by"
> + " the forward engine)\n"
>   "Please check the NIC datasheet for HW limits.\n\n"
> 
> + "tx_checksum show (port_id)\n"
> + "Display tx checksum offload configuration\n\n"
> +
>   "set fwd (%s)\n"
>   "Set packet forwarding mode.\n\n"
> 
> @@ -2855,48 +2855,131 @@ cmdline_parse_inst_t cmd_tx_vlan_reset = {
> 
> 
>  /* *** ENABLE HARDWARE INSERTION OF CHECKSUM IN TX PACKETS *** */
> -struct cmd_tx_cksum_set_result {
> +struct cmd_tx_cksum_result {
>   cmdline_fixed_string_t tx_cksum;
> - cmdline_fixed_string_t set;
> - uint8_t cksum_mask;
> + cmdline_fixed_string_t mode;
> + cmdline_fixed_string_t proto;
> + cmdline_fixed_string_t hwsw;
>   uint8_t port_id;
>  };
> 
>  static void
> -cmd_tx_cksum_set_parsed(void *parsed_result,
> +cmd_tx_cksum_parsed(void *parsed_result,
>  __attribute__((unused)) struct cmdline *cl,
>  __a

[dpdk-dev] [PATCH v4 00/13] add TSO support

2014-11-26 Thread Thomas Monjalon
2014-11-26 16:04, Olivier Matz:
> This series adds TSO support in ixgbe DPDK driver. This is a rework
> of the series sent earlier this week [1]. This work is based on
> another version [2] that was posted several months ago and
> which included a mbuf rework that is now in mainline.
> 
> Changes in v4:
> 
> - csum fwd engine: use PKT_TX_IPV4 and PKT_TX_IPV6 to tell the hardware
>   the IP version of the packet as suggested by Konstantin.
> - document these 2 flags, explaining they should be set for hw L4 cksum
>   offload or TSO.
> - rebase on latest head
> 
> Changes in v3:
> 
> - indicate that rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name()
>   should be kept synchronized with flags definition
> - use sizeof() when appropriate in rte_raw_cksum()
> - remove double semicolon in ixgbe driver
> - reorder tx ol_flags as requested by Thomas
> - add missing copyrights when big modifications are made
> - enhance the help of tx_cksum command in testpmd
> - enhance the description of csumonly (comments)
> 
> Changes in v2:
> 
> - move rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name() in
>   rte_mbuf.c, and fix comments
> - use IGB_TX_OFFLOAD_MASK and IXGBE_TX_OFFLOAD_MASK to replace
>   PKT_TX_OFFLOAD_MASK
> - fix inner_l2_len and inner_l3_len bitfields: use uint64_t instead
>   of uint16_t
> - replace assignment of l2_len and l3_len by assignment of tx_offload.
>   It now includes inner_l2_len and inner_l3_len at the same time.
> - introduce a new cksum api in rte_ip.h following discussion with
>   Konstantin
> - reorder commits to have all TSO commits at the end of the series
> - use ol_flags for phdr checksum calculation (this now matches ixgbe
>   API: standard pseudo hdr cksum for TCP cksum offload, pseudo hdr
>   cksum without ip paylen for TSO). This will probably be changed
>   with a dev_prep_tx() like function for 2.0 release.
> - rebase on latest head
> 
> 
> This series first fixes some bugs that were discovered during the
> development, adds some changes to the mbuf API (new l4_len and
> tso_segsz fields), adds TSO support in ixgbe, reworks testpmd
> csum forward engine, and finally adds TSO support in testpmd so it
> can be validated.
> 
> The new fields added in mbuf try to be generic enough to apply to
> other hardware in the future. To delegate the TCP segmentation to the
> hardware, the user has to:
> 
>   - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> PKT_TX_TCP_CKSUM)
>   - set the flag PKT_TX_IPV4 or PKT_TX_IPV6
>   - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
> to 0 in the packet
>   - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
> 
>   - calculate the pseudo header checksum without taking ip_len into account,
> and set it in the TCP header, for instance by using
> rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
> 
> The test report will be added as an answer to this cover letter and
> could be linked in the concerned commits.
> 
> [1] http://dpdk.org/ml/archives/dev/2014-November/007953.html
> [2] http://dpdk.org/ml/archives/dev/2014-May/002537.html
> 
> Olivier Matz (13):
>   igb/ixgbe: fix IP checksum calculation
>   ixgbe: fix remaining pkt_flags variable size to 64 bits
>   mbuf: reorder tx ol_flags
>   mbuf: add help about TX checksum flags
>   mbuf: remove too specific PKT_TX_OFFLOAD_MASK definition
>   mbuf: add functions to get the name of an ol_flag
>   testpmd: fix use of offload flags in testpmd
>   testpmd: rework csum forward engine
>   mbuf: introduce new checksum API
>   mbuf: generic support for TCP segmentation offload
>   ixgbe: support TCP segmentation offload
>   testpmd: support TSO in csum forward engine
>   testpmd: add a verbose mode csum forward engine

Applied.

This feature triggered an important mbuf rework which was integrated previously.
Thank you Olivier and others for your work for several months.

On this base, some adjustments may be needed, especially for VXLAN.
Please, let's quickly close these developments for the DPDK 1.8.0 release.

-- 
Thomas
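
As a rough illustration of the last step listed in the cover letter (the
pseudo header checksum computed without the IP payload length), here is a
standalone C sketch. It does not use DPDK headers: fold16() and
toy_ipv4_phdr_sum() are hypothetical names, addresses are taken in host byte
order, and returning the folded sum without a final inversion is an
assumption about what the hardware expects. The helper actually named in the
cover letter is rte_ipv4_phdr_cksum(ip_hdr, ol_flags).

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* One's-complement sum over the IPv4 pseudo header:
 * source address, destination address, zero byte + protocol, L4 length.
 * For TSO the length word is left at 0, as described in the cover letter.
 * src and dst are IPv4 addresses in host byte order (an assumption made
 * here to keep the sketch free of byte-order helpers). */
static uint16_t toy_ipv4_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto,
                                  uint16_t l4_len, int tso)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xffff;
    sum += dst >> 16;
    sum += dst & 0xffff;
    sum += proto;            /* zero byte followed by the protocol byte */
    sum += tso ? 0 : l4_len; /* length is omitted when TSO is requested */
    return fold16(sum);
}
```

For 192.168.0.1 to 192.168.0.2 with TCP (protocol 6) and a 20-byte L4 length,
the regular sum is 0x816e and the TSO variant is 0x815a; the two differ only
by the length word.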


[dpdk-dev] maximum line size on patch

2014-11-26 Thread Thomas Monjalon
2014-11-26 17:31, De Lara Guarch, Pablo:
> I am trying to send a patch for new sample app UG, but the patch cannot be 
> sent because I am hitting the maximum line size on the patch.
> 
> fatal: 
> /tmp/35JFqgAmCA/0001-doc-Added-new-sample-app-UG-for-VM-power-management.patch:
>  29: patch contains a line longer than 998 characters
> 
> This is due to the included svg files. Is there any way I can include them on 
> the patch? Any other way?

Could you try --no-validate?

-- 
Thomas


[dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management

2014-11-26 Thread Thomas Monjalon
Hi Pablo and Alan,

2014-11-25 16:18, Pablo de Lara:
> Virtual Machine Power Management.
> 
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> 
> The current librte_power effects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> 
> General Overview:(more information in each patch that follows).
> The VM Power Management solution provides two components:
> 
>  1)VM: Allows a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virtio-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests for
>  frequency change to a host based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.'
>  which is opened in non-blocking mode.
>  While each Virtual CPU can be mapped to multiple physical CPUs it is
>  recommended that each vCPU should be mapped to a single core only.
> 
>  2)Host: The host monitor is managed by a CLI, it allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
>  channels.
>  Host channel endpoints are Virtio-Serial endpoints configured as AF_UNIX file
>  sockets which follow a specific naming convention
>  i.e /tmp/powermonitor/.,
>  each channel has a 1:1 mapping to a VM endpoint
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.
>  Host channel endpoints are opened in non-blocking mode and are monitored via
>  epoll.
>  Requests over each channel to change frequency are forwarded to the original
>  librte_power.
>  
> Channels must be manually configured as qemu-kvm command line arguments or
> libvirt domain definition(xml) e.g.
> 
>  
> 
> 
>   
>   
> 
> 
> Where multiple channels can be configured by specifying multiple 
> elements, by replacing , .
> (port number) should be incremented by 1 for each new channel element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the Hypervisor creation of channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> 
> The host application runs on two separate lcores:
> Core N) CLI: For management of Virtual Machines adding channels to Monitor
>  thread, inspecting state and manually setting CPU frequency [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel
>  events from VMs and calls the corresponding librte_power functions.
> 
> A sample application is also provided to run on Virtual Machines, this
> application provides a CLI to manually set the frequency of a 
> vCPU[PATCH 08/09]
> 
> The current l3fwd-power sample application can also be run on a VM.
> 
> Changes in V6:
>  Fixed typos and some missing indentation and blank lines
> 
> Changes in V5:
>  Fixed default target in sample app Makefiles
> 
> Changes in V4:
>  Fixed double free of channel during VM shutdown.
> 
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
> 
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> 
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests

Thanks to my shiny updated checkpatch, I was able to fix these 2 typos:

WARNING:MISSING_SPACE: break quoted strings at a space character
#831: FILE: examples/vm_power_manager/channel_manager.c:722:
+   RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+   "already established\n", path);

WARNING:MISSING_SPACE: break quoted strings at a space character
#1424: FILE: examples/vm_power_manager/channel_monitor.c:181:
+   RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for"
+   "epoll events\n");
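
The reason checkpatch flags these: adjacent string literals in C are joined
by the compiler with nothing inserted between them, so the separating space
has to sit inside one of the pieces. A minimal sketch using the second
message above; msg_missing_space() and msg_fixed() are hypothetical names
used only for illustration.

```c
/* Adjacent string literals are concatenated at compile time with no
 * separator, so this reads "...rte_malloc forepoll events". */
static const char *msg_missing_space(void)
{
    return "Unable to rte_malloc for"
           "epoll events\n";
}

/* Breaking the string after a space keeps the words apart. */
static const char *msg_fixed(void)
{
    return "Unable to rte_malloc for "
           "epoll events\n";
}
```

The same applies to the first warning, where "connection" and "already"
would otherwise run together in the log output.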

This codebase is

[dpdk-dev] maximum line size on patch

2014-11-26 Thread De Lara Guarch, Pablo
Hi,

I am trying to send a patch for new sample app UG, but the patch cannot be sent 
because I am hitting the maximum line size on the patch.

fatal: 
/tmp/35JFqgAmCA/0001-doc-Added-new-sample-app-UG-for-VM-power-management.patch: 
29: patch contains a line longer than 998 characters

This is due to the included svg files. Is there any way I can include them on 
the patch? Any other way?

Thanks,
Pablo


[dpdk-dev] [PATCH v3] examples/skeleton: very simple code for packet forwarding

2014-11-26 Thread Thomas Monjalon
> This is a very simple example app for doing packet forwarding with the
> Intel DPDK. It's designed to serve as a start point for people new to
> the Intel DPDK and who want to develop a new app.
> 
> Therefore it's meant to:
> * have as good a performance out-of-the-box as possible, using the
>   best-known settings for configuring the PMDs, so that any new apps can
>   be based off it.
> * be kept as short as possible to make it easy to understand it and get
>   started with it.
> 
> Signed-off-by: Bruce Richardson 

Applied

I think we should add some references to this example in the documentation.
It will really help newcomers.

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v3] examples/skeleton: very simple code for packet forwarding

2014-11-26 Thread Thomas Monjalon
2014-11-26 15:49, Bruce Richardson:
> On Wed, Nov 26, 2014 at 04:42:06PM +0100, Thomas Monjalon wrote:
> > Bruce,
> > 
> > I made some minor changes to the skeleton app.
> > Could you confirm it's ok for you?
> 
> Yes, they are fine for me, though the makefile diff looks messed up compared
> to the original version. Any way to force git to recognise it as a completely
> new file and not a copy of the higher level examples makefile?

Yes, the default is to recognise similarity with at least 50%.
The option -M80% raises the threshold to 80%.
I don't know how to put this configuration in gitconfig.

-- 
Thomas


[dpdk-dev] [PATCH v3 0/2] ADD mode 5(tlb) to link bonding pmd

2014-11-26 Thread Mrzyglod, DanielX T
v3 change:
Rebase patch version to HEAD of origin/master.
Unit tests moved to the separate patch v3 2/2.

v2 change:
Add Unit Tests
Modification that updates obytes structure in virtualpmd driver.
change internals->slaves[i].last_obytes to have proper values.
Update codebase to Declan's patches.

v1 change:
Add support for mode 5 (Transmit load balancing) into pmd driver

This mode provides an adaptive transmit load balancing. 
It dynamically changes the transmitting slave, according to the computed load. 
Statistics are collected in 100ms intervals and scheduled every 10ms.

> -Original Message-
> From: Mrzyglod, DanielX T
> Sent: Wednesday, November 26, 2014 6:13 PM
> To: dev at dpdk.org
> Cc: Mrzyglod, DanielX T
> Subject: [PATCH v3 0/2] ADD mode 5(tlb) to link bonding pmd
> 
> This mode provides an adaptive transmit load balancing.
> It dynamically changes the transmitting slave, according to the computed load.
> Statistics are collected in 100ms intervals and scheduled every 10ms.
> 
> Daniel Mrzyglod (2):
>   This patch add support of mode 5 to link bonding pmd
>   Unit tests for Mode 5 of Bonding Transmit Load balancing.
> 
>  app/test/test_link_bonding.c   | 499 -
>  app/test/virtual_pmd.c |   6 +-
>  lib/librte_pmd_bond/rte_eth_bond.h |  11 +
>  lib/librte_pmd_bond/rte_eth_bond_args.c|   1 +
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c | 160 -
>  lib/librte_pmd_bond/rte_eth_bond_private.h |   2 +-
>  6 files changed, 673 insertions(+), 6 deletions(-)
> 
> --
> 2.1.0



[dpdk-dev] [PATCH v3 2/2] Unit tests for Mode 5 of Bonding Transmit Load balancing.

2014-11-26 Thread Daniel Mrzyglod
This patch adds unit tests for mode 5 (tlb) to the other
link bonding unit tests.

Signed-off-by: Daniel Mrzyglod 
---
 app/test/test_link_bonding.c | 499 ++-
 app/test/virtual_pmd.c   |   6 +-
 2 files changed, 502 insertions(+), 3 deletions(-)

diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 93449af..f62c490 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -41,7 +41,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 #include 
 #include 
@@ -3856,6 +3856,498 @@ testsuite_teardown(void)
return remove_slaves_and_stop_bonded_device();
 }

+static int
+test_tlb_tx_burst(void)
+{
+   int i, burst_size, nb_tx;
+   uint64_t nb_tx2 = 0;
+   struct rte_mbuf *pkt_burst[MAX_PKT_BURST];
+   struct rte_eth_stats port_stats[32];
+   uint64_t sum_ports_opackets = 0, all_bond_opackets = 0, all_bond_obytes = 0;
+   uint16_t pktlen;
+   uint64_t floor_obytes = 0, ceiling_obytes = 0;
+
+   TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves
+   (BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING, 1, 3, 1),
+   "Failed to initialise bonded device");
+
+   burst_size = 20 * test_params->bonded_slave_count;
+
+   TEST_ASSERT(burst_size < MAX_PKT_BURST,
+   "Burst size specified is greater than supported.\n");
+
+
+   /* Generate 40 test bursts in 2s of packets to transmit  */
+   for (i = 0; i < 40; i++) {
+   /*test two types of mac src own(bonding) and others */
+   if (i % 2 == 0) {
+   initialize_eth_header(test_params->pkt_eth_hdr,
+   (struct ether_addr *)src_mac,
+   (struct ether_addr *)dst_mac_0, 0, 0);
+   } else {
+   initialize_eth_header(test_params->pkt_eth_hdr,
+   (struct ether_addr *)test_params->default_slave_mac,
+   (struct ether_addr *)dst_mac_0, 0, 0);
+   }
+   pktlen = initialize_udp_header(test_params->pkt_udp_hdr, src_port,
+   dst_port_0, 16);
+   pktlen = initialize_ipv4_header(test_params->pkt_ipv4_hdr, src_addr,
+   dst_addr_0, pktlen);
+   generate_packet_burst(test_params->mbuf_pool, pkt_burst,
+   test_params->pkt_eth_hdr, 0, test_params->pkt_ipv4_hdr,
+   1, test_params->pkt_udp_hdr, burst_size, 60, 1);
+   /* Send burst on bonded port */
+   nb_tx = rte_eth_tx_burst(test_params->bonded_port_id, 0, pkt_burst,
+   burst_size);
+   nb_tx2 += nb_tx;
+
+   TEST_ASSERT_EQUAL(nb_tx, burst_size,
+   "number of packet not equal burst size");
+
+   rte_delay_us(5);
+   }
+
+
+   /* Verify bonded port tx stats */
+   rte_eth_stats_get(test_params->bonded_port_id, &port_stats[0]);
+
+   all_bond_opackets = port_stats[0].opackets;
+   all_bond_obytes = port_stats[0].obytes;
+
+   TEST_ASSERT_EQUAL(port_stats[0].opackets, (uint64_t)nb_tx2,
+   "Bonded Port (%d) opackets value (%u) not as expected (%u)\n",
+   test_params->bonded_port_id, (unsigned int)port_stats[0].opackets,
+   (unsigned int)nb_tx2);
+
+
+   /* Verify slave ports tx stats */
+   for (i = 0; i < test_params->bonded_slave_count; i++) {
+   rte_eth_stats_get(test_params->slave_port_ids[i], 
&port_stats[i]);
+   sum_ports_opackets += port_stats[i].opackets;
+   }
+
+   TEST_ASSERT_EQUAL(sum_ports_opackets, (uint64_t)all_bond_opackets,
+   "Total packets sent by slaves is not equal to packets 
sent by bond interface");
+   /* distribution of packets on each slave within +/- 10% of the expected 
value. */
+   for (i = 0; i < test_params->bonded_slave_count; i++) {
+
+   floor_obytes = 
(all_bond_obytes*90)/(test_params->bonded_slave_count*100);
+   ceiling_obytes = 
(all_bond_obytes*110)/(test_params->bonded_slave_count*100);
+   TEST_ASSERT(port_stats[i].obytes >= floor_obytes &&
+   port_stats[i].obytes <= ceiling_obytes,
+   "Distribution is not even");
+   }
+   /* Put all slaves down and try and transmit */
+   for (i = 0; i < test_params->bonded_slave_count; i++) {
+   virtual_ethdev_simulate_link_status_interrupt(
+   test_params->slave_port_ids[i], 0);
+   }
+
+   /* Send burst on bonded port */
+   nb_tx = rte_eth_tx_burst(test_params->bonded_port_id, 0, pkt_burst,
+   bur

[dpdk-dev] [PATCH v3 1/2] This patch add support of mode 5 to link bonding pmd

2014-11-26 Thread Daniel Mrzyglod
v3 change:
Rebase patch version to HEAD of origin/master.
Unit tests moved to the separate patch v3 2/2.

v2 change:
Add Unit Tests
Modification that updates obytes structure in virtualpmd driver.
change internals->slaves[i].last_obytes to have proper values.
Update codebase to Declan's patches.

v1 change:
Add support for mode 5 (Transmit load balancing) into pmd driver

This mode provides an adaptive transmit load balancing. 
It dynamically changes the transmitting slave, according to the computed load. 
Statistics are collected in 100ms intervals and scheduled every 10ms.

Signed-off-by: Daniel Mrzyglod 
---
 lib/librte_pmd_bond/rte_eth_bond.h |  11 ++
 lib/librte_pmd_bond/rte_eth_bond_args.c|   1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 160 -
 lib/librte_pmd_bond/rte_eth_bond_private.h |   2 +-
 4 files changed, 171 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_bond/rte_eth_bond.h 
b/lib/librte_pmd_bond/rte_eth_bond.h
index 085500b..29b9a89 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -77,6 +77,17 @@ extern "C" {
  * In this mode all transmitted packets will be transmitted on all available
  * active slaves of the bonded. */
 #endif
+#define BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING  (5)
+/**< Adaptive TLB (Mode 5)
+ * Adaptive transmit load balancing: channel bonding that
+ * does not require any special switch support.  The
+ * outgoing traffic is distributed according to the
+ * current load (computed relative to the speed) on each
+ * slave.  Incoming traffic is received by the current
+ * slave.  If the receiving slave fails, another slave
+ * takes over the MAC address of the failed receiving
+ * slave.*/
+
 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
 /**< Layer 2 (Ethernet MAC) */
diff --git a/lib/librte_pmd_bond/rte_eth_bond_args.c 
b/lib/librte_pmd_bond/rte_eth_bond_args.c
index d8ce681..2675cf6 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_args.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_args.c
@@ -173,6 +173,7 @@ bond_ethdev_parse_slave_mode_kvarg(const char *key 
__rte_unused,
 #ifdef RTE_MBUF_REFCNT
case BONDING_MODE_BROADCAST:
 #endif
+   case BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING:
return 0;
default:
RTE_BOND_LOG(ERR, "Invalid slave mode value (%s) specified", 
value);
diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c 
b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index cf2fbab..7a5dae6 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -30,7 +30,7 @@
  *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
-
+#include 
 #include 
 #include 
 #include 
@@ -41,10 +41,15 @@
 #include 
 #include 
 #include 
+#include 

 #include "rte_eth_bond.h"
 #include "rte_eth_bond_private.h"

+#define REORDER_PERIOD_MS 10
+/* Table for statistics in mode 5 TLB */
+static uint64_t tlb_last_obytets[RTE_MAX_ETHPORTS];
+
 static uint16_t
 bond_ethdev_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
@@ -288,6 +293,144 @@ xmit_slave_hash(const struct rte_mbuf *buf, uint8_t 
slave_count, uint8_t policy)
return hash % slave_count;
 }

+struct bwg_slave {
+   uint64_t bwg_left_int;
+   uint64_t bwg_left_remainder;
+   uint8_t slave;
+};
+
+static int
+bandwidth_cmp(const void *a, const void *b)
+{
+   const struct bwg_slave *bwg_a = a;
+   const struct bwg_slave *bwg_b = b;
+   int64_t diff = (int64_t)bwg_b->bwg_left_int - 
(int64_t)bwg_a->bwg_left_int;
+   int64_t diff2 = (int64_t)bwg_b->bwg_left_remainder -
+   (int64_t)bwg_a->bwg_left_remainder;
+   if (diff > 0)
+   return 1;
+   else if (diff < 0)
+   return -1;
+   else if (diff2 > 0)
+   return 1;
+   else if (diff2 < 0)
+   return -1;
+   else
+   return 0;
+}
+
+static void
+bandwidth_left(int port_id, uint64_t load, uint8_t update_idx,
+   struct bwg_slave *bwg_slave)
+{
+   struct rte_eth_link link_status;
+
+   rte_eth_link_get(port_id, &link_status);
+   uint64_t link_bwg = link_status.link_speed * 100ULL / 8;
+   if (link_bwg == 0)
+   return;
+   link_bwg = (link_bwg * (update_idx+1) * REORDER_PERIOD_MS);
+   bwg_slave->bwg_left_int = (link_bwg - 1000*load) / link_bwg;
+   bwg_slave->bwg_left_remainder = (link_bwg - 1000*load) % link_bwg;
+}
+
+static void
+bond_ethdev_update_tlb_slave_cb(void *arg)
+{
+   struct bond_dev_private *internals = arg;
+   struct rte_eth_stats slave_stats;
+   struct bwg_slave bwg_array[RTE_MAX_ETHPORTS];
+   uint8_t slave_count;
+   uint64_t tx_bytes;
+
+   uint8_t update_stats = 0;
+   uint8_t i, slave_id;
+
+   internals->slave

[dpdk-dev] [PATCH v3 0/2] ADD mode 5(tlb) to link bonding pmd

2014-11-26 Thread Daniel Mrzyglod
This mode provides an adaptive transmit load balancing.
It dynamically changes the transmitting slave, according to the computed load.
Statistics are collected in 100ms intervals and scheduled every 10ms.

Daniel Mrzyglod (2):
  This patch add support of mode 5 to link bonding pmd
  Unit tests for Mode 5 of Bonding Transmit Load balancing.

 app/test/test_link_bonding.c   | 499 -
 app/test/virtual_pmd.c |   6 +-
 lib/librte_pmd_bond/rte_eth_bond.h |  11 +
 lib/librte_pmd_bond/rte_eth_bond_args.c|   1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c | 160 -
 lib/librte_pmd_bond/rte_eth_bond_private.h |   2 +-
 6 files changed, 673 insertions(+), 6 deletions(-)

-- 
2.1.0
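
As a rough illustration of the idea described in this cover letter (prefer
the slave with the most bandwidth left), here is a minimal standalone sketch.
It is not the code from the patch: struct slave_load, headroom_cmp() and
order_slaves_by_headroom() are hypothetical names, and the used fraction is
compared by cross-multiplication instead of the integer/remainder split used
in rte_eth_bond_pmd.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-slave record: bytes sent during the last window and
 * the number of bytes the link could have carried in that window. */
struct slave_load {
	uint8_t  slave;
	uint64_t tx_bytes;
	uint64_t capacity_bytes;
};

/* qsort comparator: the slave with the smallest used fraction
 * (tx_bytes / capacity_bytes) sorts first. Fractions are compared by
 * cross-multiplication; values are assumed small enough not to
 * overflow 64 bits. */
static int
headroom_cmp(const void *a, const void *b)
{
	const struct slave_load *sa = a;
	const struct slave_load *sb = b;
	uint64_t lhs = sa->tx_bytes * sb->capacity_bytes;
	uint64_t rhs = sb->tx_bytes * sa->capacity_bytes;

	if (lhs < rhs)
		return -1;
	if (lhs > rhs)
		return 1;
	return 0;
}

/* Reorder the table so that the least loaded slave comes first. */
static void
order_slaves_by_headroom(struct slave_load *tab, size_t n)
{
	qsort(tab, n, sizeof(*tab), headroom_cmp);
}
```

A caller would refresh tx_bytes from the port statistics on every reorder
period, sort, and then try the slaves in table order when transmitting.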



[dpdk-dev] [PATCH v3] examples/skeleton: very simple code for packet forwarding

2014-11-26 Thread Thomas Monjalon
Bruce,

I made some minor changes to the skeleton app.
Could you confirm it's ok for you?

2014-11-26 15:38, Thomas Monjalon:
> v3 changes:
> - rename skeleton_app/ to skeleton/
> - add in examples Makefile
> - fix default target to native
> - reword header guard
> - rename rxRings to rx_rings and txRings to tx_rings



[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Ananyev, Konstantin
Hi Olivier,

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 2:55 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Cc: Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong; jigsaw at gmail.com; 
> Richardson, Bruce
> Subject: Re: [PATCH v3 08/13] testpmd: rework csum forward engine
> 
> Hi Konstantin,
> 
> On 11/26/2014 01:25 PM, Ananyev, Konstantin wrote:
> >> By the way (this is probably off-topic), but I'm wondering if the TX
> >> flags should have the same values as the RX flags:
> >>
> >> #define PKT_TX_IPV4  PKT_RX_IPV4_HDR
> >> #define PKT_TX_IPV6  PKT_RX_IPV6_HDR
> >
> > Thought about that too.
> >  From one side,  it is a bit out of our concept: separate RX and TX flags.
> >  From other side, it allows us to save 2 bits in the ol_flags.
> > Don't have any strong opinion here.
> > What do you think?
> 
> I have no strong opinion either, but I have a preference for 2 different
> bit values for these flags:
> 
> - as you say, it matches the concept (RX and TX flags are separated)
> 
> - 64 bits is a lot, we have some time before there is no more available
>bit... and I hope it will never occur because it would become
>complex for an application to handle them all
> 
> - it will avoid sending a packet with bad info:
>- we receive a Ether/IP6/IP4/L4/data packet
>- the driver sets PKT_RX_IPV6_HDR
>- the stack decapsulates IP6
>- the stack sends the packet, it has the PKT_TX_IPV6 flag but it's an
>  IPv4 packet

Ah yes, you're right: if we keep them the same, then the upper layer always
has to clear PKT_RX_IPV4_HDR | PKT_RX_IPV6_HDR before setting TX offload
flags and passing the packet to the PMD for TX.
And if the upper layer doesn't do that, it might cause a problem.
With your example above, if at the last step the stack sets PKT_TX_IP_CKSUM
for the packet, then the PMD will receive an mbuf with
(PKT_TX_IPV6 | PKT_TX_IP_CKSUM) set.
But of PKT_TX_IP_CKSUM / PKT_TX_IPV6 / PKT_TX_IPV4 only one can be set,
as they are mutually exclusive.
So the i40e PMD will get confused and might not be able to arm the TX
descriptor properly.
So yes, we need to make them proper TX flags.
Thanks for spotting it.
Konstantin

> 
>This is not a real problem as the flag will not be used by the
>driver/hardware (it's only mandatory for hw cksum / tso), but
>it can be confusing.
> 
> Regards,
> Olivier
> 
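
The scenario discussed in this thread can be reproduced in a few lines. This
is only a sketch: the DEMO_* names and bit positions below are hypothetical
and do not match the real values in rte_mbuf.h. It shows a stack that
forwards a decapsulated packet without clearing the RX flags, once with TX
flags aliased to the RX flags and once with dedicated TX bits.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_RX_IPV4_HDR  (1ULL << 5)
#define DEMO_RX_IPV6_HDR  (1ULL << 7)
#define DEMO_TX_IP_CKSUM  (1ULL << 54)

/* Variant 1: TX flags are aliases of the RX flags. */
#define DEMO_TX_IPV4_ALIASED  DEMO_RX_IPV4_HDR
#define DEMO_TX_IPV6_ALIASED  DEMO_RX_IPV6_HDR

/* Variant 2: TX flags have their own bits. */
#define DEMO_TX_IPV4_OWN  (1ULL << 55)
#define DEMO_TX_IPV6_OWN  (1ULL << 56)

/* The stack received Ether/IP6/IP4/L4, stripped the IPv6 header and now
 * requests IPv4 checksum offload. The bug being illustrated is that the
 * RX bits are not cleared before the TX bits are set. */
static uint64_t
tx_flags_after_decap(uint64_t rx_ol_flags)
{
	return rx_ol_flags | DEMO_TX_IP_CKSUM;
}

/* Returns 1 if a driver testing tx_ipv6_flag would think the packet
 * is IPv6. */
static int
claims_ipv6_on_tx(uint64_t ol_flags, uint64_t tx_ipv6_flag)
{
	return (ol_flags & tx_ipv6_flag) != 0;
}
```

With the aliased definition, the stale RX bit reads as "IPv6" next to an
IPv4 checksum request; with separate bits the leftover RX flag is harmless
to the TX path.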



[dpdk-dev] 82576 Error Disabling LPLU D0

2014-11-26 Thread Tomasz K
Hello

I'm trying to run my own app using dpdk 1.7.1 and 82576 NIC but it keeps
failing.
So I tried to run testpmd which also failed.
Then I enabled e1000 debugs and found out that NIC initialization fails
when setting LPLU D0 (code tries to read MDI Control Register which fails)

This is probably a hardware issue, but I'm asking in case anyone has
encountered a similar issue.

Thanks
Tomasz

Setup:
Pentium(R) Dual-Core  CPU  E5400  @ 2.70GHz
3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014 x86_64 x86_64
x86_64 GNU/Linux
82576 Gigabit Network Connection (rev 01)


Logs from EAL Below:
./app/testpmd -c 0x3 -n 2 -- -i
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL:   cannot open VFIO container, error 2 (No such file or directory)
EAL: VFIO support could not be initialized
EAL: Setting up memory...
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f5ecda0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f5ecd60 (size = 0x20)
EAL: Ask a virtual area of 0xac0 bytes
EAL: Virtual area found at 0x7f5ec280 (size = 0xac0)
EAL: Ask a virtual area of 0x3400 bytes
EAL: Virtual area found at 0x7f5e8e60 (size = 0x3400)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f5e8e20 (size = 0x20)
EAL: Ask a virtual area of 0xc0 bytes
EAL: Virtual area found at 0x7f5e8d40 (size = 0xc0)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f5e8d00 (size = 0x20)
EAL: Requesting 512 pages of size 2MB from socket 0
EAL: TSC frequency is ~37353 KHz
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !
EAL: Master core 0 is ready (tid=cedb1840)
PMD: rte_ixgbe_pmd_init():  >>
PMD: rte_ixgbevf_pmd_init(): rte_ixgbevf_pmd_init
PMD: rte_igbvf_pmd_init(): rte_igbvf_pmd_init
EAL: Core 1 is ready (tid=8c5f8700)
EAL: PCI device :01:00.0 on NUMA socket -1
EAL:   probe driver: 8086:10c9 rte_igb_pmd
EAL:   PCI memory mapped at 0x7f5eced5b000
EAL:   PCI memory mapped at 0x7f5e8b9f8000
EAL:   PCI memory mapped at 0x7f5ecedc
PMD: e1000_set_mac_type(): e1000_set_mac_type
PMD: e1000_set_mac_type(): e1000_set_mac_type
PMD: e1000_init_mac_ops_generic(): e1000_init_mac_ops_generic
PMD: e1000_init_phy_ops_generic(): e1000_init_phy_ops_generic
PMD: e1000_init_nvm_ops_generic(): e1000_init_nvm_ops_generic
PMD: e1000_init_function_pointers_82575():
e1000_init_function_pointers_82575
PMD: e1000_null_ops_generic(): e1000_null_ops_generic
PMD: e1000_init_mac_params_82575(): e1000_init_mac_params_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_acquire_swfw_sync_82575(): e1000_acquire_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_release_swfw_sync_82575(): e1000_release_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_acquire_swfw_sync_82575(): e1000_acquire_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_release_swfw_sync_82575(): e1000_release_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_set_mac_type(): e1000_set_mac_type
PMD: e1000_init_mac_ops_generic(): e1000_init_mac_ops_generic
PMD: e1000_init_phy_ops_generic(): e1000_init_phy_ops_generic
PMD: e1000_init_nvm_ops_generic(): e1000_init_nvm_ops_generic
PMD: e1000_init_function_pointers_82575():
e1000_init_function_pointers_82575
PMD: e1000_init_mac_params_82575(): e1000_init_mac_params_82575
PMD: e1000_init_nvm_params_82575(): e1000_init_nvm_params_82575
PMD: e1000_init_phy_params_82575(): e1000_init_phy_params_82575
PMD: e1000_reset_mdicnfg_82580(): e1000_reset_mdicnfg_82580
PMD: e1000_get_phy_id_82575(): e1000_get_phy_id_82575
PMD: e1000_get_phy_id(): e1000_get_phy_id
PMD: __e1000_read_phy_reg_igp(): __e1000_read_phy_reg_igp
PMD: e1000_acquire_phy_82575(): e1000_acquire_phy_82575
PMD: e1000_acquire_swfw_sync_82575(): e1000_acquire_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_put_hw_semaphore_generic
PMD: e1000_read_phy_reg_mdic(): e1000_read_phy_reg_mdic
PMD: e1000_release_phy_82575(): e1000_release_phy_82575
PMD: e1000_release_swfw_sync_82575(): e1000_release_swfw_sync_82575
PMD: e1000_get_hw_semaphore_generic(): e1000_get_hw_semaphore_generic
PMD: e1000_put_hw_semaphore_generic(): e1000_

[dpdk-dev] [PATCH v4 13/13] testpmd: add a verbose mode csum forward engine

2014-11-26 Thread Olivier Matz
If the user specifies 'set verbose 1' in testpmd command line,
the csum forward engine will dump some information about received
and transmitted packets, especially which flags are set and what
values are assigned to l2_len, l3_len, l4_len and tso_segsz.

This can help someone implementing TSO or hardware checksum offload to
understand how to configure the mbufs.

Example of output for one packet:

 --
 rx: l2_len=14 ethertype=800 l3_len=20 l4_proto=6 l4_len=20
 tx: m->l2_len=14 m->l3_len=20 m->l4_len=20
 tx: m->tso_segsz=800
 tx: flags=PKT_TX_IP_CKSUM PKT_TX_TCP_SEG
 --

Signed-off-by: Olivier Matz 
---
 app/test-pmd/csumonly.c | 51 +
 1 file changed, 51 insertions(+)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 490342f..1a67c9d 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -470,6 +470,57 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
m->tso_segsz = tso_segsz;
m->ol_flags = ol_flags;

+   /* if verbose mode is enabled, dump debug info */
+   if (verbose_level > 0) {
+   struct {
+   uint64_t flag;
+   uint64_t mask;
+   } tx_flags[] = {
+   { PKT_TX_IP_CKSUM, PKT_TX_IP_CKSUM },
+   { PKT_TX_UDP_CKSUM, PKT_TX_L4_MASK },
+   { PKT_TX_TCP_CKSUM, PKT_TX_L4_MASK },
+   { PKT_TX_SCTP_CKSUM, PKT_TX_L4_MASK },
+   { PKT_TX_VXLAN_CKSUM, PKT_TX_VXLAN_CKSUM },
+   { PKT_TX_TCP_SEG, PKT_TX_TCP_SEG },
+   };
+   unsigned j;
+   const char *name;
+
+   printf("-\n");
+   /* dump rx parsed packet info */
+   printf("rx: l2_len=%d ethertype=%x l3_len=%d "
+   "l4_proto=%d l4_len=%d\n",
+   l2_len, rte_be_to_cpu_16(ethertype),
+   l3_len, l4_proto, l4_len);
+   if (tunnel == 1)
+   printf("rx: outer_l2_len=%d outer_ethertype=%x "
+   "outer_l3_len=%d\n", outer_l2_len,
+   rte_be_to_cpu_16(outer_ethertype),
+   outer_l3_len);
+   /* dump tx packet info */
+   if ((testpmd_ol_flags & (TESTPMD_TX_OFFLOAD_IP_CKSUM |
+   TESTPMD_TX_OFFLOAD_UDP_CKSUM |
+   TESTPMD_TX_OFFLOAD_TCP_CKSUM |
+   TESTPMD_TX_OFFLOAD_SCTP_CKSUM)) 
||
+   tso_segsz != 0)
+   printf("tx: m->l2_len=%d m->l3_len=%d "
+   "m->l4_len=%d\n",
+   m->l2_len, m->l3_len, m->l4_len);
+   if ((tunnel == 1) &&
+   (testpmd_ol_flags & 
TESTPMD_TX_OFFLOAD_VXLAN_CKSUM))
+   printf("tx: m->inner_l2_len=%d 
m->inner_l3_len=%d\n",
+   m->inner_l2_len, m->inner_l3_len);
+   if (tso_segsz != 0)
+   printf("tx: m->tso_segsz=%d\n", m->tso_segsz);
+   printf("tx: flags=");
+   for (j = 0; j < sizeof(tx_flags)/sizeof(*tx_flags); 
j++) {
+   name = 
rte_get_tx_ol_flag_name(tx_flags[j].flag);
+   if ((m->ol_flags & tx_flags[j].mask) ==
+   tx_flags[j].flag)
+   printf("%s ", name);
+   }
+   printf("\n");
+   }
}
nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
fs->tx_packets += nb_tx;
-- 
2.1.0
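
A note on the { flag, mask } table used in the patch above: comparing
(ol_flags & mask) == flag, instead of testing a single bit, matters when
several flags are encoded as values of one multi-bit field, as the L4
checksum types sharing PKT_TX_L4_MASK appear to be. A minimal sketch of the
same technique, with hypothetical DEMO_* values and a hypothetical
format_flags() helper (not testpmd code):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: a 2-bit "L4 checksum type" field plus one
 * independent bit. The real values are defined in rte_mbuf.h. */
#define DEMO_L4_TCP    (1ULL << 4)
#define DEMO_L4_SCTP   (2ULL << 4)
#define DEMO_L4_UDP    (3ULL << 4)
#define DEMO_L4_MASK   (3ULL << 4)
#define DEMO_IP_CKSUM  (1ULL << 6)

struct flag_desc {
	uint64_t flag;
	uint64_t mask;
	const char *name;
};

static const struct flag_desc demo_flags[] = {
	{ DEMO_IP_CKSUM, DEMO_IP_CKSUM, "IP_CKSUM" },
	{ DEMO_L4_TCP,   DEMO_L4_MASK,  "TCP_CKSUM" },
	{ DEMO_L4_SCTP,  DEMO_L4_MASK,  "SCTP_CKSUM" },
	{ DEMO_L4_UDP,   DEMO_L4_MASK,  "UDP_CKSUM" },
};

/* Write the names of all flags present in ol_flags into buf, each
 * followed by a space. Returns the number of names written. */
static unsigned
format_flags(uint64_t ol_flags, char *buf, size_t len)
{
	size_t i, off = 0;
	unsigned n = 0;

	if (len > 0)
		buf[0] = '\0';
	for (i = 0; i < sizeof(demo_flags) / sizeof(demo_flags[0]); i++) {
		int w;

		/* masked compare: UDP (3) must not also match TCP (1) */
		if ((ol_flags & demo_flags[i].mask) != demo_flags[i].flag)
			continue;
		w = snprintf(buf + off, len - off, "%s ", demo_flags[i].name);
		if (w < 0 || (size_t)w >= len - off)
			break; /* out of space: stop rather than truncate */
		off += (size_t)w;
		n++;
	}
	return n;
}
```

A plain bit test such as (ol_flags & DEMO_L4_TCP) would report TCP for a UDP
packet in this layout, since the UDP value has the TCP bit set.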



[dpdk-dev] [PATCH v4 12/13] testpmd: support TSO in csum forward engine

2014-11-26 Thread Olivier Matz
Add two new commands in testpmd:

- tso set (segsize) (portid)
- tso show (portid)

These commands can be used to enable TSO when transmitting TCP packets in
the csum forward engine. Ex:

  set fwd csum
  tx_checksum set ip hw 0
  tso set 800 0
  start

Signed-off-by: Olivier Matz 
---
 app/test-pmd/cmdline.c  | 92 +
 app/test-pmd/csumonly.c | 64 --
 app/test-pmd/testpmd.h  |  1 +
 3 files changed, 139 insertions(+), 18 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 722cd76..2a8c260 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -329,6 +329,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"tx_checksum show (port_id)\n"
"Display tx checksum offload configuration\n\n"

+   "tso set (segsize) (portid)\n"
+   "Enable TCP Segmentation Offload in csum forward"
+   " engine.\n"
+   "Please check the NIC datasheet for HW limits.\n\n"
+
+   "tso show (portid)\n"
+   "Display the status of TCP Segmentation Offload.\n\n"
+
"set fwd (%s)\n"
"Set packet forwarding mode.\n\n"

@@ -2984,6 +2992,88 @@ cmdline_parse_inst_t cmd_tx_cksum_show = {
},
 };

+/* *** ENABLE HARDWARE SEGMENTATION IN TX PACKETS *** */
+struct cmd_tso_set_result {
+   cmdline_fixed_string_t tso;
+   cmdline_fixed_string_t mode;
+   uint16_t tso_segsz;
+   uint8_t port_id;
+};
+
+static void
+cmd_tso_set_parsed(void *parsed_result,
+  __attribute__((unused)) struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_tso_set_result *res = parsed_result;
+   struct rte_eth_dev_info dev_info;
+
+   if (port_id_is_invalid(res->port_id))
+   return;
+
+   if (!strcmp(res->mode, "set"))
+   ports[res->port_id].tso_segsz = res->tso_segsz;
+
+   if (ports[res->port_id].tso_segsz == 0)
+   printf("TSO is disabled\n");
+   else
+   printf("TSO segment size is %d\n",
+   ports[res->port_id].tso_segsz);
+
+   /* display warnings if configuration is not supported by the NIC */
+   rte_eth_dev_info_get(res->port_id, &dev_info);
+   if ((ports[res->port_id].tso_segsz != 0) &&
+   (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) == 0) {
+   printf("Warning: TSO enabled but not "
+   "supported by port %d\n", res->port_id);
+   }
+}
+
+cmdline_parse_token_string_t cmd_tso_set_tso =
+   TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
+   tso, "tso");
+cmdline_parse_token_string_t cmd_tso_set_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
+   mode, "set");
+cmdline_parse_token_num_t cmd_tso_set_tso_segsz =
+   TOKEN_NUM_INITIALIZER(struct cmd_tso_set_result,
+   tso_segsz, UINT16);
+cmdline_parse_token_num_t cmd_tso_set_portid =
+   TOKEN_NUM_INITIALIZER(struct cmd_tso_set_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_tso_set = {
+   .f = cmd_tso_set_parsed,
+   .data = NULL,
+   .help_str = "Set TSO segment size for csum engine (0 to disable): "
+   "tso set  ",
+   .tokens = {
+   (void *)&cmd_tso_set_tso,
+   (void *)&cmd_tso_set_mode,
+   (void *)&cmd_tso_set_tso_segsz,
+   (void *)&cmd_tso_set_portid,
+   NULL,
+   },
+};
+
+cmdline_parse_token_string_t cmd_tso_show_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_tso_set_result,
+   mode, "show");
+
+
+cmdline_parse_inst_t cmd_tso_show = {
+   .f = cmd_tso_set_parsed,
+   .data = NULL,
+   .help_str = "Show TSO segment size for csum engine: "
+   "tso show ",
+   .tokens = {
+   (void *)&cmd_tso_set_tso,
+   (void *)&cmd_tso_show_mode,
+   (void *)&cmd_tso_set_portid,
+   NULL,
+   },
+};
+
 /* *** ENABLE/DISABLE FLUSH ON RX STREAMS *** */
 struct cmd_set_flush_rx {
cmdline_fixed_string_t set;
@@ -8660,6 +8750,8 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_tx_vlan_set_pvid,
(cmdline_parse_inst_t *)&cmd_tx_cksum_set,
(cmdline_parse_inst_t *)&cmd_tx_cksum_show,
+   (cmdline_parse_inst_t *)&cmd_tso_set,
+   (cmdline_parse_inst_t *)&cmd_tso_show,
(cmdline_parse_inst_t *)&cmd_link_flow_control_set,
(cmdline_parse_inst_t *)&cmd_link_flow_control_set_rx,
(cmdline_parse_inst_t *)&cmd_link_flow_control_set_tx,
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 9a5408e..490342f 100644
--- a/app/test-pmd/

[dpdk-dev] [PATCH v4 11/13] ixgbe: support TCP segmentation offload

2014-11-26 Thread Olivier Matz
Implement TSO (TCP segmentation offload) in ixgbe driver. The driver is
now able to use the PKT_TX_TCP_SEG mbuf flag and mbuf hardware offload info
(l2_len, l3_len, l4_len, tso_segsz) to configure the hardware support of
TCP segmentation.

In ixgbe, when doing TSO, the IP length must not be included in the TCP
pseudo header checksum. A new function ixgbe_fix_tcp_phdr_cksum() is
used to fix the pseudo header checksum of the packet before giving it to
the hardware.

In the patch, the tx_desc_cksum_flags_to_olinfo() and
tx_desc_ol_flags_to_cmdtype() functions have been reworked to make them
clearer. This should not impact performance as gcc (version 4.8 in my
case) is smart enough to convert the tests into a code that does not
contain any branch instruction.

Signed-off-by: Olivier Matz 
Acked-by: Konstantin Ananyev 
---
 lib/librte_mbuf/rte_mbuf.h  |   5 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   3 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 170 ++--
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h   |  19 ++--
 4 files changed, 121 insertions(+), 76 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 04cbf41..38d7d0d 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -125,10 +125,10 @@ extern "C" {
 #define PKT_TX_IP_CKSUM  (1ULL << 54)/**< IP cksum of TX pkt. computed by 
NIC. */
 #define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */

-/** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload. */
+/** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload or TSO. 
*/
 #define PKT_TX_IPV4  PKT_RX_IPV4_HDR

-/** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload. */
+/** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload or TSO. 
*/
 #define PKT_TX_IPV6  PKT_RX_IPV6_HDR

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
packet. */
@@ -138,6 +138,7 @@ extern "C" {
  * packet to be transmitted on hardware supporting TSO:
  *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  *PKT_TX_TCP_CKSUM)
+ *  - set the flag PKT_TX_IPV4 or PKT_TX_IPV6
  *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
  *to 0 in the packet
  *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index 08e3db4..937fc3c 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -1973,7 +1973,8 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_IPV4_CKSUM  |
DEV_TX_OFFLOAD_UDP_CKSUM   |
DEV_TX_OFFLOAD_TCP_CKSUM   |
-   DEV_TX_OFFLOAD_SCTP_CKSUM;
+   DEV_TX_OFFLOAD_SCTP_CKSUM  |
+   DEV_TX_OFFLOAD_TCP_TSO;

dev_info->default_rxconf = (struct rte_eth_rxconf) {
.rx_thresh = {
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 2df3385..63216fa 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright 2014 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -94,7 +95,8 @@
 #define IXGBE_TX_OFFLOAD_MASK ( \
PKT_TX_VLAN_PKT |\
PKT_TX_IP_CKSUM |\
-   PKT_TX_L4_MASK)
+   PKT_TX_L4_MASK | \
+   PKT_TX_TCP_SEG)

 static inline struct rte_mbuf *
 rte_rxmbuf_alloc(struct rte_mempool *mp)
@@ -363,59 +365,84 @@ ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf 
**tx_pkts,
 static inline void
 ixgbe_set_xmit_ctx(struct igb_tx_queue* txq,
volatile struct ixgbe_adv_tx_context_desc *ctx_txd,
-   uint64_t ol_flags, uint32_t vlan_macip_lens)
+   uint64_t ol_flags, union ixgbe_tx_offload tx_offload)
 {
uint32_t type_tucmd_mlhl;
-   uint32_t mss_l4len_idx;
+   uint32_t mss_l4len_idx = 0;
uint32_t ctx_idx;
-   uint32_t cmp_mask;
+   uint32_t vlan_macip_lens;
+   union ixgbe_tx_offload tx_offload_mask;

ctx_idx = txq->ctx_curr;
-   cmp_mask = 0;
+   tx_offload_mask.data = 0;
type_tucmd_mlhl = 0;

+   /* Specify which HW CTX to upload. */
+   mss_l4len_idx |= (ctx_idx << IXGBE_ADVTXD_IDX_SHIFT);
+
if (ol_flags & PKT_TX_VLAN_PKT) {
-   cmp_mask |= TX_VLAN_CMP_MASK;
+   tx_offload_mask.vlan_tci = ~0;
}

-   if (ol_flags & PKT_TX_IP_CKSUM) {
-   type_tucmd_mlhl = IXGBE_ADVTXD_TUCMD_IPV4;
-   cmp_mask |= TX_MACIP_LEN_CMP_MASK;
-   }
+   /* check 

[dpdk-dev] [PATCH v4 10/13] mbuf: generic support for TCP segmentation offload

2014-11-26 Thread Olivier Matz
Some of the NICs supported by DPDK have a possibility to accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with valid TCP header with size up to 64K and delegates the
segmentation to the NIC.

Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).

To delegate the TCP segmentation to the hardware, the user has to:

- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
  the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo header checksum without taking ip_len into account,
  and set it in the TCP header, for instance by using
  rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
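
The last step of the list above can be sketched as follows. This is a
standalone illustration, not the rte_ip.h implementation: struct demo_ipv4,
fold16() and demo_ipv4_phdr_sum() are hypothetical, fields are kept in host
byte order for simplicity, and the assumption is that the hardware accounts
for the length of each segment itself when TSO is requested.

```c
#include <stdint.h>

/* Minimal IPv4 fields needed for the pseudo header. Hypothetical type,
 * not the DPDK struct ipv4_hdr; all values are in host byte order. */
struct demo_ipv4 {
	uint32_t src_addr;
	uint32_t dst_addr;
	uint8_t  proto;
	uint16_t total_length; /* IP header + payload, in bytes */
	uint8_t  ihl_bytes;    /* IP header length, in bytes */
};

/* Fold a 32-bit accumulator into a 16-bit one's complement sum. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's complement sum of the pseudo header: source address,
 * destination address, zero + protocol, L4 length. When for_tso is
 * non-zero the L4 length is left out (counted as 0). The result is
 * not inverted. */
static uint16_t
demo_ipv4_phdr_sum(const struct demo_ipv4 *ip, int for_tso)
{
	uint32_t sum = 0;
	uint16_t l4_len = 0;

	if (!for_tso)
		l4_len = (uint16_t)(ip->total_length - ip->ihl_bytes);

	sum += ip->src_addr >> 16;
	sum += ip->src_addr & 0xffff;
	sum += ip->dst_addr >> 16;
	sum += ip->dst_addr & 0xffff;
	sum += ip->proto;
	sum += l4_len;
	return fold16(sum);
}
```

In the TSO case the value no longer depends on total_length, so the same
pseudo header sum stays valid for every segment cut from the large packet.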

The API is inspired by ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hw/drivers in the future.

This commit also reworks the way l2_len and l3_len are used in igb
and ixgbe drivers as the l2_l3_len is not available anymore in mbuf.

Signed-off-by: Mirek Walukiewicz 
Signed-off-by: Olivier Matz 
Acked-by: Konstantin Ananyev 
---
 app/test-pmd/testpmd.c|  2 +-
 examples/ipv4_multicast/main.c|  2 +-
 lib/librte_mbuf/rte_mbuf.c|  1 +
 lib/librte_mbuf/rte_mbuf.h| 45 +++
 lib/librte_net/rte_ip.h   | 39 +++--
 lib/librte_pmd_e1000/igb_rxtx.c   | 11 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 11 +-
 7 files changed, 82 insertions(+), 29 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 8a4190b..d2d127d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -408,7 +408,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
mb->ol_flags = 0;
mb->data_off = RTE_PKTMBUF_HEADROOM;
mb->nb_segs  = 1;
-   mb->l2_l3_len   = 0;
+   mb->tx_offload   = 0;
mb->vlan_tci = 0;
mb->hash.rss = 0;
 }
diff --git a/examples/ipv4_multicast/main.c b/examples/ipv4_multicast/main.c
index 590d11a..80c5140 100644
--- a/examples/ipv4_multicast/main.c
+++ b/examples/ipv4_multicast/main.c
@@ -302,7 +302,7 @@ mcast_out_pkt(struct rte_mbuf *pkt, int use_clone)
/* copy metadata from source packet*/
hdr->port = pkt->port;
hdr->vlan_tci = pkt->vlan_tci;
-   hdr->l2_l3_len = pkt->l2_l3_len;
+   hdr->tx_offload = pkt->tx_offload;
hdr->hash = pkt->hash;

hdr->ol_flags = pkt->ol_flags;
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 9b57b3a..87c2963 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -241,6 +241,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
+   case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
default: return NULL;
}
 }
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 832fe0a..04cbf41 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright 2014 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -132,6 +133,20 @@ extern "C" {

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
packet. */

+/**
+ * TCP segmentation offload. To enable this offload feature for a
+ * packet to be transmitted on hardware supporting TSO:
+ *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
+ *PKT_TX_TCP_CKSUM)
+ *  - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
+ *to 0 in the packet
+ *  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
+ *  - calculate the pseudo header checksum without taking ip_len into account,
+ *and set it in the TCP header. Refer to rte_ipv4_phdr_cksum() and
+ *rte_ipv6_phdr_cksum() that can be used as helpers.
+ */
+#define PKT_TX_TCP_SEG   (1ULL << 49)
+
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

@@ -242,22 +257,18 @@ struct rte_mbuf {

/* fields to support TX offloads */
union {
-   uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
+   uint64_t tx_offload;   /**< combined for easy fetch */
struct {
-   uint16_t l3_len:9;  /**< L3 (IP) Header Length. */
-   uint16_t l2_len:7;  /**< L2 (MAC

[dpdk-dev] [PATCH v4 09/13] mbuf: introduce new checksum API

2014-11-26 Thread Olivier Matz
Introduce new functions to calculate checksums. These new functions
are derived from the ones provided in csumonly.c but slightly reworked.
There is still some room for future optimization of these functions
(maybe SSE/AVX, ...).
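
For readers unfamiliar with it, here is a minimal sketch of the kind of
16-bit one's complement checksum these helpers compute. The names
demo_raw_sum() and demo_cksum() are hypothetical and this is not the code
added to rte_ip.h; it reads the buffer byte by byte as big-endian words so
that it behaves the same on any host, and it assumes short buffers (headers)
so that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a byte buffer as big-endian 16-bit words. An odd trailing byte
 * is treated as the high byte of a final word padded with zero. */
static uint32_t
demo_raw_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Fold the carries back into 16 bits and invert the result. */
static uint16_t
demo_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = demo_raw_sum(buf, len);

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Running demo_cksum() over a header whose checksum field is already filled in
yields 0 when the header is intact, which is the usual verification step.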

This API will be modified in the next commits by the introduction of
TSO that requires a different pseudo header checksum to be set in the
packet.

Signed-off-by: Olivier Matz 
Acked-by: Konstantin Ananyev 
---
 app/test-pmd/csumonly.c| 133 ++--
 lib/librte_mbuf/rte_mbuf.h |   3 +-
 lib/librte_net/rte_ip.h| 183 +
 3 files changed, 193 insertions(+), 126 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 6b28003..9a5408e 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -87,137 +87,22 @@
 #define _htons(x) (x)
 #endif

-static inline uint16_t
-get_16b_sum(uint16_t *ptr16, uint32_t nr)
-{
-   uint32_t sum = 0;
-   while (nr > 1)
-   {
-   sum +=*ptr16;
-   nr -= sizeof(uint16_t);
-   ptr16++;
-   if (sum > UINT16_MAX)
-   sum -= UINT16_MAX;
-   }
-
-   /* If length is in odd bytes */
-   if (nr)
-   sum += *((uint8_t*)ptr16);
-
-   sum = ((sum & 0x) >> 16) + (sum & 0x);
-   sum &= 0x0;
-   return (uint16_t)sum;
-}
-
-static inline uint16_t
-get_ipv4_cksum(struct ipv4_hdr *ipv4_hdr)
-{
-   uint16_t cksum;
-   cksum = get_16b_sum((uint16_t*)ipv4_hdr, sizeof(struct ipv4_hdr));
-   return (uint16_t)((cksum == 0x)?cksum:~cksum);
-}
-
-
-static inline uint16_t
-get_ipv4_psd_sum(struct ipv4_hdr *ip_hdr)
-{
-   /* Pseudo Header for IPv4/UDP/TCP checksum */
-   union ipv4_psd_header {
-   struct {
-   uint32_t src_addr; /* IP address of source host. */
-   uint32_t dst_addr; /* IP address of destination 
host(s). */
-   uint8_t  zero; /* zero. */
-   uint8_t  proto;/* L4 protocol type. */
-   uint16_t len;  /* L4 length. */
-   } __attribute__((__packed__));
-   uint16_t u16_arr[0];
-   } psd_hdr;
-
-   psd_hdr.src_addr = ip_hdr->src_addr;
-   psd_hdr.dst_addr = ip_hdr->dst_addr;
-   psd_hdr.zero = 0;
-   psd_hdr.proto= ip_hdr->next_proto_id;
-   psd_hdr.len  = 
rte_cpu_to_be_16((uint16_t)(rte_be_to_cpu_16(ip_hdr->total_length)
-   - sizeof(struct ipv4_hdr)));
-   return get_16b_sum(psd_hdr.u16_arr, sizeof(psd_hdr));
-}
-
-static inline uint16_t
-get_ipv6_psd_sum(struct ipv6_hdr *ip_hdr)
-{
-   /* Pseudo Header for IPv6/UDP/TCP checksum */
-   union ipv6_psd_header {
-   struct {
-   uint8_t src_addr[16]; /* IP address of source host. */
-   uint8_t dst_addr[16]; /* IP address of destination 
host(s). */
-   uint32_t len; /* L4 length. */
-   uint32_t proto;   /* L4 protocol - top 3 bytes must 
be zero */
-   } __attribute__((__packed__));
-
-   uint16_t u16_arr[0]; /* allow use as 16-bit values with safe 
aliasing */
-   } psd_hdr;
-
-   rte_memcpy(&psd_hdr.src_addr, ip_hdr->src_addr,
-   sizeof(ip_hdr->src_addr) + sizeof(ip_hdr->dst_addr));
-   psd_hdr.len   = ip_hdr->payload_len;
-   psd_hdr.proto = (ip_hdr->proto << 24);
-
-   return get_16b_sum(psd_hdr.u16_arr, sizeof(psd_hdr));
-}
-
 static uint16_t
 get_psd_sum(void *l3_hdr, uint16_t ethertype)
 {
if (ethertype == _htons(ETHER_TYPE_IPv4))
-   return get_ipv4_psd_sum(l3_hdr);
+   return rte_ipv4_phdr_cksum(l3_hdr);
else /* assume ethertype == ETHER_TYPE_IPv6 */
-   return get_ipv6_psd_sum(l3_hdr);
-}
-
-static inline uint16_t
-get_ipv4_udptcp_checksum(struct ipv4_hdr *ipv4_hdr, uint16_t *l4_hdr)
-{
-   uint32_t cksum;
-   uint32_t l4_len;
-
-   l4_len = rte_be_to_cpu_16(ipv4_hdr->total_length) - sizeof(struct 
ipv4_hdr);
-
-   cksum = get_16b_sum(l4_hdr, l4_len);
-   cksum += get_ipv4_psd_sum(ipv4_hdr);
-
-   cksum = ((cksum & 0x) >> 16) + (cksum & 0x);
-   cksum = (~cksum) & 0x;
-   if (cksum == 0)
-   cksum = 0x;
-   return (uint16_t)cksum;
-}
-
-static inline uint16_t
-get_ipv6_udptcp_checksum(struct ipv6_hdr *ipv6_hdr, uint16_t *l4_hdr)
-{
-   uint32_t cksum;
-   uint32_t l4_len;
-
-   l4_len = rte_be_to_cpu_16(ipv6_hdr->payload_len);
-
-   cksum = get_16b_sum(l4_hdr, l4_len);
-   cksum += get_ipv6_psd_sum(ipv6_hdr);
-
-   cksum = ((cksum & 0x) >> 16) + (cksum & 0x);
-   cksum = (~cksum) & 0x;
-   if (cksum == 0)
-   cksum = 0x;
-
-   return (uint16_t)cksum;
+ 

[dpdk-dev] [PATCH v4 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Olivier Matz
The csum forward engine was becoming too complex to be used and
extended (the next commits want to add the support of TSO):

- no explanation of what the code does
- code is not factorized, lots of code duplicated, especially between
  ipv4/ipv6
- user command line api: use of bitmasks that need to be calculated by
  the user
- the user flags don't have the same semantics:
  - for legacy IP/UDP/TCP/SCTP, it selects software or hardware checksum
  - for other (vxlan), it selects between hardware checksum or no
checksum
- the code relies too much on flags set by the driver without software
  alternative (ex: PKT_RX_TUNNEL_IPV4_HDR). It is nice to be able to
  compare a software implementation with the hardware offload.

This commit tries to fix these issues, and provide a simple definition
of what is done by the forward engine:

 * Receive a burst of packets, and for supported packet types:
 *  - modify the IPs
 *  - reprocess the checksum in SW or HW, depending on testpmd command line
 *configuration
 * Then packets are transmitted on the output port.
 *
 * Supported packets are:
 *   Ether / (vlan) / IP|IP6 / UDP|TCP|SCTP .
 *   Ether / (vlan) / IP|IP6 / UDP / VxLAN / Ether / IP|IP6 / UDP|TCP|SCTP
 *
 * The network parser supposes that the packet is contiguous, which may
 * not be the case in real life.
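
To make the "contiguous packet" assumption concrete, here is a rough
sketch of how the first supported layout (Ether / (vlan) / IP|IP6) could
be walked to obtain l2_len, l3_len and the L4 protocol. It is not the
parser from this patch: the sk_* names are hypothetical, only one VLAN
tag is handled, and IPv6 extension headers are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ETHER_TYPE_IPV4 0x0800
#define SK_ETHER_TYPE_VLAN 0x8100
#define SK_ETHER_TYPE_IPV6 0x86DD

/* Hypothetical result structure, for illustration only. */
struct sk_parse_info {
	uint16_t ethertype; /* host byte order */
	uint16_t l2_len;    /* Ethernet header plus optional VLAN tag */
	uint16_t l3_len;    /* IPv4 header (with options) or fixed IPv6 header */
	uint8_t l4_proto;   /* e.g. 6 for TCP, 17 for UDP */
};

/* Read a big-endian 16-bit value from the packet. */
static uint16_t
sk_rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Parse a packet stored in one contiguous buffer.
 * Returns 0 on success, -1 if the packet is truncated or unsupported.
 */
static int
sk_parse(const uint8_t *pkt, size_t len, struct sk_parse_info *info)
{
	size_t off = 14; /* dst MAC + src MAC + ethertype */

	if (len < off)
		return -1;
	info->ethertype = sk_rd16(pkt + 12);
	if (info->ethertype == SK_ETHER_TYPE_VLAN) {
		off += 4; /* TCI + encapsulated ethertype */
		if (len < off)
			return -1;
		info->ethertype = sk_rd16(pkt + 16);
	}
	info->l2_len = (uint16_t)off;

	if (info->ethertype == SK_ETHER_TYPE_IPV4) {
		if (len < off + 20)
			return -1;
		/* IHL is the low nibble of the first byte, in 32-bit words */
		info->l3_len = (uint16_t)((pkt[off] & 0x0f) * 4);
		if (info->l3_len < 20 || len < off + info->l3_len)
			return -1;
		info->l4_proto = pkt[off + 9];
	} else if (info->ethertype == SK_ETHER_TYPE_IPV6) {
		if (len < off + 40)
			return -1;
		info->l3_len = 40;
		info->l4_proto = pkt[off + 6]; /* next header field */
	} else {
		return -1;
	}
	return 0;
}
```

A packet split across several segments would need the bounds checks above
to be replaced by per-segment logic, which is why the description calls
out the contiguity assumption.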

Signed-off-by: Olivier Matz 
---
 app/test-pmd/cmdline.c  | 156 ---
 app/test-pmd/config.c   |  13 +-
 app/test-pmd/csumonly.c | 679 ++--
 app/test-pmd/testpmd.h  |  17 +-
 4 files changed, 440 insertions(+), 425 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb4e75c..722cd76 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -316,19 +316,19 @@ static void cmd_help_long_parsed(void *parsed_result,
"Disable hardware insertion of a VLAN header in"
" packets sent on a port.\n\n"

-   "tx_checksum set (mask) (port_id)\n"
-   "Enable hardware insertion of checksum offload with"
-   " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
-   "bit 0 - insert ip   checksum offload if set\n"
-   "bit 1 - insert udp  checksum offload if set\n"
-   "bit 2 - insert tcp  checksum offload if set\n"
-   "bit 3 - insert sctp checksum offload if set\n"
-   "bit 4 - insert inner ip  checksum offload if 
set\n"
-   "bit 5 - insert inner udp checksum offload if 
set\n"
-   "bit 6 - insert inner tcp checksum offload if 
set\n"
-   "bit 7 - insert inner sctp checksum offload if 
set\n"
+   "tx_cksum set (ip|udp|tcp|sctp|vxlan) (hw|sw) 
(port_id)\n"
+   "Select hardware or software calculation of the"
+   " checksum with when transmitting a packet using the"
+   " csum forward engine.\n"
+   "ip|udp|tcp|sctp always concern the inner layer.\n"
+   "vxlan concerns the outer IP and UDP layer (in"
+   " case the packet is recognized as a vxlan packet by"
+   " the forward engine)\n"
"Please check the NIC datasheet for HW limits.\n\n"

+   "tx_checksum show (port_id)\n"
+   "Display tx checksum offload configuration\n\n"
+
"set fwd (%s)\n"
"Set packet forwarding mode.\n\n"

@@ -2855,48 +2855,131 @@ cmdline_parse_inst_t cmd_tx_vlan_reset = {


 /* *** ENABLE HARDWARE INSERTION OF CHECKSUM IN TX PACKETS *** */
-struct cmd_tx_cksum_set_result {
+struct cmd_tx_cksum_result {
cmdline_fixed_string_t tx_cksum;
-   cmdline_fixed_string_t set;
-   uint8_t cksum_mask;
+   cmdline_fixed_string_t mode;
+   cmdline_fixed_string_t proto;
+   cmdline_fixed_string_t hwsw;
uint8_t port_id;
 };

 static void
-cmd_tx_cksum_set_parsed(void *parsed_result,
+cmd_tx_cksum_parsed(void *parsed_result,
   __attribute__((unused)) struct cmdline *cl,
   __attribute__((unused)) void *data)
 {
-   struct cmd_tx_cksum_set_result *res = parsed_result;
+   struct cmd_tx_cksum_result *res = parsed_result;
+   int hw = 0;
+   uint16_t ol_flags, mask = 0;
+   struct rte_eth_dev_info dev_info;
+
+   if (port_id_is_invalid(res->port_id)) {
+   printf("invalid port %d\n", res->port_id);
+   return;
+   }

-   tx_cksum_set(res->port_id, res->cksum_mask);
+   if (!strcmp(res->mode, "set")) {
+
+   if (!strcmp(res->hwsw, "hw"))
+   hw = 1;
+
+   if (!strcmp(res->

[dpdk-dev] [PATCH v4 07/13] testpmd: fix use of offload flags in testpmd

2014-11-26 Thread Olivier Matz
In testpmd the rte_port->tx_ol_flags flag was used in 2 incompatible
manners:
- sometimes used with testpmd specific flags (0xff for checksums, and
  bit 11 for vlan)
- sometimes assigned to m->ol_flags directly, which is wrong in case
  of checksum flags

This commit replaces the hardcoded values by named definitions, which
are not compatible with mbuf flags. The testpmd forward engines are
fixed to use the flags properly.
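
The separation can be pictured with the following sketch, in which
application-level configuration flags are translated into mbuf flags
instead of being copied into m->ol_flags. The APP_* names and values are
hypothetical and are not the TESTPMD_TX_OFFLOAD_* definitions of this
patch; the SK_PKT_TX_* values mirror the PKT_TX_* definitions quoted
elsewhere in this series.

```c
#include <stdint.h>

/* Hypothetical application-level flags (placeholder values). */
#define APP_TX_IP_CKSUM    0x0001
#define APP_TX_UDP_CKSUM   0x0002
#define APP_TX_TCP_CKSUM   0x0004
#define APP_TX_INSERT_VLAN 0x0040

/* Local copies of the mbuf flag values quoted in this series. */
#define SK_PKT_TX_TCP_CKSUM (1ULL << 52)
#define SK_PKT_TX_UDP_CKSUM (3ULL << 52)
#define SK_PKT_TX_IP_CKSUM  (1ULL << 54)
#define SK_PKT_TX_VLAN_PKT  (1ULL << 55)

enum sk_l4 { SK_L4_NONE, SK_L4_UDP, SK_L4_TCP };

/*
 * Translate the per-port application configuration into the mbuf flags
 * for one packet. The L4 checksum type is a 2-bit field in ol_flags, so
 * at most one L4 value is set, chosen from the packet's actual protocol.
 */
static uint64_t
sk_app_to_mbuf_flags(uint16_t app_flags, int is_ipv4, enum sk_l4 l4)
{
	uint64_t ol_flags = 0;

	if (is_ipv4 && (app_flags & APP_TX_IP_CKSUM))
		ol_flags |= SK_PKT_TX_IP_CKSUM;
	if (l4 == SK_L4_UDP && (app_flags & APP_TX_UDP_CKSUM))
		ol_flags |= SK_PKT_TX_UDP_CKSUM;
	else if (l4 == SK_L4_TCP && (app_flags & APP_TX_TCP_CKSUM))
		ol_flags |= SK_PKT_TX_TCP_CKSUM;
	if (app_flags & APP_TX_INSERT_VLAN)
		ol_flags |= SK_PKT_TX_VLAN_PKT;
	return ol_flags;
}
```

Keeping the two flag spaces disjoint means a configuration bit can never
be mistaken for an offload request by the driver.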

Signed-off-by: Olivier Matz 
Acked-by: Konstantin Ananyev 
---
 app/test-pmd/config.c   |  4 ++--
 app/test-pmd/csumonly.c | 40 +++-
 app/test-pmd/macfwd.c   |  5 -
 app/test-pmd/macswap.c  |  5 -
 app/test-pmd/testpmd.h  | 28 +---
 app/test-pmd/txonly.c   |  9 ++---
 6 files changed, 60 insertions(+), 31 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index a322d8b..c5ac8a5 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1683,7 +1683,7 @@ tx_vlan_set(portid_t port_id, uint16_t vlan_id)
return;
if (vlan_id_is_invalid(vlan_id))
return;
-   ports[port_id].tx_ol_flags |= PKT_TX_VLAN_PKT;
+   ports[port_id].tx_ol_flags |= TESTPMD_TX_OFFLOAD_INSERT_VLAN;
ports[port_id].tx_vlan_id = vlan_id;
 }

@@ -1692,7 +1692,7 @@ tx_vlan_reset(portid_t port_id)
 {
if (port_id_is_invalid(port_id))
return;
-   ports[port_id].tx_ol_flags &= ~PKT_TX_VLAN_PKT;
+   ports[port_id].tx_ol_flags &= ~TESTPMD_TX_OFFLOAD_INSERT_VLAN;
 }

 void
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 8d10bfd..743094a 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -322,7 +322,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
/* Do not delete, this is required by HW*/
ipv4_hdr->hdr_checksum = 0;

-   if (tx_ol_flags & 0x1) {
+   if (tx_ol_flags & TESTPMD_TX_OFFLOAD_IP_CKSUM) {
/* HW checksum */
ol_flags |= PKT_TX_IP_CKSUM;
}
@@ -336,7 +336,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
if (l4_proto == IPPROTO_UDP) {
udp_hdr = (struct udp_hdr*) 
(rte_pktmbuf_mtod(mb,
unsigned char *) + l2_len + 
l3_len);
-   if (tx_ol_flags & 0x2) {
+   if (tx_ol_flags & TESTPMD_TX_OFFLOAD_UDP_CKSUM) 
{
/* HW Offload */
ol_flags |= PKT_TX_UDP_CKSUM;
if (ipv4_tunnel)
@@ -358,7 +358,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
uint16_t len;

/* Check if inner L3/L4 checkum flag is 
set */
-   if (tx_ol_flags & 0xF0)
+   if (tx_ol_flags & 
TESTPMD_TX_OFFLOAD_INNER_CKSUM_MASK)
ol_flags |= PKT_TX_VXLAN_CKSUM;

inner_l2_len  = sizeof(struct 
ether_hdr);
@@ -381,7 +381,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
unsigned char 
*) + len);
inner_l4_proto = 
inner_ipv4_hdr->next_proto_id;

-   if (tx_ol_flags & 0x10) {
+   if (tx_ol_flags & 
TESTPMD_TX_OFFLOAD_INNER_IP_CKSUM) {

/* Do not delete, this 
is required by HW*/

inner_ipv4_hdr->hdr_checksum = 0;
@@ -394,7 +394,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
unsigned char 
*) + len);
inner_l4_proto = 
inner_ipv6_hdr->proto;
}
-   if ((inner_l4_proto == IPPROTO_UDP) && 
(tx_ol_flags & 0x20)) {
+   if ((inner_l4_proto == IPPROTO_UDP) &&
+   (tx_ol_flags & 
TESTPMD_TX_OFFLOAD_INNER_UDP_CKSUM)) {

/* HW Offload */
ol_flags |= PKT_TX_UDP_CKSUM;
@@ -405,7 +406,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
else if (eth_type == 
ETHER_TYPE_IPv6)

inner_udp_hdr->dgram_cksum = get_ipv6_psd_sum(inner_ipv6_hdr);

-   } else if ((inne

[dpdk-dev] [PATCH v4 06/13] mbuf: add functions to get the name of an ol_flag

2014-11-26 Thread Olivier Matz
In test-pmd (rxonly.c), the code is able to dump the list of ol_flags.
The issue is that the list of flags in the application has to be
synchronized with the flags defined in rte_mbuf.h.

This patch introduces 2 new functions, rte_get_rx_ol_flag_name()
and rte_get_tx_ol_flag_name(), that return the name of a flag from
its mask. It also fixes rxonly.c to use these new functions and to
display the proper flags.
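
The lookup-by-mask idea can be sketched as follows. This is a standalone
illustration, not the library code: the SK_RX_* flags, their bit
positions and the sk_* functions are placeholders, and the names are
collected in a buffer rather than printed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Placeholder flags, for illustration only. */
#define SK_RX_VLAN_PKT        (1ULL << 0)
#define SK_RX_RSS_HASH        (1ULL << 1)
#define SK_RX_TUNNEL_IPV6_HDR (1ULL << 12)

/* Return the name of a single-bit flag, or NULL if it is unknown. */
static const char *
sk_rx_flag_name(uint64_t mask)
{
	switch (mask) {
	case SK_RX_VLAN_PKT: return "SK_RX_VLAN_PKT";
	case SK_RX_RSS_HASH: return "SK_RX_RSS_HASH";
	case SK_RX_TUNNEL_IPV6_HDR: return "SK_RX_TUNNEL_IPV6_HDR";
	default: return NULL;
	}
}

/*
 * Write the space-separated names of all known flags set in ol_flags
 * into buf. Returns the number of names written; unknown bits are
 * skipped, and the walk stops if the buffer is too small.
 */
static unsigned
sk_dump_rx_flags(uint64_t ol_flags, char *buf, size_t size)
{
	unsigned bit, count = 0;
	size_t used = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';
	for (bit = 0; bit < sizeof(ol_flags) * 8; bit++) {
		uint64_t mask = 1ULL << bit; /* 1ULL: a plain 1 << bit breaks above bit 31 */
		const char *name;
		int n;

		if ((ol_flags & mask) == 0)
			continue;
		name = sk_rx_flag_name(mask);
		if (name == NULL)
			continue;
		n = snprintf(buf + used, size - used, "%s%s",
			     count ? " " : "", name);
		if (n < 0 || (size_t)n >= size - used) {
			buf[used] = '\0'; /* drop the truncated name */
			break;
		}
		used += (size_t)n;
		count++;
	}
	return count;
}
```

With the names kept next to the flag definitions, the application no
longer needs its own table indexed by bit position.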

Signed-off-by: Olivier Matz 
---
 app/test-pmd/rxonly.c  | 36 ++
 lib/librte_mbuf/rte_mbuf.c | 48 ++
 lib/librte_mbuf/rte_mbuf.h | 25 
 3 files changed, 83 insertions(+), 26 deletions(-)

diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 88b65bc..fdfe990 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -71,26 +71,6 @@

 #include "testpmd.h"

-#define MAX_PKT_RX_FLAGS 13
-static const char *pkt_rx_flag_names[MAX_PKT_RX_FLAGS] = {
-   "VLAN_PKT",
-   "RSS_HASH",
-   "PKT_RX_FDIR",
-   "IP_CKSUM",
-   "IP_CKSUM_BAD",
-
-   "IPV4_HDR",
-   "IPV4_HDR_EXT",
-   "IPV6_HDR",
-   "IPV6_HDR_EXT",
-
-   "IEEE1588_PTP",
-   "IEEE1588_TMST",
-
-   "TUNNEL_IPV4_HDR",
-   "TUNNEL_IPV6_HDR",
-};
-
 static inline void
 print_ether_addr(const char *what, struct ether_addr *eth_addr)
 {
@@ -222,12 +202,16 @@ pkt_burst_receive(struct fwd_stream *fs)
printf(" - Receive queue=0x%x", (unsigned) fs->rx_queue);
printf("\n");
if (ol_flags != 0) {
-   int rxf;
-
-   for (rxf = 0; rxf < MAX_PKT_RX_FLAGS; rxf++) {
-   if (ol_flags & (1 << rxf))
-   printf("  PKT_RX_%s\n",
-  pkt_rx_flag_names[rxf]);
+   unsigned rxf;
+   const char *name;
+
+   for (rxf = 0; rxf < sizeof(mb->ol_flags) * 8; rxf++) {
+   if ((ol_flags & (1ULL << rxf)) == 0)
+   continue;
+   name = rte_get_rx_ol_flag_name(1ULL << rxf);
+   if (name == NULL)
+   continue;
+   printf("  %s\n", name);
}
}
rte_pktmbuf_free(mb);
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 52e7574..9b57b3a 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright 2014 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -196,3 +197,50 @@ rte_pktmbuf_dump(FILE *f, const struct rte_mbuf *m, 
unsigned dump_len)
nb_segs --;
}
 }
+
+/*
+ * Get the name of a RX offload flag. Must be kept synchronized with flag
+ * definitions in rte_mbuf.h.
+ */
+const char *rte_get_rx_ol_flag_name(uint64_t mask)
+{
+   switch (mask) {
+   case PKT_RX_VLAN_PKT: return "PKT_RX_VLAN_PKT";
+   case PKT_RX_RSS_HASH: return "PKT_RX_RSS_HASH";
+   case PKT_RX_FDIR: return "PKT_RX_FDIR";
+   case PKT_RX_L4_CKSUM_BAD: return "PKT_RX_L4_CKSUM_BAD";
+   case PKT_RX_IP_CKSUM_BAD: return "PKT_RX_IP_CKSUM_BAD";
+   /* case PKT_RX_EIP_CKSUM_BAD: return "PKT_RX_EIP_CKSUM_BAD"; */
+   /* case PKT_RX_OVERSIZE: return "PKT_RX_OVERSIZE"; */
+   /* case PKT_RX_HBUF_OVERFLOW: return "PKT_RX_HBUF_OVERFLOW"; */
+   /* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
+   /* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
+   case PKT_RX_IPV4_HDR: return "PKT_RX_IPV4_HDR";
+   case PKT_RX_IPV4_HDR_EXT: return "PKT_RX_IPV4_HDR_EXT";
+   case PKT_RX_IPV6_HDR: return "PKT_RX_IPV6_HDR";
+   case PKT_RX_IPV6_HDR_EXT: return "PKT_RX_IPV6_HDR_EXT";
+   case PKT_RX_IEEE1588_PTP: return "PKT_RX_IEEE1588_PTP";
+   case PKT_RX_IEEE1588_TMST: return "PKT_RX_IEEE1588_TMST";
+   case PKT_RX_TUNNEL_IPV4_HDR: return "PKT_RX_TUNNEL_IPV4_HDR";
+   case PKT_RX_TUNNEL_IPV6_HDR: return "PKT_RX_TUNNEL_IPV6_HDR";
+   default: return NULL;
+   }
+}
+
+/*
+ * Get the name of a TX offload flag. Must be kept synchronized with flag
+ * definitions in rte_mbuf.h.
+ */
+const char *rte_get_tx_ol_flag_name(uint64_t mask)
+{
+   switch (mask) {
+   case PKT_TX_VLAN_PKT: return "PKT_TX_VLAN_PKT";
+   case PKT_TX_IP_CKSUM: return "PKT_TX_IP_CKSUM";
+   case PKT_TX_TCP_CKSUM: return "PKT_TX_TCP_CKSUM";
+   case PKT_TX_SCTP_CKSUM: return "PKT_TX_SCTP_CKSUM";
+   case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
+   case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
+   case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";

[dpdk-dev] [PATCH v4 05/13] mbuf: remove too specific PKT_TX_OFFLOAD_MASK definition

2014-11-26 Thread Olivier Matz
This definition is specific to Intel PMD drivers, and its description
"indicate what bits required for building TX context" shows that it
should not be in the generic rte_mbuf.h but in the PMD driver.

Signed-off-by: Olivier Matz 
Acked-by: Bruce Richardson 
---
 lib/librte_mbuf/rte_mbuf.h| 5 -
 lib/librte_pmd_e1000/igb_rxtx.c   | 8 +++-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 8 +++-
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 6d9ef21..c2f4685 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -131,11 +131,6 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

-/**
- * Bit Mask to indicate what bits required for building TX context
- */
-#define PKT_TX_OFFLOAD_MASK (PKT_TX_VLAN_PKT | PKT_TX_IP_CKSUM | 
PKT_TX_L4_MASK)
-
 /* define a set of marker types that can be used to refer to set points in the
  * mbuf */
 typedef void*MARKER[0];   /**< generic marker for a point in a structure */
diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index b406397..433c616 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -84,6 +84,12 @@
ETH_RSS_IPV6_UDP | \
ETH_RSS_IPV6_UDP_EX)

+/* Bit Mask to indicate what bits required for building TX context */
+#define IGB_TX_OFFLOAD_MASK (   \
+   PKT_TX_VLAN_PKT |\
+   PKT_TX_IP_CKSUM |\
+   PKT_TX_L4_MASK)
+
 static inline struct rte_mbuf *
 rte_rxmbuf_alloc(struct rte_mempool *mp)
 {
@@ -400,7 +406,7 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
ol_flags = tx_pkt->ol_flags;
vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
-   tx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;
+   tx_ol_req = ol_flags & IGB_TX_OFFLOAD_MASK;

/* If a Context Descriptor need be built . */
if (tx_ol_req) {
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 7e470ce..ca35db2 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -90,6 +90,12 @@
ETH_RSS_IPV6_UDP | \
ETH_RSS_IPV6_UDP_EX)

+/* Bit Mask to indicate what bits required for building TX context */
+#define IXGBE_TX_OFFLOAD_MASK ( \
+   PKT_TX_VLAN_PKT |\
+   PKT_TX_IP_CKSUM |\
+   PKT_TX_L4_MASK)
+
 static inline struct rte_mbuf *
 rte_rxmbuf_alloc(struct rte_mempool *mp)
 {
@@ -580,7 +586,7 @@ ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
ol_flags = tx_pkt->ol_flags;

/* If hardware offload required */
-   tx_ol_req = ol_flags & PKT_TX_OFFLOAD_MASK;
+   tx_ol_req = ol_flags & IXGBE_TX_OFFLOAD_MASK;
if (tx_ol_req) {
vlan_macip_lens.f.vlan_tci = tx_pkt->vlan_tci;
vlan_macip_lens.f.l2_l3_len = tx_pkt->l2_l3_len;
-- 
2.1.0



[dpdk-dev] [PATCH v4 04/13] mbuf: add help about TX checksum flags

2014-11-26 Thread Olivier Matz
Describe how to use hardware checksum API.

Signed-off-by: Olivier Matz 
Acked-by: Bruce Richardson 
---
 lib/librte_mbuf/rte_mbuf.h | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index faa9924..6d9ef21 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -100,23 +100,31 @@ extern "C" {
 /* add new TX flags here */
 #define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VXLAN computed 
by NIC */
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to 
timestamp. */
-/*
- * Bits 52+53 used for L4 packet type with checksum enabled.
- * 00: Reserved
- * 01: TCP checksum
- * 10: SCTP checksum
- * 11: UDP checksum
+
+/**
+ * Bits 52+53 used for L4 packet type with checksum enabled: 00: Reserved,
+ * 01: TCP checksum, 10: SCTP checksum, 11: UDP checksum. To use hardware
+ * L4 checksum offload, the user needs to:
+ *  - fill l2_len and l3_len in mbuf
+ *  - set the flags PKT_TX_TCP_CKSUM, PKT_TX_SCTP_CKSUM or PKT_TX_UDP_CKSUM
+ *  - set the flag PKT_TX_IPV4 or PKT_TX_IPV6
+ *  - calculate the pseudo header checksum and set it in the L4 header (only
+ *for TCP or UDP). For SCTP, set the crc field to 0.
  */
-#define PKT_TX_L4_NO_CKSUM   (0ULL << 52) /**< Disable L4 cksum of TX pkt. */
+#define PKT_TX_L4_NO_CKSUM   (0ULL << 52) /* Disable L4 cksum of TX pkt. */
 #define PKT_TX_TCP_CKSUM (1ULL << 52) /**< TCP cksum of TX pkt. computed 
by NIC. */
 #define PKT_TX_SCTP_CKSUM(2ULL << 52) /**< SCTP cksum of TX pkt. computed 
by NIC. */
 #define PKT_TX_UDP_CKSUM (3ULL << 52) /**< UDP cksum of TX pkt. computed 
by NIC. */
 #define PKT_TX_L4_MASK   (3ULL << 52) /**< Mask for L4 cksum offload 
request. */

-#define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by 
NIC. */
+#define PKT_TX_IP_CKSUM  (1ULL << 54)/**< IP cksum of TX pkt. computed by 
NIC. */
 #define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
-#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
offload. */
-#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
+
+/** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload. */
+#define PKT_TX_IPV4  PKT_RX_IPV4_HDR
+
+/** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload. */
+#define PKT_TX_IPV6  PKT_RX_IPV6_HDR

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
packet. */

-- 
2.1.0



[dpdk-dev] [PATCH v4 03/13] mbuf: reorder tx ol_flags

2014-11-26 Thread Olivier Matz
The tx mbuf flags are now ordered from the lowest value to the
highest. Add comments to explain where to add new flags.

By the way, move PKT_TX_VXLAN_CKSUM to the right place.

Signed-off-by: Olivier Matz 
Acked-by: Thomas Monjalon 
---
 lib/librte_mbuf/rte_mbuf.h | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 5899e5c..faa9924 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -95,14 +95,11 @@ extern "C" {
 #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 
header. */
 #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR match. */
 #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if FDIR 
match. */
+/* add new RX flags here */

-#define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
packet. */
-#define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by 
NIC. */
+/* add new TX flags here */
 #define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VXLAN computed 
by NIC */
-#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
-#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
offload. */
-#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
-
+#define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to 
timestamp. */
 /*
  * Bits 52+53 used for L4 packet type with checksum enabled.
  * 00: Reserved
@@ -116,8 +113,12 @@ extern "C" {
 #define PKT_TX_UDP_CKSUM (3ULL << 52) /**< UDP cksum of TX pkt. computed 
by NIC. */
 #define PKT_TX_L4_MASK   (3ULL << 52) /**< Mask for L4 cksum offload 
request. */

-/* Bit 51 - IEEE1588*/
-#define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to 
timestamp. */
+#define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by 
NIC. */
+#define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
+#define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum 
offload. */
+#define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
+
+#define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
packet. */

 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */
-- 
2.1.0



[dpdk-dev] [PATCH v4 02/13] ixgbe: fix remaining pkt_flags variable size to 64 bits

2014-11-26 Thread Olivier Matz
Since commit 4332beee9 "mbuf: expand ol_flags field to 64-bits", the
packet flags are now 64 bits wide. Some occurrences were forgotten in
the ixgbe driver.
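
The effect of a too-narrow variable can be reproduced in isolation. The
sketch below uses two placeholder flags (SK_FLAG_LOW and SK_FLAG_HIGH are
hypothetical, not real mbuf flags) to show how a flag above bit 15 is
silently lost when it passes through a uint16_t.

```c
#include <stdint.h>

/* Placeholder flags, for illustration only. */
#define SK_FLAG_LOW  (1ULL << 5)
#define SK_FLAG_HIGH (1ULL << 20)

/* Buggy pattern: a 16-bit temporary drops every bit above bit 15. */
static uint64_t
sk_flags_narrow(uint64_t a, uint64_t b)
{
	uint16_t pkt_flags = (uint16_t)(a | b);

	return pkt_flags;
}

/* Fixed pattern: the temporary is as wide as the ol_flags field. */
static uint64_t
sk_flags_wide(uint64_t a, uint64_t b)
{
	uint64_t pkt_flags = a | b;

	return pkt_flags;
}
```

Here sk_flags_narrow(SK_FLAG_LOW, SK_FLAG_HIGH) returns only SK_FLAG_LOW,
while sk_flags_wide() keeps both bits.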

Signed-off-by: Olivier Matz 
Acked-by: Bruce Richardson 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index ecebbf6..7e470ce 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -817,7 +817,7 @@ end_of_tx:
 static inline uint64_t
 rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
 {
-   uint16_t pkt_flags;
+   uint64_t pkt_flags;

static uint64_t ip_pkt_types_map[16] = {
0, PKT_RX_IPV4_HDR, PKT_RX_IPV4_HDR_EXT, PKT_RX_IPV4_HDR_EXT,
@@ -834,7 +834,7 @@ rx_desc_hlen_type_rss_to_pkt_flags(uint32_t hl_tp_rs)
};

 #ifdef RTE_LIBRTE_IEEE1588
-   static uint32_t ip_pkt_etqf_map[8] = {
+   static uint64_t ip_pkt_etqf_map[8] = {
0, 0, 0, PKT_RX_IEEE1588_PTP,
0, 0, 0, 0,
};
@@ -903,7 +903,7 @@ ixgbe_rx_scan_hw_ring(struct igb_rx_queue *rxq)
struct igb_rx_entry *rxep;
struct rte_mbuf *mb;
uint16_t pkt_len;
-   uint16_t pkt_flags;
+   uint64_t pkt_flags;
int s[LOOK_AHEAD], nb_dd;
int i, j, nb_rx = 0;

@@ -1335,7 +1335,7 @@ ixgbe_recv_scattered_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
uint16_t nb_rx;
uint16_t nb_hold;
uint16_t data_len;
-   uint16_t pkt_flags;
+   uint64_t pkt_flags;

nb_rx = 0;
nb_hold = 0;
@@ -1511,9 +1511,9 @@ ixgbe_recv_scattered_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
first_seg->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
-   pkt_flags = (uint16_t)(pkt_flags |
+   pkt_flags = (pkt_flags |
rx_desc_status_to_pkt_flags(staterr));
-   pkt_flags = (uint16_t)(pkt_flags |
+   pkt_flags = (pkt_flags |
rx_desc_error_to_pkt_flags(staterr));
first_seg->ol_flags = pkt_flags;

-- 
2.1.0



[dpdk-dev] [PATCH v4 01/13] igb/ixgbe: fix IP checksum calculation

2014-11-26 Thread Olivier Matz
According to the Intel® 82599 10 GbE Controller Datasheet (Table 7-38), both
L2 and L3 lengths are needed to offload the IP checksum.

Note that the e1000 driver does not need to be patched as it already
contains the fix.

Signed-off-by: Olivier Matz 
Acked-by: Konstantin Ananyev 
---
 lib/librte_pmd_e1000/igb_rxtx.c   | 2 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index 0dca7b7..b406397 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -262,7 +262,7 @@ igbe_set_xmit_ctx(struct igb_tx_queue* txq,

if (ol_flags & PKT_TX_IP_CKSUM) {
type_tucmd_mlhl = E1000_ADVTXD_TUCMD_IPV4;
-   cmp_mask |= TX_MAC_LEN_CMP_MASK;
+   cmp_mask |= TX_MACIP_LEN_CMP_MASK;
}

/* Specify which HW CTX to upload. */
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index f9b3fe3..ecebbf6 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -374,7 +374,7 @@ ixgbe_set_xmit_ctx(struct igb_tx_queue* txq,

if (ol_flags & PKT_TX_IP_CKSUM) {
type_tucmd_mlhl = IXGBE_ADVTXD_TUCMD_IPV4;
-   cmp_mask |= TX_MAC_LEN_CMP_MASK;
+   cmp_mask |= TX_MACIP_LEN_CMP_MASK;
}

/* Specify which HW CTX to upload. */
-- 
2.1.0



[dpdk-dev] [PATCH v4 00/13] add TSO support

2014-11-26 Thread Olivier Matz
This series adds TSO support in the ixgbe DPDK driver. This is a rework
of the series sent earlier this week [1]. This work is based on
another version [2] that was posted several months ago and
which included a mbuf rework that is now in mainline.

Changes in v4:

- csum fwd engine: use PKT_TX_IPV4 and PKT_TX_IPV6 to tell the hardware
  the IP version of the packet as suggested by Konstantin.
- document these 2 flags, explaining they should be set for hw L4 cksum
  offload or TSO.
- rebase on latest head

Changes in v3:

- indicate that rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name()
  should be kept synchronized with flags definition
- use sizeof() when appropriate in rte_raw_cksum()
- remove double semicolon in ixgbe driver
- reorder tx ol_flags as requested by Thomas
- add missing copyrights when big modifications are made
- enhance the help of tx_cksum command in testpmd
- enhance the description of csumonly (comments)

Changes in v2:

- move rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name() in
  rte_mbuf.c, and fix comments
- use IGB_TX_OFFLOAD_MASK and IXGBE_TX_OFFLOAD_MASK to replace
  PKT_TX_OFFLOAD_MASK
- fix inner_l2_len and inner_l3_len bitfields: use uint64_t instead
  of uint16_t
- replace assignment of l2_len and l3_len by assignment of tx_offload.
  It now includes inner_l2_len and inner_l3_len at the same time.
- introduce a new cksum api in rte_ip.h following discussion with
  Konstantin
- reorder commits to have all TSO commits at the end of the series
- use ol_flags for phdr checksum calculation (this now matches ixgbe
  API: standard pseudo hdr cksum for TCP cksum offload, pseudo hdr
  cksum without ip paylen for TSO). This will probably be changed
  with a dev_prep_tx() like function for 2.0 release.
- rebase on latest head


This series first fixes some bugs that were discovered during the
development, adds some changes to the mbuf API (new l4_len and
tso_segsz fields), adds TSO support in ixgbe, reworks testpmd
csum forward engine, and finally adds TSO support in testpmd so it
can be validated.

The new fields added in mbuf try to be generic enough to apply to
other hardware in the future. To delegate the TCP segmentation to the
hardware, the user has to:

  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
PKT_TX_TCP_CKSUM)
  - set the flag PKT_TX_IPV4 or PKT_TX_IPV6
  - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
to 0 in the packet
  - fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz

  - calculate the pseudo header checksum without taking ip_len into account,
and set it in the TCP header, for instance by using
rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
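
The steps above can be sketched on a mock structure as follows. This is
an illustration only: struct sk_mbuf and sk_request_tso() are
hypothetical stand-ins for the real mbuf and application code, the
SK_PKT_TX_IPV4 and SK_PKT_TX_IPV6 bit positions are placeholders (the
real flags alias RX flags whose values are not shown here), and the
pseudo header checksum is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Values taken from the PKT_TX_* definitions quoted in this series. */
#define SK_PKT_TX_TCP_SEG  (1ULL << 49)
#define SK_PKT_TX_IP_CKSUM (1ULL << 54)
/* Placeholder bit positions, NOT the real values. */
#define SK_PKT_TX_IPV4     (1ULL << 5)
#define SK_PKT_TX_IPV6     (1ULL << 7)

/* Mock of the mbuf fields involved in TSO (hypothetical layout). */
struct sk_mbuf {
	uint64_t ol_flags;
	uint16_t l2_len;
	uint16_t l3_len;
	uint16_t l4_len;
	uint16_t tso_segsz;
};

/*
 * Fill the offload request for one TCP packet. ipv4_cksum points to the
 * checksum field of the IPv4 header in the packet data, or is NULL for
 * IPv6. The caller must still write the pseudo header checksum (computed
 * without the IP payload length) into the TCP header.
 */
static void
sk_request_tso(struct sk_mbuf *m, uint16_t *ipv4_cksum,
	       uint16_t l2_len, uint16_t l3_len, uint16_t l4_len,
	       uint16_t segsz)
{
	/* PKT_TX_TCP_SEG implies PKT_TX_TCP_CKSUM, so it is not set here */
	m->ol_flags |= SK_PKT_TX_TCP_SEG;
	if (ipv4_cksum != NULL) {
		m->ol_flags |= SK_PKT_TX_IPV4 | SK_PKT_TX_IP_CKSUM;
		*ipv4_cksum = 0; /* hardware fills it in for each segment */
	} else {
		m->ol_flags |= SK_PKT_TX_IPV6;
	}
	m->l2_len = l2_len;
	m->l3_len = l3_len;
	m->l4_len = l4_len;
	m->tso_segsz = segsz;
}
```

For a plain Ethernet/IPv4/TCP packet without options, a call such as
sk_request_tso(&m, &cksum_field, 14, 20, 20, 1460) would describe the
headers and ask for 1460-byte segments.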

The test report will be added as an answer to this cover letter and
could be linked in the concerned commits.

[1] http://dpdk.org/ml/archives/dev/2014-November/007953.html
[2] http://dpdk.org/ml/archives/dev/2014-May/002537.html

Olivier Matz (13):
  igb/ixgbe: fix IP checksum calculation
  ixgbe: fix remaining pkt_flags variable size to 64 bits
  mbuf: reorder tx ol_flags
  mbuf: add help about TX checksum flags
  mbuf: remove too specific PKT_TX_OFFLOAD_MASK definition
  mbuf: add functions to get the name of an ol_flag
  testpmd: fix use of offload flags in testpmd
  testpmd: rework csum forward engine
  mbuf: introduce new checksum API
  mbuf: generic support for TCP segmentation offload
  ixgbe: support TCP segmentation offload
  testpmd: support TSO in csum forward engine
  testpmd: add a verbose mode csum forward engine

 app/test-pmd/cmdline.c  | 248 +--
 app/test-pmd/config.c   |  17 +-
 app/test-pmd/csumonly.c | 817 
 app/test-pmd/macfwd.c   |   5 +-
 app/test-pmd/macswap.c  |   5 +-
 app/test-pmd/rxonly.c   |  36 +-
 app/test-pmd/testpmd.c  |   2 +-
 app/test-pmd/testpmd.h  |  24 +-
 app/test-pmd/txonly.c   |   9 +-
 examples/ipv4_multicast/main.c  |   2 +-
 lib/librte_mbuf/rte_mbuf.c  |  49 +++
 lib/librte_mbuf/rte_mbuf.h  | 108 +++--
 lib/librte_net/rte_ip.h | 208 +
 lib/librte_pmd_e1000/igb_rxtx.c |  21 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   3 +-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 179 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h   |  19 +-
 17 files changed, 1103 insertions(+), 649 deletions(-)

-- 
2.1.0



[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Olivier MATZ
Hi Konstantin,

On 11/26/2014 01:25 PM, Ananyev, Konstantin wrote:
>> By the way (this is probably off-topic), but I'm wondering if the TX
>> flags should have the same values than the RX flags:
>>
>> #define PKT_TX_IPV4  PKT_RX_IPV4_HDR
>> #define PKT_TX_IPV6  PKT_RX_IPV6_HDR
>
> Thought about that too.
>  From one side,  it is a bit out of our concept: separate RX and TX flags.
>  From other side, it allows us to save 2 bits in the ol_flags.
> Don't have any strong opinion here.
> What do you think?

I have no strong opinion too, but I have a preference for 2 different
bit values for these flags:

- as you say, it matches the concept (RX and TX flags are separated)

- 64 bits is a lot, we have some time before there is no more available
   bit... and I hope it will never occur because it would become
   complex for an application to handle them all

- it will avoid to send a packet with a bad info:
   - we receive a Ether/IP6/IP4/L4/data packet
   - the driver sets PKT_RX_IPV6_HDR
   - the stack decapsulates IP6
   - the stack sends the packet, it has the PKT_TX_IPV6 flag but it's an
 IPv4 packet

   This is not a real problem as the flag will not be used by the
   driver/hardware (it's only mandatory for hw cksum / tso), but
   it can be confusing.
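
The stale-flag scenario above can be sketched as follows. This is only an
illustration: the EX_* names and the bit positions are made up for the example
and are not the real rte_mbuf.h definitions, and the "decapsulation" helper is
a stand-in for a stack that strips the outer header without cleaning ol_flags.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define EX_RX_IPV6_HDR      (1ULL << 7)

/* Option A: the TX flag reuses the RX bit. */
#define EX_TX_IPV6_SHARED   EX_RX_IPV6_HDR
/* Option B: the TX flag has its own bit. */
#define EX_TX_IPV6_SEPARATE (1ULL << 51)

/*
 * Stand-in for a stack that removes the outer IPv6 header of an
 * Ether/IP6/IP4/L4 packet but leaves ol_flags untouched.
 */
static uint64_t
ex_decap_outer_ipv6(uint64_t ol_flags)
{
	return ol_flags;
}

/* What the TX path would conclude from the flags alone. */
static int
ex_tx_sees_ipv6(uint64_t ol_flags, uint64_t tx_ipv6_bit)
{
	return (ol_flags & tx_ipv6_bit) != 0;
}
```

With shared bits, the flag set by the RX driver is still visible to the TX
path after decapsulation, although the packet is now IPv4. With separate bits,
the leftover RX flag is simply ignored on TX.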

Regards,
Olivier




[dpdk-dev] [PATCH v3] examples/skeleton: very simple code for packet forwarding

2014-11-26 Thread Bruce Richardson
On Wed, Nov 26, 2014 at 04:42:06PM +0100, Thomas Monjalon wrote:
> Bruce,
> 
> I made some minor changes to the skeleton app.
> Could you confirm it's ok for you?

Yes, they are fine for me, though the makefile diff looks messed up compared to
the original version. Any way to force git to recognise it as a completely new
file and not a copy of the higher level examples makefile?

/Bruce

> 
> 2014-11-26 15:38, Thomas Monjalon:
> > v3 changes:
> > - rename skeleton_app/ to skeleton/
> > - add in examples Makefile
> > - fix default target to native
> > - reword header guard
> > - rename rxRings to rx_rings and txRings to tx_rings
> 


[dpdk-dev] [PATCH v3] examples/skeleton: very simple code for packet forwarding

2014-11-26 Thread Thomas Monjalon
From: Bruce Richardson 

This is a very simple example app for doing packet forwarding with the
Intel DPDK. It's designed to serve as a start point for people new to
the Intel DPDK and who want to develop a new app.

Therefore it's meant to:
* have as good a performance out-of-the-box as possible, using the
  best-known settings for configuring the PMDs, so that any new apps can
  be based off it.
* be kept as short as possible to make it easy to understand it and get
  started with it.

Signed-off-by: Bruce Richardson 
---

v3 changes:
- rename skeleton_app/ to skeleton/
- add in examples Makefile
- fix default target to native
- reword header guard
- rename rxRings to rx_rings and txRings to tx_rings

 examples/Makefile|   1 +
 examples/{ => skeleton}/Makefile |  52 ---
 examples/skeleton/basicfwd.c | 183 +++
 examples/skeleton/basicfwd.h |  46 ++
 4 files changed, 249 insertions(+), 33 deletions(-)
 copy examples/{ => skeleton}/Makefile (60%)
 create mode 100644 examples/skeleton/basicfwd.c
 create mode 100644 examples/skeleton/basicfwd.h

diff --git a/examples/Makefile b/examples/Makefile
index 121dab4..8055297 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -61,6 +61,7 @@ DIRS-y += netmap_compat/bridge
 DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
 DIRS-y += quota_watermark
+DIRS-y += skeleton
 DIRS-y += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
 DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
diff --git a/examples/Makefile b/examples/skeleton/Makefile
similarity index 60%
copy from examples/Makefile
copy to examples/skeleton/Makefile
index 121dab4..4a5d99f 100644
--- a/examples/Makefile
+++ b/examples/skeleton/Makefile
@@ -1,6 +1,7 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2014 6WIND S.A.
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
 #   modification, are permitted provided that the following conditions
@@ -12,7 +13,7 @@
 #   notice, this list of conditions and the following disclaimer in
 #   the documentation and/or other materials provided with the
 #   distribution.
-# * Neither the name of 6WIND S.A. nor the names of its
+# * Neither the name of Intel Corporation nor the names of its
 #   contributors may be used to endorse or promote products derived
 #   from this software without specific prior written permission.
 #
@@ -32,40 +33,25 @@ ifeq ($(RTE_SDK),)
 $(error "Please define RTE_SDK environment variable")
 endif

-# Default target, can be overriden by command line or environment
+# Default target, can be overridden by command line or environment
 RTE_TARGET ?= x86_64-native-linuxapp-gcc

 include $(RTE_SDK)/mk/rte.vars.mk

-DIRS-y += cmdline
-ifneq ($(ICP_ROOT),)
-DIRS-y += dpdk_qat
+# binary name
+APP = basicfwd
+
+# all source are stored in SRCS-y
+SRCS-y := basicfwd.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_main.o += -Wno-return-type
 endif
-DIRS-y += exception_path
-DIRS-y += helloworld
-DIRS-y += ip_pipeline
-DIRS-y += ip_reassembly
-DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ip_fragmentation
-DIRS-$(CONFIG_RTE_MBUF_REFCNT) += ipv4_multicast
-DIRS-$(CONFIG_RTE_LIBRTE_KNI) += kni
-DIRS-y += l2fwd
-DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += l2fwd-ivshmem
-DIRS-y += l3fwd
-DIRS-$(CONFIG_RTE_LIBRTE_ACL) += l3fwd-acl
-DIRS-$(CONFIG_RTE_LIBRTE_POWER) += l3fwd-power
-DIRS-y += l3fwd-vf
-DIRS-y += link_status_interrupt
-DIRS-y += load_balancer
-DIRS-y += multi_process
-DIRS-y += netmap_compat/bridge
-DIRS-$(CONFIG_RTE_LIBRTE_METER) += qos_meter
-DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
-DIRS-y += quota_watermark
-DIRS-y += timer
-DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
-DIRS-$(CONFIG_RTE_LIBRTE_XEN_DOM0) += vhost_xen
-DIRS-y += vmdq
-DIRS-y += vmdq_dcb
-DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += distributor

-include $(RTE_SDK)/mk/rte.extsubdir.mk
+EXTRA_CFLAGS += -O3 -g -Wfatal-errors
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/skeleton/basicfwd.c b/examples/skeleton/basicfwd.c
new file mode 100644
index 000..ef8f90c
--- /dev/null
+++ b/examples/skeleton/basicfwd.c
@@ -0,0 +1,183 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in

[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

2014-11-26 Thread Olivier MATZ
Hi Konstantin,

On 11/26/2014 02:38 PM, Ananyev, Konstantin wrote:
>>> Probably I didn't explain myself clear enough, sorry.
>>> I didn't suggest to get rid of setting bits that indicate L3/L4 checksum 
>>> errors:
>>> PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, PKT_RX_EIP_CKSUM_BAD.
>>> I think these flags should be set as before.
>>>
>>> I was talking only about collapsing only these 4 RX error flags into one:
>>>
>>> #define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt oversize. */
>>> #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
>>> #define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
>>> #define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
>>>
>>>   From my point of view the difference of these 2 groups are:
>>> First - HW was able to receive whole packet without a problem, but L3/L4 
>>> checksum check failed.
>>>
>>> Second - HW was not able to receive whole packet properly by whatever 
>>> reason.
>>>   From upper layer SW perspective - there it probably makes little 
>>> difference, what caused it,
>>> as most likely SW has to throw away erroneous packet.
>>> And for debugging purposes, we can add PMD_LOG(DEBUG, ...) that would print 
>>> what exactly HW error happened.
>>
>> I agree with Konstantin that there are 2 different cases:
>>
>> a) the packet is properly received by the hardware, but has a bad
>>  checksum (or another protocol error, for instance an invalid ip len,
>>  a ip_version == 8 :))
>>
>>  in this case, it is useful to the application to have the mbuf with
>>  the data + an error flag. Then using a tcpdump-like tool could help
>>  to debug what is the cause of the error and what equipment generates
>>  a bad packet.
>>
>> b) the packet is not properly received by the hardware. In this case
>>  the data is invalid in the mbuf and not useable by the application.
>>  I suggest to only have a stats counter in this case, as receiving the
>>  mbuf is cpu time consuming and the only thing the application can do
>>  is to drop the packet.
>
> So for b) you suggest to drop the packet straight in PMD RX function?
> Something like:
> if (unlikely(error_bits & ...)) {
>  PMD_LOG(DEBUG, ...);
>   rte_pktmbuf_free(mb);
> }
> ?

Yes

> That's probably a bit too radical.
> Yes, mbuf doesn't contain the whole packet, but it may contain at least part 
> of it, let say in case of 'packet oversize'.
> So for debugging purposes the user may still like to examine the mbuf 
> contents.

As soon as there is some exploitable data in the mbuf, I agree it can
be transferred to the application (ex: bad header, bad len, bad
checksum...).

But if the hardware is not able to provide any exploitable data, it
looks a bit overkill to give an mbuf with an error flag.

But grouping the flags as you suggest is already a good clean-up to me,
I don't want to be more catholic than the Pope ;)
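
To make the two groups discussed in this thread concrete, here is a minimal
sketch. It assumes made-up descriptor error bits and flag values (the EX_*
names are hypothetical, not the real i40e or rte_mbuf.h definitions), and it
assumes that a hardware receive error hides the checksum results, which is one
possible reading of the discussion rather than what any driver does.

```c
#include <stdint.h>

/* Hypothetical descriptor error bits (not the real i40e layout). */
#define EX_ERR_RXE      (1u << 0) /* MAC error */
#define EX_ERR_RECIPE   (1u << 1) /* hardware processing error */
#define EX_ERR_HBO      (1u << 2) /* header buffer overflow */
#define EX_ERR_IPE      (1u << 3) /* IP checksum error */
#define EX_ERR_L4E      (1u << 4) /* L4 checksum error */
#define EX_ERR_OVERSIZE (1u << 5) /* too many descriptors for one packet */

/* Hypothetical mbuf flag values. */
#define EX_PKT_RX_L4_CKSUM_BAD (1ULL << 3)
#define EX_PKT_RX_IP_CKSUM_BAD (1ULL << 4)
#define EX_PKT_RX_ERR_HW       (1ULL << 15)

/* Group b): the packet was not received properly by the hardware. */
#define EX_HW_RX_ERRORS \
	(EX_ERR_RXE | EX_ERR_RECIPE | EX_ERR_HBO | EX_ERR_OVERSIZE)

static uint64_t
ex_error_to_pkt_flags(uint32_t error_bits)
{
	uint64_t flags = 0;

	if (error_bits == 0)
		return 0;

	/* Group b): collapse every receive failure into one flag. */
	if (error_bits & EX_HW_RX_ERRORS)
		return EX_PKT_RX_ERR_HW;

	/* Group a): packet data is valid, keep the individual flags. */
	if (error_bits & EX_ERR_IPE)
		flags |= EX_PKT_RX_IP_CKSUM_BAD;
	if (error_bits & EX_ERR_L4E)
		flags |= EX_PKT_RX_L4_CKSUM_BAD;

	return flags;
}
```

An application can then keep and inspect packets from group a), and treat
anything carrying the single hardware error flag as a candidate for dropping.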

Regards,
Olivier



[dpdk-dev] maximum line size on patch

2014-11-26 Thread Neil Horman
On Wed, Nov 26, 2014 at 05:31:12PM +, De Lara Guarch, Pablo wrote:
> Hi,
> 
> I am trying to send a patch for new sample app UG, but the patch cannot be 
> sent because I am hitting the maximum line size on the patch.
> 
> fatal: 
> /tmp/35JFqgAmCA/0001-doc-Added-new-sample-app-UG-for-VM-power-management.patch:
>  29: patch contains a line longer than 998 characters
> 
> This is due to the included svg files. Is there any way I can include them on 
> the patch? Any other way?
> 
You can also just put your local git tree in a public place and send a pull
request to the list.
Neil

> Thanks,
> Pablo
> 


[dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6

2014-11-26 Thread Thomas Monjalon
2014-11-26 13:00, Wodkowski, PawelX:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > 2014-11-26 11:17, Michal Jastrzebski:
> > > From: Pawel Wodkowski 
> > > --- a/app/test-pmd/csumonly.c
> > > +++ b/app/test-pmd/csumonly.c
> > > @@ -254,8 +254,17 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> > >*/
> > >   nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > >nb_pkt_per_burst);
> > > +#ifndef RTE_LIBRTE_PMD_BOND
> > >   if (unlikely(nb_rx == 0))
> > >   return;
> > > +#else
> > > + if (unlikely(nb_rx == 0 && (fs->forward_timeout == 0 ||
> > > + fs->next_forward_time > rte_rdtsc())))
> > > + return;
> > > +
> > > + if (fs->forward_timeout != 0)
> > > + fs->next_forward_time = rte_rdtsc() + fs->forward_timeout;
> > > +#endif
> > 
> > I don't understand why you need to make such change for bonding,
> > and there is no comment to explain.
> > Bonding should be a PMD like any other and shouldn't require such change.
> > I don't know mode 4 but it seems there is a design problem here.
> > 
> 
> It is an implication of a requirement that was formed at the beginning of the
> bonding implementation: the bonded interface should be transparent to the
> user app. But this requirement collides with mode 4, which needs to
> periodically receive and transmit frames (LACP and marker) that are not
> passed to the user app but processed/produced in the background. If this
> does not happen at least 10 times per second, mode 4 will not work.
> 
> Most (all?) user applications do RX/TX more often than 10 times per second,
> so this will have a negligible impact on those apps (they will have to
> respect this 100ms maximum interval between rx/tx calls, as I did in the
> code you pointed to).
> 
> We had discussed all options with Declan and Bruce, and this seems to be the
> most transparent way to implement mode 4 without using any kind of locking
> inside library.
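
The timeout check from the quoted hunk can be sketched in isolation as below.
This is a simplified stand-in, not the testpmd code: the struct and function
names are hypothetical, and the current time is passed in as a parameter
instead of being read with rte_rdtsc(), so the logic can be followed without
DPDK.

```c
#include <stdint.h>

/* Hypothetical, reduced version of the two fields used in the hunk. */
struct ex_fwd_stream {
	uint64_t forward_timeout;   /* 0 disables the periodic pass */
	uint64_t next_forward_time; /* deadline, in timer ticks */
};

/*
 * Return 1 if the burst function should go on to the TX path, 0 if it
 * may return early. With a non-zero timeout, an empty RX burst still
 * reaches TX once the deadline has passed, which gives the bonding PMD
 * a chance to send its own frames.
 */
static int
ex_should_forward(struct ex_fwd_stream *fs, uint16_t nb_rx, uint64_t now)
{
	if (nb_rx == 0 && (fs->forward_timeout == 0 ||
			fs->next_forward_time > now))
		return 0;

	/* Any pass through TX re-arms the deadline. */
	if (fs->forward_timeout != 0)
		fs->next_forward_time = now + fs->forward_timeout;

	return 1;
}
```

With forward_timeout set to 0 this degenerates to the original
"return if nb_rx == 0" behaviour.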

So you agree there is a design problem and you were initially trying to push it
without raising the problem in the hope that nobody will see it?
It's really not a good way to work in an Open Source project.

Is there any comment in the API to explain this new constraint?
Do you think we can change how Rx/Tx works in DPDK to integrate this feature?

Actually, I think these bonding features should be implemented in a layer on
top of DPDK. It's not DPDK's responsibility to do protocol processing.
Bonding was integrated with the promise that it's transparent and really close
to the hardware ports.

Today I see we clearly need a discussion to know what should be implemented
in DPDK. Which protocol layer is the limit?
I explained my point of view but the decision belongs to the whole community.

-- 
Thomas


[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

2014-11-26 Thread Helin Zhang
There were some bit flags of 0 for RX packet errors detected by hardware.
Actually only one bit of error flag is enough for all hardware detected
RX packet errors.

Signed-off-by: Helin Zhang 
---
 lib/librte_mbuf/rte_mbuf.h  |  6 +-
 lib/librte_pmd_i40e/i40e_rxtx.c | 31 +++
 2 files changed, 4 insertions(+), 33 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 5899e5c..897fd26 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -80,11 +80,6 @@ extern "C" {
 #define PKT_RX_FDIR  (1ULL << 2)  /**< RX packet with FDIR match indicate. */
 #define PKT_RX_L4_CKSUM_BAD  (1ULL << 3)  /**< L4 cksum of RX pkt. is not OK. */
 #define PKT_RX_IP_CKSUM_BAD  (1ULL << 4)  /**< IP cksum of RX pkt. is not OK. */
-#define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header checksum error. */
-#define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt oversize. */
-#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
-#define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
-#define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
 #define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4 header. */
 #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended IPv4 header. */
 #define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6 header. */
@@ -95,6 +90,7 @@ extern "C" {
 #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
 #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR match. */
 #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
+#define PKT_RX_ERR_HW    (1ULL << 15) /**< RX packet error detected by hardware. */

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index cce6911..3b2195d 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -115,35 +115,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword)
 static inline uint64_t
 i40e_rxd_error_to_pkt_flags(uint64_t qword)
 {
-   uint64_t flags = 0;
-   uint64_t error_bits = (qword >> I40E_RXD_QW1_ERROR_SHIFT);
-
-#define I40E_RX_ERR_BITS 0x3f
-   if (likely((error_bits & I40E_RX_ERR_BITS) == 0))
-   return flags;
-   /* If RXE bit set, all other status bits are meaningless */
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RXE_SHIFT))) {
-   flags |= PKT_RX_MAC_ERR;
-   return flags;
-   }
-
-   /* If RECIPE bit set, all other status indications should be ignored */
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RECIPE_SHIFT))) {
-   flags |= PKT_RX_RECIP_ERR;
-   return flags;
-   }
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_HBO_SHIFT)))
-   flags |= PKT_RX_HBUF_OVERFLOW;
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_IPE_SHIFT)))
-   flags |= PKT_RX_IP_CKSUM_BAD;
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_L4E_SHIFT)))
-   flags |= PKT_RX_L4_CKSUM_BAD;
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_EIPE_SHIFT)))
-   flags |= PKT_RX_EIP_CKSUM_BAD;
-   if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_OVERSIZE_SHIFT)))
-   flags |= PKT_RX_OVERSIZE;
+   if (unlikely(qword & I40E_RXD_QW1_ERROR_MASK))
+   return PKT_RX_ERR_HW;

-   return flags;
+   return 0;
 }

 /* Translate pkt types to pkt flags */
-- 
1.8.1.4
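
As a side note on the commit message, the short sketch below shows why flags
defined as (0ULL << 0) carry no information. The EX_* names are hypothetical
copies made for the illustration; only the values mirror the defines touched
by the patch.

```c
#include <stdint.h>

/* Placeholder value, as in the defines removed by the patch. */
#define EX_PKT_RX_MAC_ERR (0ULL << 0)
/* Real bit, as in the define added by the patch. */
#define EX_PKT_RX_ERR_HW  (1ULL << 15)

static uint64_t
ex_set_flag(uint64_t flags, uint64_t flag)
{
	return flags | flag;
}

static int
ex_flag_is_set(uint64_t flags, uint64_t flag)
{
	return (flags & flag) != 0;
}
```

Setting a zero-valued flag leaves ol_flags unchanged, and testing for it is
always false, so an application could never see any of those errors. A flag
with a real bit can be set and tested as expected.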



[dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6

2014-11-26 Thread Jastrzebski, MichalX K
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, November 26, 2014 2:31 PM
> To: Wodkowski, PawelX
> Cc: Jastrzebski, MichalX K; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6
> 
> 2014-11-26 13:00, Wodkowski, PawelX:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> > > 2014-11-26 11:17, Michal Jastrzebski:
> > > > From: Pawel Wodkowski 
> > > > --- a/app/test-pmd/csumonly.c
> > > > +++ b/app/test-pmd/csumonly.c
> > > > @@ -254,8 +254,17 @@ pkt_burst_checksum_forward(struct
> fwd_stream *fs)
> > > >  */
> > > > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > > >  nb_pkt_per_burst);
> > > > +#ifndef RTE_LIBRTE_PMD_BOND
> > > > if (unlikely(nb_rx == 0))
> > > > return;
> > > > +#else
> > > > +   if (unlikely(nb_rx == 0 && (fs->forward_timeout == 0 ||
> > > > +   fs->next_forward_time > rte_rdtsc())))
> > > > +   return;
> > > > +
> > > > +   if (fs->forward_timeout != 0)
> > > > +   fs->next_forward_time = rte_rdtsc() + 
> > > > fs->forward_timeout;
> > > > +#endif
> > >
> > > I don't understand why you need to make such change for bonding,
> > > and there is no comment to explain.
> > > Bonding should be a PMD like any other and shouldn't require such
> change.
> > > I don't know mode 4 but it seems there is a design problem here.
> > >
> >
> > It is an implication of requirement that was formed on beginning of
> bonding
> > implementation - bonded interface should be transparent to user app. But
> this
> > requirement in is in collision with mode 4. It need to periodically receive
> and
> > transmit frames (LACP and marker) that are not passed to user app but
> > processed/produced in background. If this will not happen in at least 10
> times
> > per second mode 4 will not work.
> >
> > Most of (all?) user applications do RX/TX more often than 10 times per
> second,
> > so this will have neglectable impact to those apps (it will have to check 
> > this
> > 100ms maximum interval of rx/tx as I did in code you pointed).
> >
> > We had discussed all options with Declan and Bruce, and this seems to be
> the
> > most transparent way to implement mode 4 without using any kind of
> locking
> > inside library.
> 
> So you agree there is a design problem and you were initially trying to push 
> it
> without raising the problem in the hope that nobody will see it?
No, we didn't want to hide anything. 
> It's really not the good way to work in an Open Source project.
> 
> Is there any comment in the API to explain this new constraint?
No, we haven't put an explicit comment in the code. I wrote about it in the
cover letter of v6, and there is also a show_warnings function in patch 1/2
which will print a warning to the application.
> Do you think we can change how Rx/Tx works in DPDK to integrate this
> feature?
> 
> Actually, I think these bonding features should be implemented in a layer on
> top of DPDK. It's not the DPDK responsibility to make some protocol
> processing.
> Bonding was integrated with the promise that it's transparent and really
> close
> to the hardware ports.
> 
> Today I see we clearly need a discussion to know what should be
> implemented
> in DPDK. Which protocol layer is the limit?
> I explained my point of view but the decision belongs to the whole
> community.
> 
> --
> Thomas


[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Liu, Jijiang


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 7:15 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Cc: Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong; jigsaw at gmail.com; 
> Richardson,
> Bruce
> Subject: Re: [PATCH v3 08/13] testpmd: rework csum forward engine
> 
> Hi Konstantin,
> 
> On 11/26/2014 11:10 AM, Ananyev, Konstantin wrote:
> > As I can see you removed code that sets up TX_PKT_IPV4 and TX_PKT_IPV6  of
> ol_flags.
> > I think that we need to keep it.
> > The reason for that is:
> > With FVL, to make HW TX checksum offload work, SW is responsible to provide
> to the HW information about L3 header.
> > Possible values are:
> > - IPv4 hdr with HW checksum calculation
> > - IPV4 hdr (checksum done by SW)
> > - IPV6 hdr
> > - unknown
> > So let say to for the packet: ETHER_HDR/IPV6_HDR/TCP_HDR/DATA To
> > request HW TCP checksum offload,  SW have to provide to HW information
> > that it is a packet with IPV6 header (plus as for ixgbe: l2_hdr_len, 
> > l3_hdr_len,
> l4_type, l4_hdr_len).
> > That's why TX_PKT_IPV4 and TX_PKT_IPV6   were introduced.
> >
> > Yes, it is  a change in public API for HW TX offload, but I don't see
> > any other way we can overcome it (apart from make TX function itself to 
> > parse
> a packet, which is obviously not a good choice).
> > Note that existing apps working on existing HW (ixgbe/igb/em) are not 
> > affected.
> > Though apps that supposed to be run on FVL HW too have to follow new
> convention.
> >
> > So I suggest we keep setting these flags in csumonly.c
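
The four L3 header types listed above could be derived from ol_flags roughly
as in the sketch below. Everything here is an assumption made for
illustration: the EX_* flag values, the enum and the function name are
hypothetical, and treating "IP checksum requested" as implying an IPv4 header
is a simplification, not the i40e driver's logic.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EX_PKT_TX_IP_CKSUM (1ULL << 54) /* HW computes the IPv4 checksum */
#define EX_PKT_TX_IPV4     (1ULL << 50) /* packet has an IPv4 header */
#define EX_PKT_TX_IPV6     (1ULL << 51) /* packet has an IPv6 header */

/* The four cases the hardware needs to distinguish. */
enum ex_l3_type {
	EX_L3_UNKNOWN,
	EX_L3_IPV4_HW_CKSUM,
	EX_L3_IPV4_SW_CKSUM,
	EX_L3_IPV6,
};

static enum ex_l3_type
ex_l3_type_from_flags(uint64_t ol_flags)
{
	/* Assumption: an IP checksum request implies an IPv4 header. */
	if (ol_flags & EX_PKT_TX_IP_CKSUM)
		return EX_L3_IPV4_HW_CKSUM;
	if (ol_flags & EX_PKT_TX_IPV4)
		return EX_L3_IPV4_SW_CKSUM;
	if (ol_flags & EX_PKT_TX_IPV6)
		return EX_L3_IPV6;
	return EX_L3_UNKNOWN;
}
```

The point of the flags is that the TX function can fill in the descriptor
from ol_flags alone, without parsing the packet.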
> 
> Right, I missed these flags.
> It's indeed an API change, but maybe it makes sense, and setting it is not a 
> big
> cost for the application.
> 
> So I would also need to slightly modify the API help in the following
> patches:
>   - [04/13] mbuf: add help about TX checksum flags
>   - [10/13] mbuf: generic support for TCP segmentation offload
> 
> I'll send a v4 this afternoon that integrates this change.

After your patch is applied, I will send a patch with the i40e driver change
for VXLAN Tx checksum.

> Do you know precisely when the flags PKT_TX_IPV4 and PKT_TX_IPV6 must be
> set by the application? Is it only the hw checksum and tso use case?
> If yes, I'll add it in the API help too.
> 
> By the way (this is probably off-topic), but I'm wondering if the TX flags 
> should
> have the same values than the RX flags:
> 
>#define PKT_TX_IPV4  PKT_RX_IPV4_HDR
>#define PKT_TX_IPV6  PKT_RX_IPV6_HDR
> 
> > Apart from that , the patch looks good to me.
> > And yes, we would need to change the  the way we handle TX offload for
> tunnelled packets.
> 
> Thank you very much Konstantin for your review.
> 
> Regards,
> Olivier



[dpdk-dev] [PATCH 2/4] doc: Corrected info for tx_checksum set mask function, in testpmd UG

2014-11-26 Thread Thomas Monjalon
Hi Pablo,

2014-11-17 10:47, De Lara Guarch, Pablo:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> > On 11/15/2014 08:13 PM, Pablo de Lara wrote:
> > > tx_checksum set mask function now allows 4 extra bits in the mask
> > > for TX checksum offload
> > >
> > > Signed-off-by: Pablo de Lara 
> > > ---
> > >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   10 +-
> > >  1 files changed, 9 insertions(+), 1 deletions(-)
> > 
> > A patch reworking the csumonly API is pending:
> > http://dpdk.org/ml/archives/dev/2014-November/008188.html
> > 
> > I don't know if it will be accepted, but just to mention that
> > these 2 patches will conflict in this case.
> 
> Thanks for spotting it! I guess that at this point, all we can do is wait.
> If your patch gets applied before mine, I will send a v2 with the changes.
> If it gets applied after, then I will send another patch to fix it.

Olivier will send a v4 of his TSO patchset which should be applied shortly.
Please could you adjust the documentation and make a v2?

Bernard, we have to wait for this change.

Thanks to all
-- 
Thomas


[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

2014-11-26 Thread Ananyev, Konstantin


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 11:22 AM
> To: Ananyev, Konstantin; Zhang, Helin; dev at dpdk.org
> Cc: Cao, Waterman; Cao, Min
> Subject: Re: [PATCH] i40e: Use one bit flag for all hardware detected RX 
> packet errors
> 
> Hi Konstantin, Hi Helin,
> 
> On 11/26/2014 11:49 AM, Ananyev, Konstantin wrote:
> > Hi Helin,
> >
> >> -Original Message-
> >> From: Zhang, Helin
> >> Sent: Wednesday, November 26, 2014 6:07 AM
> >> To: dev at dpdk.org
> >> Cc: Cao, Waterman; Cao, Min; Ananyev, Konstantin; olivier.matz at 
> >> 6wind.com; Zhang, Helin
> >> Subject: [PATCH] i40e: Use one bit flag for all hardware detected RX 
> >> packet errors
> >>
> >> There were some bit flags of 0 for RX packet errors detected by hardware.
> >> Actually only one bit of error flag is enough for all hardware detected
> >> RX packet errors.
> >>
> >> Signed-off-by: Helin Zhang 
> >> ---
> >>   lib/librte_mbuf/rte_mbuf.h  |  6 +-
> >>   lib/librte_pmd_i40e/i40e_rxtx.c | 31 +++
> >>   2 files changed, 4 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >> index 5899e5c..897fd26 100644
> >> --- a/lib/librte_mbuf/rte_mbuf.h
> >> +++ b/lib/librte_mbuf/rte_mbuf.h
> >> @@ -80,11 +80,6 @@ extern "C" {
> >>   #define PKT_RX_FDIR  (1ULL << 2)  /**< RX packet with FDIR match 
> >> indicate. */
> >>   #define PKT_RX_L4_CKSUM_BAD  (1ULL << 3)  /**< L4 cksum of RX pkt. is 
> >> not OK. */
> >>   #define PKT_RX_IP_CKSUM_BAD  (1ULL << 4)  /**< IP cksum of RX pkt. is 
> >> not OK. */
> >> -#define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header 
> >> checksum error. */
> >> -#define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt 
> >> oversize. */
> >> -#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
> >> -#define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. 
> >> */
> >> -#define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
> >>   #define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4 
> >> header. */
> >>   #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended 
> >> IPv4 header. */
> >>   #define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6 
> >> header. */
> >> @@ -95,6 +90,7 @@ extern "C" {
> >>   #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with 
> >> IPv6 header. */
> >>   #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR 
> >> match. */
> >>   #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported 
> >> if FDIR match. */
> >> +#define PKT_RX_ERR_HW    (1ULL << 15) /**< RX packet error detected by hardware. */
> >>
> >>   #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q 
> >> VLAN packet. */
> >>   #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. 
> >> computed by NIC. */
> >> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c 
> >> b/lib/librte_pmd_i40e/i40e_rxtx.c
> >> index cce6911..3b2195d 100644
> >> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> >> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> >> @@ -115,35 +115,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword)
> >>   static inline uint64_t
> >>   i40e_rxd_error_to_pkt_flags(uint64_t qword)
> >>   {
> >> -  uint64_t flags = 0;
> >> -  uint64_t error_bits = (qword >> I40E_RXD_QW1_ERROR_SHIFT);
> >> -
> >> -#define I40E_RX_ERR_BITS 0x3f
> >> -  if (likely((error_bits & I40E_RX_ERR_BITS) == 0))
> >> -  return flags;
> >> -  /* If RXE bit set, all other status bits are meaningless */
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RXE_SHIFT))) {
> >> -  flags |= PKT_RX_MAC_ERR;
> >> -  return flags;
> >> -  }
> >> -
> >> -  /* If RECIPE bit set, all other status indications should be ignored */
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RECIPE_SHIFT))) {
> >> -  flags |= PKT_RX_RECIP_ERR;
> >> -  return flags;
> >> -  }
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_HBO_SHIFT)))
> >> -  flags |= PKT_RX_HBUF_OVERFLOW;
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_IPE_SHIFT)))
> >> -  flags |= PKT_RX_IP_CKSUM_BAD;
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_L4E_SHIFT)))
> >> -  flags |= PKT_RX_L4_CKSUM_BAD;
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_EIPE_SHIFT)))
> >> -  flags |= PKT_RX_EIP_CKSUM_BAD;
> >> -  if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_OVERSIZE_SHIFT)))
> >> -  flags |= PKT_RX_OVERSIZE;
> >> +  if (unlikely(qword & I40E_RXD_QW1_ERROR_MASK))
> >> +  return PKT_RX_ERR_HW;
> >
> > Probably I didn't explain myself clear enough, sorry.
> > I didn't suggest to get rid of setting bits that indicate L3/L4 checksum 
> > errors:
> > PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, PKT_RX_EIP_CKSUM_BAD.
> > I th

[dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6

2014-11-26 Thread Thomas Monjalon
2014-11-26 11:17, Michal Jastrzebski:
> From: Pawel Wodkowski 
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -254,8 +254,17 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>*/
>   nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
>nb_pkt_per_burst);
> +#ifndef RTE_LIBRTE_PMD_BOND
>   if (unlikely(nb_rx == 0))
>   return;
> +#else
> + if (unlikely(nb_rx == 0 && (fs->forward_timeout == 0 ||
> + fs->next_forward_time > rte_rdtsc())))
> + return;
> +
> + if (fs->forward_timeout != 0)
> + fs->next_forward_time = rte_rdtsc() + fs->forward_timeout;
> +#endif

I don't understand why you need to make such change for bonding,
and there is no comment to explain.
Bonding should be a PMD like any other and shouldn't require such change.
I don't know mode 4 but it seems there is a design problem here.

-- 
Thomas


[dpdk-dev] [PATCH v6 0/2] bond: mode 4 support

2014-11-26 Thread Jastrzebski, MichalX K
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, November 26, 2014 1:27 PM
> To: Jastrzebski, MichalX K
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 0/2] bond: mode 4 support
> 
> 2014-11-26 12:24, Jastrzebski, MichalX K:
> > Hi Thomas,
> > I put a brief description of mode 4 in patch 0/2 (it's under revision 
> > history) -
> I thought this is the best place. In patch 1/2 and 2/2 I put only one phrase
> telling what this particular patch do.
> > Would you like me to move this description to patch 1/2?
> 
> Yes please, the cover letter (0/2) is not a patch and won't go in git history.
> But please wait before sending a new version, I have a comment for patch
> 2/2.
> 
> Please, don't top post.
> 
> > > -Original Message-
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > Sent: Wednesday, November 26, 2014 1:17 PM
> > > To: Jastrzebski, MichalX K
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v6 0/2] bond: mode 4 support
> > >
> > > Hi Michal,
> > >
> > > 2014-11-26 11:17, Michal Jastrzebski:
> > > > v6 changes
> > > > - add commit log description to link bonding mode 4
> > >
> > > Please check your patches, I don't see any description.
> > >
> > > --
> > > Thomas

Hi Thomas,
I will hold off on v7 until your doubts about patch 2/2 are resolved.

p.s.
Sorry for top posting. 


[dpdk-dev] [PATCH v6 0/2] bond: mode 4 support

2014-11-26 Thread Thomas Monjalon
2014-11-26 12:24, Jastrzebski, MichalX K:
> Hi Thomas,
> I put a brief description of mode 4 in patch 0/2 (it's under revision 
> history) - I thought this is the best place. In patch 1/2 and 2/2 I put only 
> one phrase telling what this particular patch do.
> Would you like me to move this description to patch 1/2?

Yes please, the cover letter (0/2) is not a patch and won't go in git history.
But please wait before sending a new version, I have a comment for patch 2/2.

Please, don't top post.

> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Wednesday, November 26, 2014 1:17 PM
> > To: Jastrzebski, MichalX K
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v6 0/2] bond: mode 4 support
> > 
> > Hi Michal,
> > 
> > 2014-11-26 11:17, Michal Jastrzebski:
> > > v6 changes
> > > - add commit log description to link bonding mode 4
> > 
> > Please check your patches, I don't see any description.
> > 
> > --
> > Thomas



[dpdk-dev] [PATCH 2/4] doc: Corrected info for tx_checksum set mask function, in testpmd UG

2014-11-26 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, November 26, 2014 12:41 PM
> To: De Lara Guarch, Pablo
> Cc: dev at dpdk.org; Olivier MATZ; Iremonger, Bernard
> Subject: Re: [dpdk-dev] [PATCH 2/4] doc: Corrected info for tx_checksum set
> mask function, in testpmd UG
> 
> Hi Pablo,
> 
> 2014-11-17 10:47, De Lara Guarch, Pablo:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> > > On 11/15/2014 08:13 PM, Pablo de Lara wrote:
> > > > tx_checksum set mask function now allows 4 extra bits in the mask
> > > > for TX checksum offload
> > > >
> > > > Signed-off-by: Pablo de Lara 
> > > > ---
> > > >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   10 +-
> > > >  1 files changed, 9 insertions(+), 1 deletions(-)
> > >
> > > A patch reworking the csumonly API is pending:
> > > http://dpdk.org/ml/archives/dev/2014-November/008188.html
> > >
> > > I don't know if it will be accepted, but just to mention that
> > > these 2 patches will conflict in this case.
> >
> > Thanks for spotting it! I guess that at this point, all we can do is wait.
> > If your patch gets applied before mine, I will send a v2 with the changes.
> > If it gets applied after, then I will send another patch to fix it.
> 
> Olivier will send a v4 of his TSO patchset which should be applied shortly.
> Please could you adjust the documentation and make a v2?
> 
> Bernard, we have to wait for this change.

Sure, no problem.
> 
> Thanks to all
> --
> Thomas


[dpdk-dev] [PATCH v6 0/2] bond: mode 4 support

2014-11-26 Thread Thomas Monjalon
Hi Michal,

2014-11-26 11:17, Michal Jastrzebski:
> v6 changes
> - add commit log description to link bonding mode 4

Please check your patches, I don't see any description.

-- 
Thomas


[dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6

2014-11-26 Thread Wodkowski, PawelX
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, November 26, 2014 1:31 PM
> To: Jastrzebski, MichalX K
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6
> 
> 2014-11-26 11:17, Michal Jastrzebski:
> > From: Pawel Wodkowski 
> > --- a/app/test-pmd/csumonly.c
> > +++ b/app/test-pmd/csumonly.c
> > @@ -254,8 +254,17 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
> >  */
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> >  nb_pkt_per_burst);
> > +#ifndef RTE_LIBRTE_PMD_BOND
> > if (unlikely(nb_rx == 0))
> > return;
> > +#else
> > +   if (unlikely(nb_rx == 0 && (fs->forward_timeout == 0 ||
> > +   fs->next_forward_time > rte_rdtsc())))
> > +   return;
> > +
> > +   if (fs->forward_timeout != 0)
> > +   fs->next_forward_time = rte_rdtsc() + fs->forward_timeout;
> > +#endif
> 
> I don't understand why you need to make such change for bonding,
> and there is no comment to explain.
> Bonding should be a PMD like any other and shouldn't require such change.
> I don't know mode 4 but it seems there is a design problem here.
> 

It is a consequence of a requirement that was set at the beginning of the
bonding implementation: the bonded interface should be transparent to the
user app. But this requirement conflicts with mode 4, which needs to
periodically receive and transmit frames (LACP and marker) that are not
passed to the user app but are processed/produced in the background. If this
does not happen at least 10 times per second, mode 4 will not work.

Most (all?) user applications do RX/TX more often than 10 times per second,
so this will have a negligible impact on those apps (they will have to
respect this 100 ms maximum interval between rx/tx calls, as I did in the
code you pointed to).

We discussed all the options with Declan and Bruce, and this seems to be the
most transparent way to implement mode 4 without using any kind of locking
inside the library.
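
The timeout check described above can be sketched in isolation. This is only
an illustration of the logic in the quoted hunk, not the testpmd code itself:
the struct and function names are hypothetical, and the 'now' argument stands
in for the value that rte_rdtsc() would return.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for testpmd's struct fwd_stream. */
struct fwd_stream_sketch {
	uint64_t forward_timeout;   /* 0 means no periodic TX is required */
	uint64_t next_forward_time; /* deadline for the next forced TX, in ticks */
};

/*
 * Return true if the forwarding loop may return early after an RX burst,
 * false if it must continue to the TX path (packets were received, or the
 * periodic deadline has been reached). 'now' stands in for rte_rdtsc().
 */
static bool
can_skip_tx(struct fwd_stream_sketch *fs, uint16_t nb_rx, uint64_t now)
{
	if (nb_rx == 0 &&
	    (fs->forward_timeout == 0 || fs->next_forward_time > now))
		return true;

	/* TX will be called: push the deadline one timeout into the future. */
	if (fs->forward_timeout != 0)
		fs->next_forward_time = now + fs->forward_timeout;
	return false;
}
```

With forward_timeout left at 0 the function reduces to the original
"nb_rx == 0" early return, so non-bonded ports behave as before.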

Pawe?


[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Ananyev, Konstantin


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Wednesday, November 26, 2014 11:15 AM
> To: Ananyev, Konstantin; dev at dpdk.org
> Cc: Walukiewicz, Miroslaw; Liu, Jijiang; Liu, Yong; jigsaw at gmail.com; 
> Richardson, Bruce
> Subject: Re: [PATCH v3 08/13] testpmd: rework csum forward engine
> 
> Hi Konstantin,
> 
> On 11/26/2014 11:10 AM, Ananyev, Konstantin wrote:
> > As I can see you removed code that sets up TX_PKT_IPV4 and TX_PKT_IPV6  of 
> > ol_flags.
> > I think that we need to keep it.
> > The reason for that is:
> > With FVL, to make HW TX checksum offload work, SW is responsible to provide 
> > to the HW information about L3 header.
> > Possible values are:
> > - IPv4 hdr with HW checksum calculation
> > - IPV4 hdr (checksum done by SW)
> > - IPV6 hdr
> > - unknown
> > > So, let's say, for the packet: ETHER_HDR/IPV6_HDR/TCP_HDR/DATA
> > To request HW TCP checksum offload,  SW have to provide to HW information 
> > that it is a packet with IPV6 header
> > (plus as for ixgbe: l2_hdr_len, l3_hdr_len, l4_type, l4_hdr_len).
> > That's why TX_PKT_IPV4 and TX_PKT_IPV6   were introduced.
> >
> > Yes, it is  a change in public API for HW TX offload, but I don't see any 
> > other way we can overcome it
> > (apart from make TX function itself to parse a packet, which is obviously 
> > not a good choice).
> > Note that existing apps working on existing HW (ixgbe/igb/em) are not 
> > affected.
> > Though apps that supposed to be run on FVL HW too have to follow new 
> > convention.
> >
> > So I suggest we keep setting these flags in csumonly.c
> 
> Right, I missed these flags.
> It's indeed an API change, but maybe it makes sense, and setting it
> is not a big cost for the application.
> 
> So I would also need to slightly modify the API help in the following
> patches:
>   - [04/13] mbuf: add help about TX checksum flags
>   - [10/13] mbuf: generic support for TCP segmentation offload
> 
> I'll send a v4 this afternoon that integrates this change.

Ok, thanks.

> 
> Do you know precisely when the flags PKT_TX_IPV4 and PKT_TX_IPV6 must
> be set by the application? Is it only the hw checksum and tso use case?

Yes, I believe it should be set only for hw checksum and tso.

> If yes, I'll add it in the API help too.
> 
> By the way (this is probably off-topic), but I'm wondering if the TX
> flags should have the same values than the RX flags:
> 
>#define PKT_TX_IPV4  PKT_RX_IPV4_HDR
>#define PKT_TX_IPV6  PKT_RX_IPV6_HDR

Thought about that too.
On one hand, it goes a bit against our concept of separate RX and TX flags.
On the other hand, it allows us to save 2 bits in the ol_flags.
Don't have any strong opinion here.
What do you think?  
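
To make the L3 header information discussed above concrete, here is a rough
sketch of how an application could map the four cases to offload flags. The
SKETCH_* bit values, the enum and the helper are made up for illustration;
the real flag definitions live in rte_mbuf.h, and setting both the IPv4 flag
and the IP checksum flag for the "HW checksum" case is an assumption of this
sketch.

```c
#include <stdint.h>

/* Illustrative bit positions only; not the values used by rte_mbuf.h. */
#define SKETCH_TX_TCP_CKSUM (1ULL << 0)
#define SKETCH_TX_IP_CKSUM  (1ULL << 1)
#define SKETCH_TX_IPV4      (1ULL << 2)
#define SKETCH_TX_IPV6      (1ULL << 3)

/* The four L3 cases listed in the quoted mail. */
enum l3_kind_sketch {
	L3_UNKNOWN,
	L3_IPV4_HW_CKSUM, /* IPv4 header, IP checksum computed by HW */
	L3_IPV4_SW_CKSUM, /* IPv4 header, IP checksum already done by SW */
	L3_IPV6,
};

/* Build the flags for a packet that requests HW TCP checksum offload. */
static uint64_t
tcp_cksum_tx_flags(enum l3_kind_sketch l3)
{
	uint64_t flags = SKETCH_TX_TCP_CKSUM;

	switch (l3) {
	case L3_IPV4_HW_CKSUM:
		flags |= SKETCH_TX_IPV4 | SKETCH_TX_IP_CKSUM;
		break;
	case L3_IPV4_SW_CKSUM:
		flags |= SKETCH_TX_IPV4;
		break;
	case L3_IPV6:
		flags |= SKETCH_TX_IPV6;
		break;
	case L3_UNKNOWN:
		break; /* no L3 hint is given to the HW */
	}
	return flags;
}
```

For the ETHER_HDR/IPV6_HDR/TCP_HDR/DATA packet the application would use
L3_IPV6 and also fill in the header lengths, as it already does for ixgbe.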

> 
> > Apart from that , the patch looks good to me.
> > And yes, we would need to change the  the way we handle TX offload for 
> > tunnelled packets.
> 
> Thank you very much Konstantin for your review.
> 
> Regards,
> Olivier



[dpdk-dev] [PATCH v6 0/2] bond: mode 4 support

2014-11-26 Thread Jastrzebski, MichalX K
Hi Thomas,
I put a brief description of mode 4 in patch 0/2 (it's under the revision
history); I thought this was the best place. In patches 1/2 and 2/2 I put
only one phrase describing what that particular patch does.
Would you like me to move this description to patch 1/2?

Best regards
Michal

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, November 26, 2014 1:17 PM
> To: Jastrzebski, MichalX K
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6 0/2] bond: mode 4 support
> 
> Hi Michal,
> 
> 2014-11-26 11:17, Michal Jastrzebski:
> > v6 changes
> > - add commit log description to link bonding mode 4
> 
> Please check your patches, I don't see any description.
> 
> --
> Thomas


[dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver

2014-11-26 Thread Zhou, Danny

> -Original Message-
> From: Walukiewicz, Miroslaw
> Sent: Wednesday, November 26, 2014 6:45 PM
> To: Zhou, Danny; Richardson, Bruce
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
> 
> Thank you for explanation.
> 
> I have a few  questions regarding the setup flow yet:
> 
> 1. Why we need this step:
> >   3. Setup a flow director rule to distribute packets with source ip
> > > > > > >  0.0.0.0 to rxq No.0
> > > > > > >   > ethtool -N eth0  flow-type udp4 src-ip 0.0.0.0 action 0
> 
DZ: By default, the ixgbe kernel driver uses 32 (0-31) rx/tx queue pairs. The
above example sets up a filter to route a UDP flow with src_ip 0.0.0.0 to
queue No.0, which is used by the kernel driver's rx/tx routine.

> 
> 2. You presented the filter setup for receiving all udp4 packets on specific 
> queue
> > > > > > >   5. Setup a flow director rule to distribute packets with source 
> > > > > > > ip
> > > > > > >  1.1.1.1 to rxq No.32. This needs to be done after testpmd 
> > > > > > > starts.
> > > > > > >   > ethtool -N eth0 flow-type udp4 src-ip 1.1.1.1 action 32
> 
> How to configure flow director to receive all packets with dst-ip = 1.1.1.1 
> on qpair=32?
DZ: You can certainly do it with an ethtool command line like "ethtool -N eth0
flow-type udp4 dst-ip 1.1.1.1 action 32".

> Will TCP SYN packets caught by such filter setup?
DZ: Unfortunately, unlike DPDK, which provides the ixgbe_add_syn_filter() API
to program the SYN Packet Queue Filter register, the in-kernel ixgbe driver
does not touch that register, although I have seen the ixgbe 3.18.7 driver
hard-code a value in it. In either case, there is no easy way to configure it
through the ixgbe bifurcated driver. Under bifurcated mode, DPDK cannot
access that register.

> 3.  Do we have a possibility to setup a rule like:
> Forward all TCPv4 rx packets with dst-ip =1.1.1.1 and TCP port  to 
> qpair=32 including SYN packets?
DZ: Yes, ethtool and flow director support that. I will send you a separate
email about using ethtool for flow director configuration.

> 3. In your application example you present that qpair number (32) is known 
> before start of application
> > > > > > >   > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 \
> > > > > > >   >  --vdev=rte_bifurc,iface=eth0,qpairs=1 -- \
> > > > > > >   >  -i --rxfreet=32 --txfreet=32 --txrst=32
> 
> Is there a possibility to dynamic queue allocation? I ask about API.
>  I mean dynamic attaching and detaching queue from application level and not 
> specifying the numbers in the command line.
> 
DZ: The example is just an experiment. When DPDK requests queue pairs from the
ixgbe bifurcated driver, it only specifies the number of qpairs; the kernel
driver returns the absolute qpair indexes of the assigned qpairs to the
application. The application can then use them to invoke the ethtool command
line, or directly issue an IOCTL to the bifurcated driver, to set up FD.

> 4. Is there a possibility to create a rule with perfect match and directing 
> the packets to the specific queue.
> I mean here a rule like:
> Forward all TCPv4 rx packets with dst-ip=1.1.1.1 src-ip=2.2.2.2 dst-port= 
> src-port=1234 to queue 33
> 
DZ: Yes, of course you can.

> Regards,
> 
> Mirek
> 
> > -Original Message-
> > From: Zhou, Danny
> > Sent: Tuesday, November 25, 2014 4:23 PM
> > To: Richardson, Bruce; Walukiewicz, Miroslaw
> > Cc: dev at dpdk.org
> > Subject: RE: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Tuesday, November 25, 2014 11:03 PM
> > > To: Walukiewicz, Miroslaw
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
> > >
> > > On Tue, Nov 25, 2014 at 02:57:13PM +, Walukiewicz, Miroslaw wrote:
> > > > Thank you Bruce for explanation of the idea.
> > >
> > > Actually, credit goes to Steve Liang, not me, for the explanation. :-)
> > >
> > > >
> > > > > I have question regarding TCP SYN packets? Do you have any idea how to
> > > > > share the TCP SYN requests between kernel and user-space application?
> > >
> > > As I'm giving the credit to Steve, I'll also pass the buck for answering 
> > > that
> > > question to him too! :-)
> > >
> > > /Bruce
> >
> > On ixgbe' Rx queuing flow, match SYN filter stage is prior to Flow Director
> > filter stage. When working at bifurcated driver support mode,
> > DPDK cannot access those NIC registers except for the ones that are used to
> > rx/tx packets for assigned rx/tx queue pairs. So basically it really
> > depends on user to use ethtool or other interface to setup SYN filter via
> > ixgbe bifurcated driver. User can distribute TCP SYN packets to
> > kernel bifurcated driver owned rx queues or DPDK owned rx queues, for the
> > latter case, DPDK can still push them back to kernel via KNI if DPDK
> > does not want 

[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

2014-11-26 Thread Olivier MATZ
Hi Konstantin, Hi Helin,

On 11/26/2014 11:49 AM, Ananyev, Konstantin wrote:
> Hi Helin,
>
>> -Original Message-
>> From: Zhang, Helin
>> Sent: Wednesday, November 26, 2014 6:07 AM
>> To: dev at dpdk.org
>> Cc: Cao, Waterman; Cao, Min; Ananyev, Konstantin; olivier.matz at 6wind.com; 
>> Zhang, Helin
>> Subject: [PATCH] i40e: Use one bit flag for all hardware detected RX packet 
>> errors
>>
>> There were some bit flags of 0 for RX packet errors detected by hardware.
>> Actually only one bit of error flag is enough for all hardware detected
>> RX packet errors.
>>
>> Signed-off-by: Helin Zhang 
>> ---
>>   lib/librte_mbuf/rte_mbuf.h  |  6 +-
>>   lib/librte_pmd_i40e/i40e_rxtx.c | 31 +++
>>   2 files changed, 4 insertions(+), 33 deletions(-)
>>
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 5899e5c..897fd26 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -80,11 +80,6 @@ extern "C" {
>>   #define PKT_RX_FDIR  (1ULL << 2)  /**< RX packet with FDIR match 
>> indicate. */
>>   #define PKT_RX_L4_CKSUM_BAD  (1ULL << 3)  /**< L4 cksum of RX pkt. is not 
>> OK. */
>>   #define PKT_RX_IP_CKSUM_BAD  (1ULL << 4)  /**< IP cksum of RX pkt. is not 
>> OK. */
>> -#define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header checksum 
>> error. */
>> -#define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt 
>> oversize. */
>> -#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
>> -#define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
>> -#define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
>>   #define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4 header. 
>> */
>>   #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended 
>> IPv4 header. */
>>   #define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6 header. 
>> */
>> @@ -95,6 +90,7 @@ extern "C" {
>>   #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with 
>> IPv6 header. */
>>   #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR 
>> match. */
>>   #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if 
>> FDIR match. */
>> +#define PKT_RX_ERR_HW(1ULL << 15) /**< RX packet error detected by 
>> hardware. */
>>
>>   #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
>> packet. */
>>   #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. 
>> computed by NIC. */
>> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c 
>> b/lib/librte_pmd_i40e/i40e_rxtx.c
>> index cce6911..3b2195d 100644
>> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
>> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
>> @@ -115,35 +115,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword)
>>   static inline uint64_t
>>   i40e_rxd_error_to_pkt_flags(uint64_t qword)
>>   {
>> -uint64_t flags = 0;
>> -uint64_t error_bits = (qword >> I40E_RXD_QW1_ERROR_SHIFT);
>> -
>> -#define I40E_RX_ERR_BITS 0x3f
>> -if (likely((error_bits & I40E_RX_ERR_BITS) == 0))
>> -return flags;
>> -/* If RXE bit set, all other status bits are meaningless */
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RXE_SHIFT))) {
>> -flags |= PKT_RX_MAC_ERR;
>> -return flags;
>> -}
>> -
>> -/* If RECIPE bit set, all other status indications should be ignored */
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RECIPE_SHIFT))) {
>> -flags |= PKT_RX_RECIP_ERR;
>> -return flags;
>> -}
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_HBO_SHIFT)))
>> -flags |= PKT_RX_HBUF_OVERFLOW;
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_IPE_SHIFT)))
>> -flags |= PKT_RX_IP_CKSUM_BAD;
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_L4E_SHIFT)))
>> -flags |= PKT_RX_L4_CKSUM_BAD;
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_EIPE_SHIFT)))
>> -flags |= PKT_RX_EIP_CKSUM_BAD;
>> -if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_OVERSIZE_SHIFT)))
>> -flags |= PKT_RX_OVERSIZE;
>> +if (unlikely(qword & I40E_RXD_QW1_ERROR_MASK))
>> +return PKT_RX_ERR_HW;
>
> Probably I didn't explain myself clear enough, sorry.
> I didn't suggest to get rid of setting bits that indicate L3/L4 checksum 
> errors:
> PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, PKT_RX_EIP_CKSUM_BAD.
> I think these flags should be set as before.
>
> I was talking only about collapsing only these 4 RX error flags into one:
>
> #define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt 
> oversize. */
> #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
> #define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
> #define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
>
>  From my point of view the difference of these 2

[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Olivier MATZ
Hi Konstantin,

On 11/26/2014 11:10 AM, Ananyev, Konstantin wrote:
> As I can see you removed code that sets up TX_PKT_IPV4 and TX_PKT_IPV6  of 
> ol_flags.
> I think that we need to keep it.
> The reason for that is:
> With FVL, to make HW TX checksum offload work, SW is responsible to provide 
> to the HW information about L3 header.
> Possible values are:
> - IPv4 hdr with HW checksum calculation
> - IPV4 hdr (checksum done by SW)
> - IPV6 hdr
> - unknown
> So, let's say, for the packet: ETHER_HDR/IPV6_HDR/TCP_HDR/DATA
> To request HW TCP checksum offload,  SW have to provide to HW information 
> that it is a packet with IPV6 header
> (plus as for ixgbe: l2_hdr_len, l3_hdr_len, l4_type, l4_hdr_len).
> That's why TX_PKT_IPV4 and TX_PKT_IPV6   were introduced.
>
> Yes, it is  a change in public API for HW TX offload, but I don't see any 
> other way we can overcome it
> (apart from make TX function itself to parse a packet, which is obviously not 
> a good choice).
> Note that existing apps working on existing HW (ixgbe/igb/em) are not 
> affected.
> Though apps that supposed to be run on FVL HW too have to follow new 
> convention.
>
> So I suggest we keep setting these flags in csumonly.c

Right, I missed these flags.
It's indeed an API change, but maybe it makes sense, and setting it
is not a big cost for the application.

So I would also need to slightly modify the API help in the following
patches:
  - [04/13] mbuf: add help about TX checksum flags
  - [10/13] mbuf: generic support for TCP segmentation offload

I'll send a v4 this afternoon that integrates this change.

Do you know precisely when the flags PKT_TX_IPV4 and PKT_TX_IPV6 must
be set by the application? Is it only the hw checksum and tso use case?
If yes, I'll add it in the API help too.

By the way (this is probably off-topic), but I'm wondering if the TX
flags should have the same values than the RX flags:

   #define PKT_TX_IPV4  PKT_RX_IPV4_HDR
   #define PKT_TX_IPV6  PKT_RX_IPV6_HDR

> Apart from that , the patch looks good to me.
> And yes, we would need to change the  the way we handle TX offload for 
> tunnelled packets.

Thank you very much Konstantin for your review.

Regards,
Olivier



[dpdk-dev] [PULL REQUEST] doc: document modifications in testpmd_app_ug and freebsd_gsg

2014-11-26 Thread Bernard Iremonger
These changes are DPDK 1.8 modifications and some corrections to the
TestPMD Application User Guide and the FreeBSD Getting Started Guide.

The following changes since commit c4f136db8ec532c3c930be5698cc84321c64192d:

  eal/linux: map pci memory resources after hugepages (2014-11-25 18:16:41 
+0100)

are available in the git repository at:
  git://dpdk.org/next/dpdk-doc  master

Bruce Richardson (3):
  doc: change hardcoded date to auto-generated
  doc: adjust line lengths in FreeBSD GSG rst files
  doc: update FreeBSD GSG to document ports install

Pablo de Lara (4):
  doc: Added new commands in testpmd UG
  doc: Corrected info for tx_checksum set mask function, in testpmd UG
  doc: Moved commands in testpmd UG to match testpmd command help order
  doc: Various document fixes in testpmd UG

 doc/guides/freebsd_gsg/build_dpdk.rst |  295 +--
 doc/guides/freebsd_gsg/build_sample_apps.rst  |   96 ++-
 doc/guides/freebsd_gsg/index.rst  |4 +-
 doc/guides/freebsd_gsg/install_from_ports.rst |  162 
 doc/guides/freebsd_gsg/intro.rst  |   55 +-
 doc/guides/freebsd_gsg/sys_reqs.rst   |  163 
 doc/guides/linux_gsg/index.rst|2 +-
 doc/guides/prog_guide/index.rst   |2 +-
 doc/guides/rel_notes/index.rst|2 +-
 doc/guides/sample_app_ug/index.rst|2 +-
 doc/guides/testpmd_app_ug/index.rst   |2 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst   | 1085 +
 12 files changed, 1034 insertions(+), 836 deletions(-)
 create mode 100644 doc/guides/freebsd_gsg/install_from_ports.rst
 delete mode 100644 doc/guides/freebsd_gsg/sys_reqs.rst


[dpdk-dev] [PATCH 10/10] eal: add option --master-lcore

2014-11-26 Thread Simon Kuenzer
On 25.11.2014 14:39, Bruce Richardson wrote:
> On Tue, Nov 25, 2014 at 01:45:22PM +0100, Thomas Monjalon wrote:
>> Hi Simon,
>>
>> 2014-11-25 10:09, Simon Kuenzer:
>>> thanks for your work. I have one (minor) comment for this patch that
>>> should be fixed in a later version.
>>
>>>> +  /* default master lcore is the first one */
>>>> +  if (cfg->master_lcore == 0)
>>>> +  cfg->master_lcore = rte_get_next_lcore(-1, 0, 0);
>>>> +
>>>
>>> Might be confusing if a user specifies --master-lcore 0 and uses a
>>> coremask/corelist where core id 0 is not specified.
>>
>> Yes, in this corner case, master-lcore will be adjusted instead of having
>> an error.
>>
>>> What about setting cfg->master_lcore to (RTE_MAX_LCORE + 1) on
>>> initialization in order to distinguish if a master_lcore got specified
>>> by the user or not?
>>
>> Even simpler, I can fix it by introducing a flag master_lcore_parsed and
>> do the adjustment only if the option is not parsed.
>>
> I agree that that sounds like a simpler approach, since we already have flags
> for what args are parsed or not.
>
> /Bruce
>

Fine with me :-). I also agree that having the flag is even a cleaner 
solution.
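
For the archive, the flag approach agreed on above could look roughly like
this. It is a minimal sketch, not the actual EAL code: the struct, the helper
names and the 64-bit coremask are assumptions, and 'first_enabled_lcore'
stands in for the result of rte_get_next_lcore(-1, 0, 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the EAL configuration. */
struct cfg_sketch {
	unsigned int master_lcore;
	bool master_lcore_parsed; /* set when --master-lcore is given */
};

/* Apply the default only if the user did not pass --master-lcore. */
static void
adjust_master_lcore(struct cfg_sketch *cfg, unsigned int first_enabled_lcore)
{
	if (!cfg->master_lcore_parsed)
		cfg->master_lcore = first_enabled_lcore;
}

/* Return 0 if the master lcore is enabled in the coremask, -1 otherwise. */
static int
check_master_lcore(const struct cfg_sketch *cfg, uint64_t coremask)
{
	if (cfg->master_lcore >= 64)
		return -1;
	return ((coremask >> cfg->master_lcore) & 1ULL) ? 0 : -1;
}
```

This way an explicit "--master-lcore 0" with a coremask that excludes core 0
is reported as an error instead of being silently adjusted.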

Thanks,

Simon


[dpdk-dev] [PATCH v6 2/2] testpmd: add mode 4 support v6

2014-11-26 Thread Michal Jastrzebski
From: Pawel Wodkowski 

This patch adds mode 4 support to the testpmd application.

Signed-off-by: Pawel Wodkowski 
---
 app/test-pmd/cmdline.c  |   28 ++--
 app/test-pmd/csumonly.c |9 
 app/test-pmd/icmpecho.c |   21 +-
 app/test-pmd/iofwd.c|9 
 app/test-pmd/macfwd-retry.c |9 
 app/test-pmd/macfwd.c   |9 
 app/test-pmd/macswap.c  |9 
 app/test-pmd/testpmd.c  |   50 +--
 app/test-pmd/testpmd.h  |   11 --
 9 files changed, 144 insertions(+), 11 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index bb4e75c..7d1b38e 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -41,6 +41,8 @@
 #include 
 #include 
 #include 
+
+#include "rte_eth_bond_8023ad.h"
 #ifndef __linux__
 #ifndef __FreeBSD__
 #include 
@@ -87,6 +89,7 @@
 #include 
 #ifdef RTE_LIBRTE_PMD_BOND
 #include 
+#include 
 #endif

 #include "testpmd.h"
@@ -3441,13 +3444,18 @@ static void cmd_show_bonding_config_parsed(void 
*parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_show_bonding_config_result *res = parsed_result;
+   struct rte_eth_bond_8023ad_slave_info slave_info;
+   static const char * const state_labels[] = {
+   "ACT", "TIMEOUT", "AGG", "SYNC", "COL", "DIST", "DEF", "EXP"
+   };
int bonding_mode;
uint8_t slaves[RTE_MAX_ETHPORTS];
int num_slaves, num_active_slaves;
int primary_id;
-   int i;
+   int i, j;
portid_t port_id = res->port_id;

+
/* Display the bonding mode.*/
bonding_mode = rte_eth_bond_mode_get(port_id);
if (bonding_mode < 0) {
@@ -3456,7 +3464,8 @@ static void cmd_show_bonding_config_parsed(void 
*parsed_result,
} else
printf("\tBonding mode: %d\n", bonding_mode);

-   if (bonding_mode == BONDING_MODE_BALANCE) {
+   if (bonding_mode == BONDING_MODE_BALANCE ||
+   bonding_mode == BONDING_MODE_8023AD) {
int balance_xmit_policy;

balance_xmit_policy = rte_eth_bond_xmit_policy_get(port_id);
@@ -3513,6 +3522,19 @@ static void cmd_show_bonding_config_parsed(void 
*parsed_result,

printf("%d]\n", slaves[num_active_slaves - 1]);

+   if (bonding_mode == BONDING_MODE_8023AD) {
+   for (i = 0; i < num_active_slaves; i++) {
+   rte_eth_bond_8023ad_slave_info(port_id, 
slaves[i], &slave_info);
+
+   printf("\tSlave %u state: ", slaves[i]);
+   for (j = 0; j < 8; j++) {
+   if ((slave_info.actor_state >> j) & 1)
+   printf("%s ", state_labels[j]);
+   }
+   printf("\n");
+   }
+   }
+
} else {
printf("\tActive Slaves: []\n");

@@ -3760,6 +3782,8 @@ static void cmd_create_bonded_device_parsed(void 
*parsed_result,
/* Update number of ports */
nb_ports = rte_eth_dev_count();
reconfig(port_id, res->socket);
+   /* Save bonding mode here as it is constat. */
+   ports[port_id].bond_mode = res->mode;
rte_eth_promiscuous_enable(port_id);
}

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 8d10bfd..c433eea 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -254,8 +254,17 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 */
nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
 nb_pkt_per_burst);
+#ifndef RTE_LIBRTE_PMD_BOND
if (unlikely(nb_rx == 0))
return;
+#else
+   if (unlikely(nb_rx == 0 && (fs->forward_timeout == 0 ||
+   fs->next_forward_time > rte_rdtsc())))
+   return;
+
+   if (fs->forward_timeout != 0)
+   fs->next_forward_time = rte_rdtsc() + fs->forward_timeout;
+#endif

 #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 774924e..bcd5ffb 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -293,6 +293,9 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
uint16_t arp_pro;
uint8_t  i;
int l2_len;
+#ifdef RTE_LIBRTE_PMD_BOND
+   uint8_t force_tx_burst;
+#endif
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
uint64_t end_tsc;
@@ -308,8 +311,20 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
 */
nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
 nb_pkt_per_burst);
+#ifndef RTE_LIB

[dpdk-dev] [PATCH v6 1/2] bond: add mode 4 support v6

2014-11-26 Thread Michal Jastrzebski
From: Pawel Wodkowski 

This patch adds mode 4 support alongside the other link bonding modes.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_ether/rte_ether.h  |1 +
 lib/librte_pmd_bond/Makefile  |2 +
 lib/librte_pmd_bond/rte_eth_bond.h|5 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c | 1216 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h |  214 
 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h |  308 ++
 lib/librte_pmd_bond/rte_eth_bond_api.c|   91 +-
 lib/librte_pmd_bond/rte_eth_bond_args.c   |1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c|  262 -
 lib/librte_pmd_bond/rte_eth_bond_private.h|   31 +-
 10 files changed, 2083 insertions(+), 48 deletions(-)
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.h
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 187608d..7e7d22c 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -328,6 +328,7 @@ struct vxlan_hdr {
 #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
 #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
 #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
+#define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker). */

 #define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
 /**< VXLAN tunnel header length. */
diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile
index d4e10bf..cdff126 100644
--- a/lib/librte_pmd_bond/Makefile
+++ b/lib/librte_pmd_bond/Makefile
@@ -45,6 +45,7 @@ CFLAGS += $(WERROR_FLAGS)
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c

 ifeq ($(CONFIG_RTE_MBUF_REFCNT),n)
 $(info WARNING: Link Bonding Broadcast mode is disabled because it needs 
MBUF_REFCNT.)
@@ -54,6 +55,7 @@ endif
 # Export include files
 #
 SYMLINK-y-include += rte_eth_bond.h
+SYMLINK-y-include += rte_eth_bond_8023ad.h

 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += lib/librte_mbuf
diff --git a/lib/librte_pmd_bond/rte_eth_bond.h 
b/lib/librte_pmd_bond/rte_eth_bond.h
index 085500b..dd04743 100644
--- a/lib/librte_pmd_bond/rte_eth_bond.h
+++ b/lib/librte_pmd_bond/rte_eth_bond.h
@@ -77,6 +77,11 @@ extern "C" {
  * In this mode all transmitted packets will be transmitted on all available
  * active slaves of the bonded. */
 #endif
+#define BONDING_MODE_8023AD(4)
+/**< 802.3AD (Mode 4).
+ * In this mode transmission and reception of packets is managed by LACP
+ * protocol specified in 802.3AD documentation. */
+
 /* Balance Mode Transmit Policies */
 #define BALANCE_XMIT_POLICY_LAYER2 (0)
 /**< Layer 2 (Ethernet MAC) */
diff --git a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c 
b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
new file mode 100644
index 000..5a0714e
--- /dev/null
+++ b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
@@ -0,0 +1,1216 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF 

[dpdk-dev] [PATCH v6 0/2] bond: mode 4 support

2014-11-26 Thread Michal Jastrzebski
v6 changes
- add commit log description to link bonding mode 4

v5 changes
- fix compilation errors when CONFIG_RTE_LIBRTE_PMD_BOND=n

v4 changes:
- fix compilation error when building without mbuf refcnt
- testpmd: add slave state information in "show bonding config X" command
- change patch dependency to Declan Doherty v6

v3 changes:
This is a rework of the previous patchset. Basic functionality is the same, but
it contains the following changes:
- use one global array of slaves instead of a separate instance for every bonded
  device (reduces memory usage). This also allows use of the port id instead of
  offsetting into the current active slaves.
- make mode 4 immune to partners that violate the timings of the standard.
- fix possible buffer overflow in RX function if the caller provides a buffer
  that is smaller than the number of received packets (additional slow packets).
- change/fix promiscuous mode and MAC management.
- fix compile issues on gcc versions older than 4.5
- add an API for tuning mode 4 parameters and expose the mode 4 frame structures.
- prevent flooding the console with warning messages if mode 4 RX/TX buffers are full.

test-pmd:
- add mode 4 support (force periodic TX if no packets received during 100ms
  period). Some forwarding modes (e.g. rx only) do not allow mode 4 usage.
- 'port start X' - check if X is valid value

v2 changes:
New version handles race issues with setting/cancelling callbacks,
fixes promiscuous mode setting in mode 4 and some other minor errors in the mode 4
implementation.

changes not related to mode 4:
- fix memcpy() usage in bond_ethdev_tx_burst_balance() (OOM/undefined behaviour
  if TX burst fails)

This patch set adds support for dynamic link aggregation (mode 4) to the
librte_pmd_bond library. This mode provides auto negotiation/configuration
of peers as well as link status change monitoring using out-of-band
LACP (Link Aggregation Control Protocol) messages. For further details of
the LACP specification see the IEEE 802.3ad/802.1AX standards. It is also
described here:
https://www.kernel.org/doc/Documentation/networking/bonding.txt.

In this implementation we have an array of mode 4 settings for each slave.
It is also assumed that every port has one aggregator (it might
be unused if a better one is found).

Difference between this implementation and the Linux implementation:
- this implementation is not directly based on state machines; instead, the current
  state is calculated from the actor and partner states (and other things too).

Some implementation details:
- during rx burst every packet is checked to see whether it is a LACP or marker packet.
  If it is a LACP frame, it is passed to the mode 4 logic using the slave's rx ring and
  removed from the rx buffer before it is returned.
- in tx burst, packets from mode 4 (if any) are injected into each slave.
- there is a timer running in the background to process mode 4 frames from the
  rx functions and produce mode 4 frames for the tx functions.
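
To illustrate the rx-side check described in the list above, here is a minimal,
self-contained C sketch. It is not the code from the patch: it works on flat
byte buffers instead of mbufs, assumes untagged frames, and the names
struct pkt, classify_frame() and split_burst() are made up for this example.
The ethertype 0x8809 and the subtypes 1 (LACP) and 2 (marker) are the slow
protocol values from the IEEE standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Slow protocols ethertype and subtypes (IEEE 802.3 / 802.1AX). */
#define SLOW_ETHER_TYPE     0x8809
#define SLOW_SUBTYPE_LACP   0x01
#define SLOW_SUBTYPE_MARKER 0x02

enum frame_kind { FRAME_DATA, FRAME_LACP, FRAME_MARKER };

/* Hypothetical stand-in for an mbuf: a pointer to the frame and its length. */
struct pkt {
	const uint8_t *data;
	size_t len;
};

/* Classify an untagged Ethernet frame by ethertype and slow protocol subtype. */
static enum frame_kind
classify_frame(const uint8_t *frame, size_t len)
{
	uint16_t ether_type;

	/* 14-byte Ethernet header plus the 1-byte subtype field. */
	if (len < 15)
		return FRAME_DATA;

	ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ether_type != SLOW_ETHER_TYPE)
		return FRAME_DATA;

	switch (frame[14]) {
	case SLOW_SUBTYPE_LACP:
		return FRAME_LACP;
	case SLOW_SUBTYPE_MARKER:
		return FRAME_MARKER;
	default:
		return FRAME_DATA;
	}
}

/*
 * Compact bufs[] in place so that only data frames remain, preserving order.
 * LACP and marker frames are moved to slow[], which must have room for nb
 * entries. Returns the number of data frames left in bufs[].
 */
static unsigned
split_burst(struct pkt *bufs, unsigned nb, struct pkt *slow, unsigned *nb_slow)
{
	unsigned i, kept = 0;

	*nb_slow = 0;
	for (i = 0; i < nb; i++) {
		if (classify_frame(bufs[i].data, bufs[i].len) == FRAME_DATA)
			bufs[kept++] = bufs[i];
		else
			slow[(*nb_slow)++] = bufs[i];
	}
	return kept;
}
```

In this sketch the caller would hand slow[] to the mode 4 logic and return
only the first "kept" entries of bufs[] to the application.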

Some requirements for this mode:
- for LACP mode to work, the rx and tx burst functions must be invoked
  at intervals of at most 100ms (testpmd is modified to satisfy this requirement)
- the buffer provided to rx burst should hold at least 2x the slave count. This is
  not required but might increase performance, especially during the initial
  handshake.
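
The 100ms requirement above could be tracked by an application roughly as
follows. This is only a sketch under stated assumptions: the struct and
function names are hypothetical, timestamps are plain milliseconds supplied by
the caller, and nothing here is taken from the testpmd changes in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Longest allowed gap between two rx/tx burst calls, as stated above. */
#define MODE4_MAX_IDLE_MS 100

/* Hypothetical per-port bookkeeping kept by the application. */
struct burst_timer {
	uint64_t last_burst_ms; /* time of the most recent burst call */
};

/* True if a burst call is due to keep mode 4 serviced. */
static bool
burst_overdue(const struct burst_timer *t, uint64_t now_ms)
{
	return now_ms - t->last_burst_ms >= MODE4_MAX_IDLE_MS;
}

/* Record that a burst call has just been made. */
static void
burst_done(struct burst_timer *t, uint64_t now_ms)
{
	t->last_burst_ms = now_ms;
}
```

A forwarding loop that has had no traffic would check burst_overdue() and, if
it returns true, issue an empty tx burst and call burst_done().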



Pawel Wodkowski (2):
  bond: add mode 4 support v6
  testpmd: add mode 4 support v6

 app/test-pmd/cmdline.c|   28 +-
 app/test-pmd/csumonly.c   |9 +
 app/test-pmd/icmpecho.c   |   21 +-
 app/test-pmd/iofwd.c  |9 +
 app/test-pmd/macfwd-retry.c   |9 +
 app/test-pmd/macfwd.c |9 +
 app/test-pmd/macswap.c|9 +
 app/test-pmd/testpmd.c|   50 +-
 app/test-pmd/testpmd.h|   11 +-
 lib/librte_ether/rte_ether.h  |1 +
 lib/librte_pmd_bond/Makefile  |2 +
 lib/librte_pmd_bond/rte_eth_bond.h|5 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c | 1216 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h |  214 
 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h |  308 ++
 lib/librte_pmd_bond/rte_eth_bond_api.c|   91 +-
 lib/librte_pmd_bond/rte_eth_bond_args.c   |1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c|  262 -
 lib/librte_pmd_bond/rte_eth_bond_private.h|   31 +-
 19 files changed, 2227 insertions(+), 59 deletions(-)
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad.h
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h

-- 
1.7.9.5



[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

2014-11-26 Thread Ananyev, Konstantin
Hi Helin,

> -Original Message-
> From: Zhang, Helin
> Sent: Wednesday, November 26, 2014 6:07 AM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Cao, Min; Ananyev, Konstantin; olivier.matz at 6wind.com; 
> Zhang, Helin
> Subject: [PATCH] i40e: Use one bit flag for all hardware detected RX packet 
> errors
> 
> There were some bit flags of 0 for RX packet errors detected by hardware.
> Actually only one bit of error flag is enough for all hardware detected
> RX packet errors.
> 
> Signed-off-by: Helin Zhang 
> ---
>  lib/librte_mbuf/rte_mbuf.h  |  6 +-
>  lib/librte_pmd_i40e/i40e_rxtx.c | 31 +++
>  2 files changed, 4 insertions(+), 33 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 5899e5c..897fd26 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -80,11 +80,6 @@ extern "C" {
>  #define PKT_RX_FDIR  (1ULL << 2)  /**< RX packet with FDIR match 
> indicate. */
>  #define PKT_RX_L4_CKSUM_BAD  (1ULL << 3)  /**< L4 cksum of RX pkt. is not 
> OK. */
>  #define PKT_RX_IP_CKSUM_BAD  (1ULL << 4)  /**< IP cksum of RX pkt. is not 
> OK. */
> -#define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header checksum 
> error. */
> -#define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt 
> oversize. */
> -#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
> -#define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
> -#define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */
>  #define PKT_RX_IPV4_HDR  (1ULL << 5)  /**< RX packet with IPv4 header. */
>  #define PKT_RX_IPV4_HDR_EXT  (1ULL << 6)  /**< RX packet with extended IPv4 
> header. */
>  #define PKT_RX_IPV6_HDR  (1ULL << 7)  /**< RX packet with IPv6 header. */
> @@ -95,6 +90,7 @@ extern "C" {
>  #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 
> header. */
>  #define PKT_RX_FDIR_ID   (1ULL << 13) /**< FD id reported if FDIR match. 
> */
>  #define PKT_RX_FDIR_FLX  (1ULL << 14) /**< Flexible bytes reported if 
> FDIR match. */
> +#define PKT_RX_ERR_HW(1ULL << 15) /**< RX packet error detected by 
> hardware. */
> 
>  #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN 
> packet. */
>  #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed 
> by NIC. */
> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
> index cce6911..3b2195d 100644
> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> @@ -115,35 +115,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword)
>  static inline uint64_t
>  i40e_rxd_error_to_pkt_flags(uint64_t qword)
>  {
> - uint64_t flags = 0;
> - uint64_t error_bits = (qword >> I40E_RXD_QW1_ERROR_SHIFT);
> -
> -#define I40E_RX_ERR_BITS 0x3f
> - if (likely((error_bits & I40E_RX_ERR_BITS) == 0))
> - return flags;
> - /* If RXE bit set, all other status bits are meaningless */
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RXE_SHIFT))) {
> - flags |= PKT_RX_MAC_ERR;
> - return flags;
> - }
> -
> - /* If RECIPE bit set, all other status indications should be ignored */
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_RECIPE_SHIFT))) {
> - flags |= PKT_RX_RECIP_ERR;
> - return flags;
> - }
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_HBO_SHIFT)))
> - flags |= PKT_RX_HBUF_OVERFLOW;
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_IPE_SHIFT)))
> - flags |= PKT_RX_IP_CKSUM_BAD;
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_L4E_SHIFT)))
> - flags |= PKT_RX_L4_CKSUM_BAD;
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_EIPE_SHIFT)))
> - flags |= PKT_RX_EIP_CKSUM_BAD;
> - if (unlikely(error_bits & (1 << I40E_RX_DESC_ERROR_OVERSIZE_SHIFT)))
> - flags |= PKT_RX_OVERSIZE;
> + if (unlikely(qword & I40E_RXD_QW1_ERROR_MASK))
> + return PKT_RX_ERR_HW;

I probably didn't explain myself clearly enough, sorry.
I didn't suggest getting rid of setting the bits that indicate L3/L4 checksum errors:
PKT_RX_IP_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, PKT_RX_EIP_CKSUM_BAD.
I think these flags should be set as before.

I was talking about collapsing only these 4 RX error flags into one:

#define PKT_RX_OVERSIZE  (0ULL << 0)  /**< Num of desc of an RX pkt 
oversize. */
#define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer overflow. */
#define PKT_RX_RECIP_ERR (0ULL << 0)  /**< Hardware processing error. */
#define PKT_RX_MAC_ERR   (0ULL << 0)  /**< MAC error. */ 

From my point of view the difference between these 2 groups is:
First - HW was able to receive the whole packet without a problem, but the L3/L4
checksum check failed.

Second - HW was not able to receive the whole packet properly, for whatever reason.
From upp
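
A rough C sketch of the split suggested in this reply: checksum errors keep
their own flags, while the four "packet was not received properly" errors
collapse into one. All bit positions and names below are assumptions made for
this example; they are not copied from the i40e headers or from rte_mbuf.h.

```c
#include <stdint.h>

/* Illustrative descriptor error bits (positions are assumptions). */
#define ERR_RXE      (1u << 0) /* MAC error */
#define ERR_RECIPE   (1u << 1) /* hardware processing error */
#define ERR_HBO      (1u << 2) /* header buffer overflow */
#define ERR_IPE      (1u << 3) /* IP checksum error */
#define ERR_L4E      (1u << 4) /* L4 checksum error */
#define ERR_EIPE     (1u << 5) /* external IP checksum error */
#define ERR_OVERSIZE (1u << 6) /* too many descriptors for one packet */

/* Illustrative packet flags (values are assumptions). */
#define FLAG_L4_CKSUM_BAD  (1ULL << 3)
#define FLAG_IP_CKSUM_BAD  (1ULL << 4)
#define FLAG_ERR_HW        (1ULL << 15)
#define FLAG_EIP_CKSUM_BAD (1ULL << 16)

static uint64_t
rx_error_to_flags(uint32_t err)
{
	uint64_t flags = 0;

	if (err == 0)
		return 0;

	/* With RXE or RECIPE set, the other error bits are not meaningful. */
	if (err & (ERR_RXE | ERR_RECIPE))
		return FLAG_ERR_HW;

	/* Second group: the packet itself was not received properly. */
	if (err & (ERR_HBO | ERR_OVERSIZE))
		flags |= FLAG_ERR_HW;

	/* First group: packet received fine, but a checksum check failed. */
	if (err & ERR_IPE)
		flags |= FLAG_IP_CKSUM_BAD;
	if (err & ERR_L4E)
		flags |= FLAG_L4_CKSUM_BAD;
	if (err & ERR_EIPE)
		flags |= FLAG_EIP_CKSUM_BAD;

	return flags;
}
```

With this shape an application can still tell a bad checksum apart from a
packet that the hardware could not receive correctly.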

[dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver

2014-11-26 Thread Walukiewicz, Miroslaw
Thank you for explanation. 

I still have a few questions regarding the setup flow:

1. Why do we need this step:
>   3. Setup a flow director rule to distribute packets with source ip
> > > > > >  0.0.0.0 to rxq No.0
> > > > > >   > ethtool -N eth0  flow-type udp4 src-ip 0.0.0.0 action 0


2. You presented the filter setup for receiving all udp4 packets on specific 
queue
> > > > > >   5. Setup a flow director rule to distribute packets with source ip
> > > > > >  1.1.1.1 to rxq No.32. This needs to be done after testpmd 
> > > > > > starts.
> > > > > >   > ethtool -N eth0 flow-type udp4 src-ip 1.1.1.1 action 32

How do I configure flow director to receive all packets with dst-ip = 1.1.1.1 on
qpair=32?
Will TCP SYN packets be caught by such a filter setup?

3. Is it possible to set up a rule like:
Forward all TCPv4 rx packets with dst-ip =1.1.1.1 and TCP port  to qpair=32 
including SYN packets?

3. In your application example, the qpair number (32) is known
before the application starts
> > > > > >   > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 \
> > > > > >   >  --vdev=rte_bifurc,iface=eth0,qpairs=1 -- \
> > > > > >   >  -i --rxfreet=32 --txfreet=32 --txrst=32

Is dynamic queue allocation possible? I am asking about the API.
I mean dynamically attaching and detaching queues at the application level, without
specifying the numbers on the command line.

4. Is it possible to create a rule with a perfect match that directs the
packets to a specific queue?
I mean here a rule like:
Forward all TCPv4 rx packets with dst-ip=1.1.1.1 src-ip=2.2.2.2 dst-port= 
src-port=1234 to queue 33

Regards,

Mirek

> -Original Message-
> From: Zhou, Danny
> Sent: Tuesday, November 25, 2014 4:23 PM
> To: Richardson, Bruce; Walukiewicz, Miroslaw
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, November 25, 2014 11:03 PM
> > To: Walukiewicz, Miroslaw
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
> >
> > On Tue, Nov 25, 2014 at 02:57:13PM +, Walukiewicz, Miroslaw wrote:
> > > Thank you Bruce for explanation of the idea.
> >
> > Actually, credit goes to Steve Liang, not me, for the explanation. :-)
> >
> > >
> > > I have question regarding TCP SYN packets? Do you have any idea how to
> share the TCP SYN requests between kernel and
> > user-space application?
> >
> > As I'm giving the credit to Steve, I'll also pass the buck for answering 
> > that
> > question to him too! :-)
> >
> > /Bruce
> 
> In ixgbe's Rx queuing flow, the SYN filter match stage comes before the Flow Director
> filter stage. When working in bifurcated driver support mode,
> DPDK cannot access the NIC registers except for the ones that are used to
> rx/tx packets on the assigned rx/tx queue pairs. So basically it really
> depends on the user to use ethtool or another interface to set up the SYN filter via
> the ixgbe bifurcated driver. The user can distribute TCP SYN packets to
> rx queues owned by the kernel bifurcated driver or to DPDK owned rx queues; in the
> latter case, DPDK can still push them back to the kernel via KNI if DPDK
> does not want to use them. If you have a user space TCP/IP stack on top of
> DPDK, you can push them to the upper level stack instead.
> 
> > >
> > > Regards,
> > >
> > > Mirek
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> Richardson
> > > > Sent: Tuesday, November 25, 2014 3:30 PM
> > > > To: Neil Horman
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated
> driver
> > > >
> > > > On Tue, Nov 25, 2014 at 09:23:16AM -0500, Neil Horman wrote:
> > > > > On Tue, Nov 25, 2014 at 10:11:16PM +0800, Cunming Liang wrote:
> > > > > >
> > > > > > This is a RFC patch set to support "bifurcated driver" in DPDK.
> > > > > >
> > > > > >
> > > > > > What is "bifurcated driver"?
> > > > > > ===
> > > > > >
> > > > > > The "bifurcated driver" stands for the kernel NIC driver that
> supports:
> > > > > >
> > > > > > 1. on-demand rx/tx queue pairs split-off and assignment to user
> space
> > > > > >
> > > > > > 2. direct NIC resource(e.g. rx/tx queue registers) access from user
> space
> > > > > >
> > > > > > 3. distributing packets to kernel or user space rx queues by
> > > > > >NIC's flow director according to the filter rules
> > > > > >
> > > > > > Here's the kernel patch set to support.
> > > > > > http://comments.gmane.org/gmane.linux.network/333615
> > > > > >
> > > > > >
> > > > > > Usage scenario
> > > > > > =
> > > > > >
> > > > > > It's well accepted by industry to use DPDK to process fast path
> packets in
> > > > > > user space in a high performance fashion, meanwhile processing
> slow
> > > > path
> > > > > > control 

[dpdk-dev] [PATCH 3/3] docs: update FreeBSD GSG to document ports install

2014-11-26 Thread Iremonger, Bernard
> -Original Message-
> Subject: [dpdk-dev] [PATCH 3/3] docs: update FreeBSD GSG to document ports 
> install
> 
> Since the DPDK is now part of the BSD ports collection, we should recommend 
> installing from ports as
> the best way to get it up and running.
> In order to achieve this, while still keeping the document readable, the 
> chapter on system
> requirements has been moved to instead be a section within the chapter on 
> compiling the DPDK
> outside of the ports collection. This move is necessary, since it covered a 
> lot of detail on installing other
> ports required to build DPDK. These steps are not needed when installing DPDK 
> itself from ports.
> 
> Signed-off-by: Bruce Richardson 

Acked-by: Bernard Iremonger 

 I have applied the patch to my tree next/dpdk-doc.



[dpdk-dev] [PATCH 2/3] docs: adjust line lengths in FreeBSD GSG rst files

2014-11-26 Thread Iremonger, Bernard
> -Original Message-
> Subject: [dpdk-dev] [PATCH 2/3] docs: adjust line lengths in FreeBSD GSG rst 
> files
> 
> The FreeBSD GSG rst files had very inconsistent line lengths for text within 
> paragraph blocks.
> Sometimes a line would be very short, while often lines would be quite long.
> This patch adjusts the formatting of the rst files so that lines break at 
> approx the 80-character mark, as
> is standard in the DPDK source code.
> 
> Signed-off-by: Bruce Richardson 

Acked-by: Bernard Iremonger 

 I have applied the patch to my tree next/dpdk-doc.



[dpdk-dev] [PATCH 1/3] docs: change hardcoded date to auto-generated

2014-11-26 Thread Iremonger, Bernard
> -Original Message-
> Subject: [dpdk-dev] [PATCH 1/3] docs: change hardcoded date to auto-generated
> 
> The index.html file for each of the "guide" docs had a hard-coded date value 
> in them of June 2014.
> Rather than update each of these for each revision, just use the |today| 
> directive to insert the date at
> which the document was generated.
> 
> Signed-off-by: Bruce Richardson 

Acked-by: Bernard Iremonger 

 I have applied the patch to my tree next/dpdk-doc.


[dpdk-dev] [PATCH v3 03/14] Add byte order operations for IBM Power architecture

2014-11-26 Thread Chao Zhu
Michael,

The default endianness of Power7/8 is big endian, so I set big endian in
the configuration file. To use little endian, just change the
configuration file. Of course, there are ways to determine the endianness
at run time. However, the original DPDK didn't do this. I think this
can be improved later.
About your second question, Power7 can support little endian, but it is
an emulated one, not a CPU hardware feature. Also, there is no official
little endian support for Power7. So I marked Power7 as supporting big
endian only.
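
For reference, one common way to determine the endianness at run time, as
mentioned above, is to look at the first byte of a known multi-byte value.
The sketch below is a generic illustration, not part of this patch set; the
function names are made up and it does not use the RTE_ARCH_BIG_ENDIAN
setting.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big endian host, 0 on a little endian host. */
static int
host_is_big_endian(void)
{
	const uint16_t probe = 0x0102;
	uint8_t first;

	/* Read the byte stored at the lowest address of the probe value. */
	memcpy(&first, &probe, 1);
	return first == 0x01;
}

/* Plain C byte swap of a 32-bit value. */
static uint32_t
sketch_bswap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Convert a 32-bit value from CPU order to big endian (network) order. */
static uint32_t
sketch_cpu_to_be32(uint32_t x)
{
	return host_is_big_endian() ? x : sketch_bswap32(x);
}
```

A compile-time configuration option avoids the run-time branch, which is
presumably why the config file setting is used here.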

On 2014/11/24 16:11, Qiu, Michael wrote:
> On 11/23/2014 9:22 PM, Chao Zhu wrote:
>> This patch adds architecture specific byte order operations for IBM Power
>> architecture. Power architecture supports both big endian and little
>> endian. This patch also adds a RTE_ARCH_BIG_ENDIAN macro.
>>
>> Signed-off-by: Chao Zhu 
>> ---
>>   config/defconfig_ppc_64-power8-linuxapp-gcc|1 +
>>   .../common/include/arch/ppc_64/rte_byteorder.h |  150 
>> 
>>   2 files changed, 151 insertions(+), 0 deletions(-)
>>   create mode 100644 
>> lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>>
>> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc 
>> b/config/defconfig_ppc_64-power8-linuxapp-gcc
>> index 97d72ff..b10f60c 100644
>> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
>> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
>> @@ -34,6 +34,7 @@ CONFIG_RTE_MACHINE="power8"
>>   
>>   CONFIG_RTE_ARCH="ppc_64"
>>   CONFIG_RTE_ARCH_PPC_64=y
>> +CONFIG_RTE_ARCH_BIG_ENDIAN=y
> Does this mean the default is big endian, and if I run it in little endian
> mode, I need to change it manually?
>>   
>>   CONFIG_RTE_TOOLCHAIN="gcc"
>>   CONFIG_RTE_TOOLCHAIN_GCC=y
>> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h 
>> b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>> new file mode 100644
>> index 000..a593e8a
>> --- /dev/null
>> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>> @@ -0,0 +1,150 @@
>> +/*
>> + *   BSD LICENSE
>> + *
>> + *   Copyright (C) IBM Corporation 2014.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + * * Redistributions of source code must retain the above copyright
>> + *   notice, this list of conditions and the following disclaimer.
>> + * * Redistributions in binary form must reproduce the above copyright
>> + *   notice, this list of conditions and the following disclaimer in
>> + *   the documentation and/or other materials provided with the
>> + *   distribution.
>> + * * Neither the name of IBM Corporation nor the names of its
>> + *   contributors may be used to endorse or promote products derived
>> + *   from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +*/
>> +
>> +/* Inspired from FreeBSD src/sys/powerpc/include/endian.h
>> + * Copyright (c) 1987, 1991, 1993
>> + * The Regents of the University of California.  All rights reserved.
>> +*/
>> +
>> +#ifndef _RTE_BYTEORDER_PPC_64_H_
>> +#define _RTE_BYTEORDER_PPC_64_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include "generic/rte_byteorder.h"
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 16-bit value.
>> + *
>> + * Do not use this function directly. The preferred function is 
>> rte_bswap16().
>> + */
>> +static inline uint16_t rte_arch_bswap16(uint16_t _x)
>> +{
>> +return ((_x >> 8) | ((_x << 8) & 0xff00));
>> +}
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 32-bit value.
>> + *
>> + * Do not use this function directly. The preferred function is 
>> rte_bswap32().
>> + */
>> +static inline uint32_t rte_arch_bswap32(uint32_t _x)
>> +{
>> +return ((_x >> 24) | ((_x >> 8) & 0xff00) | ((_x << 8) & 0xff0000) |
>> +((_x << 24) & 0xff000000));
>> +}
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 64-bit value.
>> + *
>> +  * Do not use this function directly. The preferred function is 
>> rte_bswap64().
>>

[dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net device

2014-11-26 Thread Bruce Richardson
On Wed, Nov 26, 2014 at 08:22:05AM +, Liang, Cunming wrote:
> Thanks for Bruce's valuable comments.
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Tuesday, November 25, 2014 11:01 PM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net 
> > device
> > 
> > On Tue, Nov 25, 2014 at 02:48:51PM +, Liang, Cunming wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Richardson, Bruce
> > > > Sent: Tuesday, November 25, 2014 10:34 PM
> > > > To: Liang, Cunming
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net
> > device
> > > >
> > > > On Tue, Nov 25, 2014 at 10:11:22PM +0800, Cunming Liang wrote:
> > > > > Signed-off-by: Cunming Liang 
> > > > > ---
> > > > >  lib/librte_pmd_ixgbe/Makefile  |  13 +-
> > > > >  lib/librte_pmd_ixgbe/ixgbe_bifurcate.c | 303
> > > > +
> > > > >  lib/librte_pmd_ixgbe/ixgbe_bifurcate.h |  57 +++
> > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c  |  40 -
> > > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.h  |  10 ++
> > > > >  5 files changed, 415 insertions(+), 8 deletions(-)
> > > > >  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.c
> > > > >  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.h
> > > > >
> > > >
> > > > These changes are the ones that I'm not too sure about. I'd prefer if 
> > > > all
> > > > material for the bifurcated driver be kept within the librte_pmd_bifurc
> > directory.
> > > [Liang, Cunming] I haven't a librte_pmd_bifurc library.
> > > So far the purpose of librte_bifurc is for device scan, not used as a pmd.
> > > During driver probe, depend on device id, it asks for correct pmd from
> > 'librte_pmd_ixgbe, librte_pmd_i40e'.
> > >
> > > > Is it possible to leave ixgbe largely unmodified and simply have the new
> > > > bifurcated driver pull in the needed ixgbe (and later i40e) functions at
> > > > compile time i.e. refer from one Makefile to the sources in the other
> > > > driver's directory?
> > > [Liang, Cunming] Nice point. If we have single directory gathering all 
> > > direct ring
> > access.
> > > e.g. We have aka "librte_pmd_bifurc", inside it, we'll have bifurc_ixgbe,
> > bifurc_i40e, ...
> > > Each of them still depend on other libraries like
> > librte_pmd_ixgbe/librte_pmd_i40e.
> > > We may remove the internal dependence inside one pmd driver, but between
> > libraries we add more.
> > 
> > I'm not sure about all that. Two points:
> > 
> > * Why would we need separate subdirectories within the bifurcated driver
> > directory?
> > The *only* thing that is different between an implementation of ixgbe and 
> > i40e
> > to
> > use the bifurcated driver infrastructure is the code to map between NIC
> > descriptors
> > and rte_mbufs. All the other code would be identical as far as I can work 
> > out. So
> > the
> > only two routines that differ are going to be the rx_burst and tx_burst 
> > functions.
> [Liang, Cunming] Not really. If not using the fake page, we need to provide 
> init/start/stop case by case.
> > So why not just pull in those two specific functions (or sets of functions) 
> > from
> > their respective drivers, and keep the rest of the codebase common? 
> 
> [Liang, Cunming] I'm not sure all the rest of the codebase can be common.
> For rx/tx or queue_setup, we know it can; we already do it in xxx_rxtx.c. For 
> other ops, it may not.
> Even for the part where we can, if we provide such a common method template, it looks 
> like we still need to register 'ops'
> (e.g. xxx_init_shared_code, xxx_dev_tx/rx_init, xxx_dev_rxtx_start). They're 
> not part of eth_dev_ops.
> If we consider more, like enabling all the other DPDK ethdev APIs (by using ioctl 
> like ethtool does),
> this message wrapping and translation is definitely something to put into such 
> common code.
> 
> So I agree with the idea of putting more common methods into librte_bifurc.
> But I don't think it's good to make it a common PMD driver.
> I still prefer ixgbe_bifurc.c in librte_pmd_ixgbe as an independent driver.
> As for a common codebase, the common rxtx stuff is already done in xxx_rxtx.c.
> Other common methods are provided by librte_bifurc, to be used by each specific PMD.
> 
> > simpler than having the ixgbe driver having to be aware of whether it's 
> > operating
> > in bifurcated mode or uio/vfio/nic_uio mode, to check what operations are
> > supported
> > or not.
> [Liang, Cunming] If you go through the code, you'll find it's not the ixgbe 
> driver that is aware of these modes.
> We already have the ixgbe driver and the ixgbevf driver, and now have the ixgbe_bifurc 
> driver, that's it.
> BTW ideally, it would be better for ixgbe and ixgbevf to be in self-contained .c files; now 
> all are in ixgbe_ethdev.c
> Ixgbe_bifurc has weaker NIC control than ixgbevf; both mainly focus on 
> rx and tx.
> Ixgbe has full HW control, ixgbevf has limited HW control, ixgbe_bifurc has no HW 
> control.
> All of them has the sam

[dpdk-dev] [PATCH v5 00/14] Patches for DPDK to support Power architecture

2014-11-26 Thread David Marchand
On Tue, Nov 25, 2014 at 11:17 PM, Chao Zhu 
wrote:

> The set of patches add IBM Power architecture to the DPDK. It adds the
> required support to the EAL library. This set of patches doesn't support
> full DPDK function on Power processors. To compile on PPC64 architecture,
> GCC version >= 4.8 must be used. According to Bruce and Neil's comments,
> this v5 patch removed the common configuration files of Powerpc in v4.
> Also, it fixed the checkpatch issues in v3.
>
> The only unsolved checkpatch issue is :
> ERROR: space prohibited before open square bracket '['
>
> This issue refers to the asm code input/output naming. But I think the
> error is invalid.
>
>
I must admit that the architecture abstraction in dpdk is not fully done,
but it is more a "core" problem than a problem of this port itself.
I am pretty sure that there are still a lot of places in dpdk that rely on
the fact that they were written for the x86 architecture.

Neil's concerns on cpuflags (
http://dpdk.org/ml/archives/dev/2014-November/008769.html) are valid but I
think we could go with an incremental approach.
Since Chao is committed to adding power support to dpdk, we can have this fixed
in subsequent patches with this patchset in 1.8.


So, this patchset looks good enough to me.
Acked-by: David Marchand 


-- 
David Marchand


[dpdk-dev] [PATCH v3 08/13] testpmd: rework csum forward engine

2014-11-26 Thread Ananyev, Konstantin
Hi Olivier,

> -Original Message-
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Thursday, November 20, 2014 10:59 PM
> To: dev at dpdk.org
> Cc: olivier.matz at 6wind.com; Walukiewicz, Miroslaw; Liu, Jijiang; Liu, 
> Yong; jigsaw at gmail.com; Richardson, Bruce; Ananyev, Konstantin
> Subject: [PATCH v3 08/13] testpmd: rework csum forward engine
> 
> The csum forward engine was becoming too complex to be used and
> extended (the next commits want to add the support of TSO):
> 
> - no explanation about what the code does
> - code is not factorized, lots of code duplicated, especially between
>   ipv4/ipv6
> - user command line api: use of bitmasks that need to be calculated by
>   the user
> - the user flags don't have the same semantic:
>   - for legacy IP/UDP/TCP/SCTP, it selects software or hardware checksum
>   - for other (vxlan), it selects between hardware checksum or no
> checksum
> - the code relies too much on flags set by the driver without software
>   alternative (ex: PKT_RX_TUNNEL_IPV4_HDR). It is nice to be able to
>   compare a software implementation with the hardware offload.
> 
> This commit tries to fix these issues, and provide a simple definition
> of what is done by the forward engine:
> 
>  * Receive a burst of packets, and for supported packet types:
>  *  - modify the IPs
>  *  - reprocess the checksum in SW or HW, depending on testpmd command line
>  *configuration
>  * Then packets are transmitted on the output port.
>  *
>  * Supported packets are:
>  *   Ether / (vlan) / IP|IP6 / UDP|TCP|SCTP .
>  *   Ether / (vlan) / IP|IP6 / UDP / VxLAN / Ether / IP|IP6 / UDP|TCP|SCTP
>  *
>  * The network parser supposes that the packet is contiguous, which may
>  * not be the case in real life.

As far as I can see, you removed the code that sets the TX_PKT_IPV4 and TX_PKT_IPV6
bits of ol_flags.
I think that we need to keep it.
The reason for that is:
With FVL, to make HW TX checksum offload work, SW is responsible for providing
the HW with information about the L3 header.
Possible values are:
- IPv4 hdr with HW checksum calculation
- IPv4 hdr (checksum done by SW)
- IPv6 hdr
- unknown
So, let's say, for the packet: ETHER_HDR/IPV6_HDR/TCP_HDR/DATA
To request HW TCP checksum offload, SW has to tell the HW that
it is a packet with an IPv6 header
(plus, as for ixgbe: l2_hdr_len, l3_hdr_len, l4_type, l4_hdr_len).
That's why TX_PKT_IPV4 and TX_PKT_IPV6 were introduced.
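
To make the four cases above concrete, here is a small C sketch of how an
application could pick the L3 header type it reports for a TX packet. The enum,
struct and function names are hypothetical and exist only for this example;
they are not the mbuf flags or any FVL driver structure. Only the ethertype
values 0x0800 (IPv4) and 0x86DD (IPv6) are standard.

```c
#include <stdint.h>

#define ETYPE_IPV4 0x0800
#define ETYPE_IPV6 0x86DD

/* The four L3 header descriptions listed above. */
enum tx_l3_type {
	TX_L3_UNKNOWN,
	TX_L3_IPV4_HW_CKSUM, /* IPv4, IP checksum computed by HW */
	TX_L3_IPV4_SW_CKSUM, /* IPv4, IP checksum already done by SW */
	TX_L3_IPV6
};

/* Hypothetical per-packet hint handed to the TX path. */
struct tx_offload_hint {
	enum tx_l3_type l3_type;
	uint16_t l2_len; /* length of the Ethernet (and VLAN) header */
	uint16_t l3_len; /* length of the IP header */
};

static enum tx_l3_type
select_l3_type(uint16_t ether_type, int hw_ip_cksum)
{
	switch (ether_type) {
	case ETYPE_IPV4:
		return hw_ip_cksum ? TX_L3_IPV4_HW_CKSUM : TX_L3_IPV4_SW_CKSUM;
	case ETYPE_IPV6:
		/* IPv6 has no header checksum, so there is a single case. */
		return TX_L3_IPV6;
	default:
		return TX_L3_UNKNOWN;
	}
}

static struct tx_offload_hint
make_hint(uint16_t ether_type, int hw_ip_cksum, uint16_t l2_len,
	  uint16_t l3_len)
{
	struct tx_offload_hint h;

	h.l3_type = select_l3_type(ether_type, hw_ip_cksum);
	h.l2_len = l2_len;
	h.l3_len = l3_len;
	return h;
}
```

For the ETHER_HDR/IPV6_HDR/TCP_HDR/DATA example, make_hint(0x86DD, 0, 14, 40)
would report an IPv6 header with a 14-byte L2 and a 40-byte L3 header.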

Yes, it is a change in the public API for HW TX offload, but I don't see any other
way we can overcome it
(apart from making the TX function itself parse a packet, which is obviously not a
good choice).
Note that existing apps working on existing HW (ixgbe/igb/em) are not affected.
Though apps that are supposed to run on FVL HW too have to follow the new convention.

So I suggest we keep setting these flags in csumonly.c

Apart from that, the patch looks good to me.
And yes, we would need to change the way we handle TX offload for
tunnelled packets.
Konstantin

> 
> Signed-off-by: Olivier Matz 
> ---
>  app/test-pmd/cmdline.c  | 156 ---
>  app/test-pmd/config.c   |  13 +-
>  app/test-pmd/csumonly.c | 676 
> ++--
>  app/test-pmd/testpmd.h  |  17 +-
>  4 files changed, 437 insertions(+), 425 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 4c3fc76..61e4340 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -310,19 +310,19 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "Disable hardware insertion of a VLAN header in"
>   " packets sent on a port.\n\n"
> 
> - "tx_checksum set (mask) (port_id)\n"
> - "Enable hardware insertion of checksum offload with"
> - " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
> - "bit 0 - insert ip   checksum offload if set\n"
> - "bit 1 - insert udp  checksum offload if set\n"
> - "bit 2 - insert tcp  checksum offload if set\n"
> - "bit 3 - insert sctp checksum offload if set\n"
> - "bit 4 - insert inner ip  checksum offload if set\n"
> - "bit 5 - insert inner udp checksum offload if set\n"
> - "bit 6 - insert inner tcp checksum offload if set\n"
> - "bit 7 - insert inner sctp checksum offload if set\n"
> + "tx_cksum set (ip|udp|tcp|sctp|vxlan) (hw|sw) (port_id)\n"
> + "Select hardware or software calculation of the"
> + " checksum with when transmitting a packet using the"
> + " csum forward engine.\n"
> + "ip|udp|tcp|sctp always concern the inner layer.\n"
> + "vxlan concerns 

[dpdk-dev] Re: Re: [PATCH] eal: map uio resources after hugepages when the base_virtaddr is configured.

2014-11-26 Thread Burakov, Anatoly
> This is a known issue and has not been solved yet. The root cause is
> clear: the new process tries to map an address that is already in use.
> 
> BTW, you should learn how to prepare a patch: commit log, Signed-off-by,
> etc.

Hi Michael,

As far as I know, the patch that fixes this issue was integrated yesterday 
(thanks Thomas!), and Liang Xu verified it to be working.

Thanks,
Anatoly


[dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net device

2014-11-26 Thread Liang, Cunming
Thanks for the valuable comments, Bruce.

> -Original Message-
> From: Richardson, Bruce
> Sent: Tuesday, November 25, 2014 11:01 PM
> To: Liang, Cunming
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net device
> 
> On Tue, Nov 25, 2014 at 02:48:51PM +, Liang, Cunming wrote:
> >
> >
> > > -Original Message-
> > > From: Richardson, Bruce
> > > Sent: Tuesday, November 25, 2014 10:34 PM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC PATCH 6/6] ixgbe: PMD for bifurc ixgbe net
> device
> > >
> > > On Tue, Nov 25, 2014 at 10:11:22PM +0800, Cunming Liang wrote:
> > > > Signed-off-by: Cunming Liang 
> > > > ---
> > > >  lib/librte_pmd_ixgbe/Makefile  |  13 +-
> > > >  lib/librte_pmd_ixgbe/ixgbe_bifurcate.c | 303
> > > +
> > > >  lib/librte_pmd_ixgbe/ixgbe_bifurcate.h |  57 +++
> > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c  |  40 -
> > > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.h  |  10 ++
> > > >  5 files changed, 415 insertions(+), 8 deletions(-)
> > > >  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.c
> > > >  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.h
> > > >
> > >
> > > These changes are the ones that I'm not too sure about. I'd prefer if all
> > > material for the bifurcated driver be kept within the librte_pmd_bifurc
> directory.
> > [Liang, Cunming] I haven't a librte_pmd_bifurc library.
> > So far the purpose of librte_bifurc is for device scan, not used as a pmd.
> > During driver probe, depend on device id, it asks for correct pmd from
> 'librte_pmd_ixgbe, librte_pmd_i40e'.
> >
> > > Is it possible to leave ixgbe largely unmodified and simply have the new
> > > bifurcated driver pull in the needed ixgbe (and later i40e) functions at
> > > compile time i.e. refer from one Makefile to the sources in the other
> > > driver's directory?
> > [Liang, Cunming] Nice point. If we have single directory gathering all 
> > direct ring
> access.
> > e.g. We have aka "librte_pmd_bifurc", inside it, we'll have bifurc_ixgbe,
> bifurc_i40e, ...
> > Each of them still depend on other libraries like
> librte_pmd_ixgbe/librte_pmd_i40e.
> > We may remove the internal dependence inside one pmd driver, but between
> libraries we add more.
> 
> I'm not sure about all that. Two points:
> 
> * Why would we need separate subdirectories within the bifurcated driver
> directory?
> The *only* thing that is different between an implementation of ixgbe and i40e
> to
> use the bifurcated driver infrastructure is the code to map between NIC
> descriptors
> and rte_mbufs. All the other code would be identical as far as I can work 
> out. So
> the
> only two routines that differ are going to be the rx_burst and tx_burst 
> functions.
[Liang, Cunming] Not really. If we don't use the fake page, we need to provide
init/start/stop case by case.
> So why not just pull in those two specific functions (or sets of functions) 
> from
> their respective drivers, and keep the rest of the codebase common? 

[Liang, Cunming] I'm not sure all the rest of the codebase can be common.
For rx/tx and queue_setup we know it can; we already do that in xxx_rxtx.c. For
other ops, maybe not.
Even for the part that can be common, if we provide such a common method
template, it looks like we still need to register 'ops'
(e.g. xxx_init_shared_code, xxx_dev_tx/rx_init, xxx_dev_rxtx_start). They're not
part of eth_dev_ops.
If we also consider enabling all the other DPDK ethdev APIs (by using ioctl, as
ethtool does), that message wrapping and translation definitely belongs in
such common code.

So I agree with the idea of putting more common methods into librte_bifurc.
But I don't think it's good to make it a common PMD driver.
I still prefer ixgbe_bifurc.c in librte_pmd_ixgbe as an independent driver.
As for a common codebase, the common rx/tx stuff is already done in xxx_rxtx.c.
Other common methods are provided by librte_bifurc and used by each specific PMD.

> simpler than having the ixgbe driver having to be aware of whether it's 
> operating
> in bifurcated mode or uio/vfio/nic_uio mode, to check what operations are
> supported
> or not.
[Liang, Cunming] If you go through the code, you'll find that the ixgbe driver
is not aware of these modes.
We already have the ixgbe driver and the ixgbevf driver, and now we have the
ixgbe_bifurc driver; that's it.
BTW, ideally ixgbe and ixgbevf would each be in a self-contained .c file; now
both are in ixgbe_ethdev.c.
ixgbe_bifurc has weaker NIC control than ixgbevf; both mainly focus on rx
and tx.
ixgbe has full HW control, ixgbevf has limited HW control, ixgbe_bifurc has no
HW control.
All of them have the same capability to do rx and tx.
From this point of view, it makes sense to have such a standalone driver.
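
To illustrate the split described above, here is a rough sketch under my own
assumptions: the ops table, its fields and all sk_* names are hypothetical and
are not DPDK's eth_dev_ops. Three standalone drivers share the same rx/tx
routines and differ only in how many control operations they register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table; field and function names are illustrative only
 * and do not correspond to DPDK's eth_dev_ops. */
struct sk_drv_ops {
	int (*hw_reset)(void);              /* needs full HW control */
	int (*set_mac)(const uint8_t *mac); /* possible with limited control */
	uint16_t (*rx_burst)(void **pkts, uint16_t n);
	uint16_t (*tx_burst)(void **pkts, uint16_t n);
};

static int sk_hw_reset(void) { return 0; }
static int sk_set_mac(const uint8_t *mac) { return mac != NULL ? 0 : -1; }

/* Stub data path: receives nothing, pretends to send everything. */
static uint16_t sk_rx_burst(void **pkts, uint16_t n)
{
	(void)pkts;
	(void)n;
	return 0;
}

static uint16_t sk_tx_burst(void **pkts, uint16_t n)
{
	(void)pkts;
	return n;
}

/* Full, limited and no HW control; the data path is identical in all three. */
static const struct sk_drv_ops sk_pf_ops = {
	sk_hw_reset, sk_set_mac, sk_rx_burst, sk_tx_burst
};
static const struct sk_drv_ops sk_vf_ops = {
	NULL, sk_set_mac, sk_rx_burst, sk_tx_burst
};
static const struct sk_drv_ops sk_bifurc_ops = {
	NULL, NULL, sk_rx_burst, sk_tx_burst
};

/* Count how many control operations a driver exposes. */
static int sk_ctrl_op_count(const struct sk_drv_ops *ops)
{
	return (ops->hw_reset != NULL) + (ops->set_mac != NULL);
}
```

With this shape a caller checks a pointer for NULL instead of asking the
driver which mode it is running in.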
> 
> * It's not really an inter-library dependency - or at least not a hugely 
> problematic
> one to my mind. With my proposal there is no need for the ixgbe or i40e 
> drivers
> to
> be compiled up f

[dpdk-dev] [PATCH v5 6/6] enicpmd: DPDK changes for accommodating ENIC PMD

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 config/common_linuxapp | 5 +
 lib/Makefile   | 1 +
 mk/rte.app.mk  | 4 
 3 files changed, 10 insertions(+)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 86a0d15..542fff2 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -210,6 +210,11 @@ CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
 CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1

 #
+# Compile burst-oriented Cisco ENIC PMD driver
+#
+CONFIG_RTE_LIBRTE_ENIC_PMD=y
+
+#
 # Compile burst-oriented VIRTIO PMD driver
 #
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/lib/Makefile b/lib/Makefile
index 204ef11..df17d78 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -43,6 +43,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += librte_cmdline
 DIRS-$(CONFIG_RTE_LIBRTE_ETHER) += librte_ether
 DIRS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += librte_pmd_e1000
 DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += librte_pmd_ixgbe
+DIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += librte_pmd_enic
 DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 59468b0..bef823b 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -186,6 +186,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_VMXNET3_PMD),y)
 LDLIBS += -lrte_pmd_vmxnet3_uio
 endif

+ifeq ($(CONFIG_RTE_LIBRTE_ENIC_PMD),y)
+LDLIBS += -lrte_pmd_enic
+endif
+
 ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_PMD),y)
 LDLIBS += -lrte_pmd_virtio_uio
 endif
-- 
1.9.1



[dpdk-dev] [PATCH v5 5/6] enicpmd: DPDK-ENIC PMD interface

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 lib/librte_pmd_enic/enic_etherdev.c | 613 
 1 file changed, 613 insertions(+)
 create mode 100644 lib/librte_pmd_enic/enic_etherdev.c

diff --git a/lib/librte_pmd_enic/enic_etherdev.c b/lib/librte_pmd_enic/enic_etherdev.c
new file mode 100644
index 000..441c85a
--- /dev/null
+++ b/lib/librte_pmd_enic/enic_etherdev.c
@@ -0,0 +1,613 @@
+/*
+ * Copyright 2008-2014 Cisco Systems, Inc.  All rights reserved.
+ * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
+ *
+ * Copyright (c) 2014, Cisco Systems, Inc. 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#ident "$Id$"
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "vnic_intr.h"
+#include "vnic_cq.h"
+#include "vnic_wq.h"
+#include "vnic_rq.h"
+#include "vnic_enet.h"
+#include "enic.h"
+
+#define ENICPMD_FUNC_TRACE() \
+RTE_LOG(DEBUG, PMD, "ENICPMD trace: %s\n", __func__)
+
+/*
+ * The set of PCI devices this driver supports
+ */
+static struct rte_pci_id pci_id_enic_map[] = {
+#define RTE_PCI_DEV_ID_DECL_ENIC(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
+#ifndef PCI_VENDOR_ID_CISCO
+#define PCI_VENDOR_ID_CISCO 0x1137
+#endif
+#include "rte_pci_dev_ids.h"
+RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET)
+RTE_PCI_DEV_ID_DECL_ENIC(PCI_VENDOR_ID_CISCO, PCI_DEVICE_ID_CISCO_VIC_ENET_VF)
+{.vendor_id = 0, /* Sentinel */},
+};
+
+static int enicpmd_fdir_remove_perfect_filter(struct rte_eth_dev *eth_dev,
+   struct rte_fdir_filter *fdir_filter,
+   uint16_t soft_id)
+{
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   return enic_fdir_del_fltr(enic, fdir_filter);
+}
+
+static int enicpmd_fdir_add_perfect_filter(struct rte_eth_dev *eth_dev,
+   struct rte_fdir_filter *fdir_filter, uint16_t soft_id,
+   uint8_t queue, uint8_t drop)
+{
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   return enic_fdir_add_fltr(enic, fdir_filter, (uint16_t)queue, drop);
+}
+
+static void enicpmd_fdir_info_get(struct rte_eth_dev *eth_dev,
+   struct rte_eth_fdir *fdir)
+{
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   *fdir = enic->fdir.stats;
+}
+
+static void enicpmd_dev_tx_queue_release(void *txq)
+{
+   ENICPMD_FUNC_TRACE();
+   enic_free_wq(txq);
+}
+
+static int enicpmd_dev_setup_intr(struct enic *enic)
+{
+   int ret;
+   int index;
+
+   ENICPMD_FUNC_TRACE();
+
+   /* Are we done with the init of all the queues? */
+   for (index = 0; index < enic->cq_count; index++) {
+   if (!enic->cq[index].ctrl)
+   break;
+   }
+
+   if (enic->cq_count != index)
+   return 0;
+
+   ret = enic_alloc_intr_resources(enic);
+   if (ret) {
+   dev_err(enic, "alloc intr failed\n");
+   return ret;
+   }
+   enic_init_vnic_resources(enic);
+
+   ret = enic_setup_finish(enic);
+   if (ret)
+   dev_err(enic, "setup could not be finished\n");
+
+   return ret;
+}
+
+static int enicpmd_dev_tx_queue_setup(struct rte_eth_dev *eth_dev,
+   uint16_t queue_idx,
+   uint16_t nb_desc,
+   unsigned int socket_id,
+   const struct rte_eth_txconf *tx_conf)
+{
+   int ret;
+   struct enic *enic = pmd_priv(eth_dev);
+
+   ENICPMD_FUNC_TRACE();
+   eth_dev->data->tx_queues[queue_idx] = (void *)&enic->wq[queue_idx];
+
+   ret = enic_alloc_wq(enic, queue_idx, socket_id, nb_desc);
+   if (ret) {
+   dev_err(enic, "error in allocating wq\n");

[dpdk-dev] [PATCH v5 4/6] enicpmd: pmd specific code

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 lib/librte_pmd_enic/enic.h|  157 +
 lib/librte_pmd_enic/enic_clsf.c   |  244 +++
 lib/librte_pmd_enic/enic_compat.h |  142 +
 lib/librte_pmd_enic/enic_main.c   | 1266 +
 lib/librte_pmd_enic/enic_res.c|  221 +++
 lib/librte_pmd_enic/enic_res.h|  168 +
 6 files changed, 2198 insertions(+)
 create mode 100644 lib/librte_pmd_enic/enic.h
 create mode 100644 lib/librte_pmd_enic/enic_clsf.c
 create mode 100644 lib/librte_pmd_enic/enic_compat.h
 create mode 100644 lib/librte_pmd_enic/enic_main.c
 create mode 100644 lib/librte_pmd_enic/enic_res.c
 create mode 100644 lib/librte_pmd_enic/enic_res.h

diff --git a/lib/librte_pmd_enic/enic.h b/lib/librte_pmd_enic/enic.h
new file mode 100644
index 000..5041dd1
--- /dev/null
+++ b/lib/librte_pmd_enic/enic.h
@@ -0,0 +1,157 @@
+/*
+ * Copyright 2008-2014 Cisco Systems, Inc.  All rights reserved.
+ * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
+ *
+ * Copyright (c) 2014, Cisco Systems, Inc. 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#ident "$Id$"
+
+#ifndef _ENIC_H_
+#define _ENIC_H_
+
+#include "vnic_enet.h"
+#include "vnic_dev.h"
+#include "vnic_wq.h"
+#include "vnic_rq.h"
+#include "vnic_cq.h"
+#include "vnic_intr.h"
+#include "vnic_stats.h"
+#include "vnic_nic.h"
+#include "vnic_rss.h"
+#include "enic_res.h"
+
+#define DRV_NAME         "enic_pmd"
+#define DRV_DESCRIPTION  "Cisco VIC Ethernet NIC Poll-mode Driver"
+#define DRV_VERSION      "1.0.0.4"
+#define DRV_COPYRIGHT    "Copyright 2008-2014 Cisco Systems, Inc"
+
+#define ENIC_WQ_MAX      8
+#define ENIC_RQ_MAX      8
+#define ENIC_CQ_MAX      (ENIC_WQ_MAX + ENIC_RQ_MAX)
+#define ENIC_INTR_MAX    (ENIC_CQ_MAX + 2)
+
+#define VLAN_ETH_HLEN   18
+
+#define ENICPMD_SETTING(enic, f) ((enic->config.flags & VENETF_##f) ? 1 : 0)
+
+#define ENICPMD_BDF_LENGTH  13   /* :00:00.0'\0' */
+#define PKT_TX_TCP_UDP_CKSUM0x6000
+#define ENIC_CALC_IP_CKSUM  1
+#define ENIC_CALC_TCP_UDP_CKSUM 2
+#define ENIC_MAX_MTU9000
+#define PAGE_SIZE   4096
+#define PAGE_ROUND_UP(x) \
+   ((((unsigned long)(x)) + PAGE_SIZE-1) & (~(PAGE_SIZE-1)))
+
+#define ENICPMD_VFIO_PATH  "/dev/vfio/vfio"
+/*#define ENIC_DESC_COUNT_MAKE_ODD (x) do{if ((~(x)) & 1) { (x)--; } }while(0)*/
+
+#define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic */
+#define PCI_DEVICE_ID_CISCO_VIC_ENET_VF  0x0071  /* enet SRIOV VF */
+
+
+#define ENICPMD_FDIR_MAX   64
+
+struct enic_fdir_node {
+   struct rte_fdir_filter filter;
+   u16 fltr_id;
+   u16 rq_index;
+};
+
+struct enic_fdir {
+   struct rte_eth_fdir stats;
+   struct rte_hash *hash;
+   struct enic_fdir_node *nodes[ENICPMD_FDIR_MAX];
+};
+
+/* Per-instance private data structure */
+struct enic {
+   struct enic *next;
+   struct rte_pci_device *pdev;
+   struct vnic_enet_config config;
+   struct vnic_dev_bar bar0;
+   struct vnic_dev *vdev;
+
+   struct rte_eth_dev *rte_dev;
+   struct enic_fdir fdir;
+   char bdf_name[ENICPMD_BDF_LENGTH];
+   int dev_fd;
+   int iommu_group_fd;
+   int iommu_groupid;
+   int eventfd;
+   u_int8_t mac_addr[ETH_ALEN];
+   pthread_t err_intr_thread;
+   int promisc;
+   int allmulti;
+   int ig_vlan_strip_en;
+   int link_status;
+   u8 hw_ip_checksum;
+
+   unsigned int flags;
+   unsigned int priv_flags;
+
+   /* work queue */
+   struct vnic_wq wq[ENIC_WQ_MAX];
+   unsigned int wq_count;
+
+   /* 

[dpdk-dev] [PATCH v5 3/6] enicpmd: VNIC common code partially shared with ENIC kernel mode driver

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 lib/librte_pmd_enic/vnic/cq_desc.h   |  126 
 lib/librte_pmd_enic/vnic/cq_enet_desc.h  |  261 
 lib/librte_pmd_enic/vnic/rq_enet_desc.h  |   76 +++
 lib/librte_pmd_enic/vnic/vnic_cq.c   |  117 
 lib/librte_pmd_enic/vnic/vnic_cq.h   |  152 +
 lib/librte_pmd_enic/vnic/vnic_dev.c  | 1063 ++
 lib/librte_pmd_enic/vnic/vnic_dev.h  |  202 ++
 lib/librte_pmd_enic/vnic/vnic_devcmd.h   |  774 ++
 lib/librte_pmd_enic/vnic/vnic_enet.h |   78 +++
 lib/librte_pmd_enic/vnic/vnic_intr.c |   83 +++
 lib/librte_pmd_enic/vnic/vnic_intr.h |  126 
 lib/librte_pmd_enic/vnic/vnic_nic.h  |   88 +++
 lib/librte_pmd_enic/vnic/vnic_resource.h |   97 +++
 lib/librte_pmd_enic/vnic/vnic_rq.c   |  246 +++
 lib/librte_pmd_enic/vnic/vnic_rq.h   |  282 
 lib/librte_pmd_enic/vnic/vnic_rss.c  |   85 +++
 lib/librte_pmd_enic/vnic/vnic_rss.h  |   61 ++
 lib/librte_pmd_enic/vnic/vnic_stats.h|   86 +++
 lib/librte_pmd_enic/vnic/vnic_wq.c   |  245 +++
 lib/librte_pmd_enic/vnic/vnic_wq.h   |  283 
 lib/librte_pmd_enic/vnic/wq_enet_desc.h  |  114 
 21 files changed, 4645 insertions(+)
 create mode 100644 lib/librte_pmd_enic/vnic/cq_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/cq_enet_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/rq_enet_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_cq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_cq.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_dev.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_dev.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_devcmd.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_enet.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_intr.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_intr.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_nic.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_resource.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rq.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rss.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rss.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_stats.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_wq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_wq.h
 create mode 100644 lib/librte_pmd_enic/vnic/wq_enet_desc.h

diff --git a/lib/librte_pmd_enic/vnic/cq_desc.h b/lib/librte_pmd_enic/vnic/cq_desc.h
new file mode 100644
index 000..7dfb2b6
--- /dev/null
+++ b/lib/librte_pmd_enic/vnic/cq_desc.h
@@ -0,0 +1,126 @@
+/*
+ * Copyright 2008-2010 Cisco Systems, Inc.  All rights reserved.
+ * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
+ *
+ * Copyright (c) 2014, Cisco Systems, Inc. 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#ident "$Id: cq_desc.h 129574 2013-04-26 22:11:14Z rfaucett $"
+
+#ifndef _CQ_DESC_H_
+#define _CQ_DESC_H_
+
+/*
+ * Completion queue descriptor types
+ */
+enum cq_desc_types {
+   CQ_DESC_TYPE_WQ_ENET = 0,
+   CQ_DESC_TYPE_DESC_COPY = 1,
+   CQ_DESC_TYPE_WQ_EXCH = 2,
+   CQ_DESC_TYPE_RQ_ENET = 3,
+   CQ_DESC_TYPE_RQ_FCP = 4,
+   CQ_DESC_TYPE_IOMMU_MISS = 5,
+   CQ_DESC_TYPE_SGL = 6,
+   CQ_DESC_TYPE_CLASSIFIER = 7,
+   CQ_DESC_TYPE_TEST = 127,
+};
+
+/* Completion queue descriptor: 16B
+ *
+ * All completion queues have this basic layout.  The
+ * type_specfic area is unique for each completion
+ * queue type.
+ */
+struct cq_desc {
+   __le16 completed_index;
+   __le16 q_number;
+   u8 type_specfic[11];
+   u8 type_color;
+};
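
The 16B size stated in the comment above can be checked at compile time. This
is only a sketch: it uses a stand-in struct with fixed-width types in place of
__le16/u8, the sk_cq_desc name is hypothetical, and it assumes a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct cq_desc, using fixed-width types instead of __le16/u8. */
struct sk_cq_desc {
	uint16_t completed_index;   /* bytes 0-1 */
	uint16_t q_number;          /* bytes 2-3 */
	uint8_t  type_specific[11]; /* bytes 4-14 */
	uint8_t  type_color;        /* byte 15 */
};

/* 2 + 2 + 11 + 1 = 16 bytes, with no padding between the members. */
_Static_assert(sizeof(struct sk_cq_desc) == 16,
	       "completion descriptor must be 16 bytes");
_Static_assert(offsetof(struct sk_cq_desc, type_color) == 15,
	       "type_color must be the last byte");
```

A check like this fails the build if a later change to the descriptor alters
the layout that the hardware expects.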

[dpdk-dev] [PATCH v5 2/6] enicpmd: Makefile

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 lib/librte_pmd_enic/Makefile | 67 
 1 file changed, 67 insertions(+)
 create mode 100644 lib/librte_pmd_enic/Makefile

diff --git a/lib/librte_pmd_enic/Makefile b/lib/librte_pmd_enic/Makefile
new file mode 100644
index 000..25d8f31
--- /dev/null
+++ b/lib/librte_pmd_enic/Makefile
@@ -0,0 +1,67 @@
+#
+# Copyright 2008-2014 Cisco Systems, Inc.  All rights reserved.
+# Copyright 2007 Nuova Systems, Inc.  All rights reserved.
+#
+# Copyright (c) 2014, Cisco Systems, Inc. 
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# 1. Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# 2. Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+# POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_enic.a
+
+CFLAGS += -I$(RTE_SDK)/lib/librte_hash/ -I$(RTE_SDK)/lib/librte_pmd_enic/vnic/
+CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_enic/
+CFLAGS += -O3 -Wno-deprecated
+
+VPATH += $(RTE_SDK)/lib/librte_pmd_enic/src
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_main.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_clsf.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_cq.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_wq.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_dev.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_intr.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_rq.c 
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_etherdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_res.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += vnic/vnic_rss.c
+
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += lib/librte_mempool lib/librte_mbuf
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += lib/librte_net lib/librte_malloc
+
+include $(RTE_SDK)/mk/rte.lib.mk
+
-- 
1.9.1



[dpdk-dev] [PATCH v5 1/6] enicpmd: License text

2014-11-26 Thread Sujith Sankar
Signed-off-by: Sujith Sankar 
---
 lib/librte_pmd_enic/LICENSE | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 lib/librte_pmd_enic/LICENSE

diff --git a/lib/librte_pmd_enic/LICENSE b/lib/librte_pmd_enic/LICENSE
new file mode 100644
index 000..0ad2216
--- /dev/null
+++ b/lib/librte_pmd_enic/LICENSE
@@ -0,0 +1,27 @@
+ * Copyright (c) 2014, Cisco Systems, Inc. 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
-- 
1.9.1



[dpdk-dev] [PATCH v5 0/6] enicpmd: Cisco Systems Inc. VIC Ethernet PMD

2014-11-26 Thread Sujith Sankar
ENIC PMD is the poll-mode driver for the Cisco Systems Inc. VIC to be
used with DPDK suite.

Sujith Sankar (6):
  enicpmd: License text
  enicpmd: Makefile
  enicpmd: VNIC common code partially shared with ENIC kernel mode
driver
  enicpmd: pmd specific code
  enicpmd: DPDK-ENIC PMD interface
  enicpmd: DPDK changes for accommodating ENIC PMD

 config/common_linuxapp   |5 +
 lib/Makefile |1 +
 lib/librte_pmd_enic/LICENSE  |   27 +
 lib/librte_pmd_enic/Makefile |   67 ++
 lib/librte_pmd_enic/enic.h   |  157 
 lib/librte_pmd_enic/enic_clsf.c  |  244 ++
 lib/librte_pmd_enic/enic_compat.h|  142 
 lib/librte_pmd_enic/enic_etherdev.c  |  613 +++
 lib/librte_pmd_enic/enic_main.c  | 1266 ++
 lib/librte_pmd_enic/enic_res.c   |  221 ++
 lib/librte_pmd_enic/enic_res.h   |  168 
 lib/librte_pmd_enic/vnic/cq_desc.h   |  126 +++
 lib/librte_pmd_enic/vnic/cq_enet_desc.h  |  261 ++
 lib/librte_pmd_enic/vnic/rq_enet_desc.h  |   76 ++
 lib/librte_pmd_enic/vnic/vnic_cq.c   |  117 +++
 lib/librte_pmd_enic/vnic/vnic_cq.h   |  152 
 lib/librte_pmd_enic/vnic/vnic_dev.c  | 1063 +
 lib/librte_pmd_enic/vnic/vnic_dev.h  |  202 +
 lib/librte_pmd_enic/vnic/vnic_devcmd.h   |  774 ++
 lib/librte_pmd_enic/vnic/vnic_enet.h |   78 ++
 lib/librte_pmd_enic/vnic/vnic_intr.c |   83 ++
 lib/librte_pmd_enic/vnic/vnic_intr.h |  126 +++
 lib/librte_pmd_enic/vnic/vnic_nic.h  |   88 +++
 lib/librte_pmd_enic/vnic/vnic_resource.h |   97 +++
 lib/librte_pmd_enic/vnic/vnic_rq.c   |  246 ++
 lib/librte_pmd_enic/vnic/vnic_rq.h   |  282 +++
 lib/librte_pmd_enic/vnic/vnic_rss.c  |   85 ++
 lib/librte_pmd_enic/vnic/vnic_rss.h  |   61 ++
 lib/librte_pmd_enic/vnic/vnic_stats.h|   86 ++
 lib/librte_pmd_enic/vnic/vnic_wq.c   |  245 ++
 lib/librte_pmd_enic/vnic/vnic_wq.h   |  283 +++
 lib/librte_pmd_enic/vnic/wq_enet_desc.h  |  114 +++
 mk/rte.app.mk|4 +
 33 files changed, 7560 insertions(+)
 create mode 100644 lib/librte_pmd_enic/LICENSE
 create mode 100644 lib/librte_pmd_enic/Makefile
 create mode 100644 lib/librte_pmd_enic/enic.h
 create mode 100644 lib/librte_pmd_enic/enic_clsf.c
 create mode 100644 lib/librte_pmd_enic/enic_compat.h
 create mode 100644 lib/librte_pmd_enic/enic_etherdev.c
 create mode 100644 lib/librte_pmd_enic/enic_main.c
 create mode 100644 lib/librte_pmd_enic/enic_res.c
 create mode 100644 lib/librte_pmd_enic/enic_res.h
 create mode 100644 lib/librte_pmd_enic/vnic/cq_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/cq_enet_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/rq_enet_desc.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_cq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_cq.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_dev.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_dev.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_devcmd.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_enet.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_intr.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_intr.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_nic.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_resource.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rq.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rss.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_rss.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_stats.h
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_wq.c
 create mode 100644 lib/librte_pmd_enic/vnic/vnic_wq.h
 create mode 100644 lib/librte_pmd_enic/vnic/wq_enet_desc.h

-- 
1.9.1



[dpdk-dev] Re: Re: [PATCH] eal: map uio resources after hugepages when the base_virtaddr is configured.

2014-11-26 Thread Qiu, Michael
On 11/6/2014 12:12 AM, XU Liang wrote:
> I have a multi-process application. When starting a secondary process, we 
> got the error message "EAL: pci_map_resource(): cannot mmap(11, 0x77fba000, 
> 0x2, 0x0): Bad file descriptor (0x77fb9000)". The secondary process 
> links different shared libraries, so the address 0x77fba000 is already in use.

This is a known issue and has not been solved yet. The root cause is
clear: the new process tries to map an address that is already in use.

BTW, you should learn how to prepare a patch: commit log,
Signed-off-by, etc.

Thanks,
Michael

> --
> From: Burakov, Anatoly
> Sent: 2014-11-05 23:59
> To: dev at dpdk.org
> Subject: RE: Re: [dpdk-dev] [PATCH] eal: map uio resources after hugepages
> when the base_virtaddr is configured.
>
> Hi Liang
>  
> Yes it is a problem. Even if it was carefully selected by user, nothing stops
> the DPDK application from mapping something into where you're trying to map
> your UIO devices. Plus, this changes the default behavior where a wrong
> base-virtaddr leads to a failure to initialize, rather than simply using a
> different address (remember that pci_map_resource fails if it cannot map the
> resource at the exact address you requested).
>
> A very crude way of finding out where hugepages end would be to walk the
> hugepage memory (walk through memsegs and note the maximum start addr +
> length of that memseg).
>
> Could you perhaps explain what is the problem that you're trying to solve
> with this? I can't think of a situation where the location of UIO maps would
> matter, to be honest.
>  
> Thanks,
> Anatoly
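
The "walk through memsegs" suggestion above could look roughly like the
following sketch. It is self-contained on purpose: struct seg is a
hypothetical stand-in for a memory segment descriptor (a start address plus
a length), not the real DPDK structure, and highest_seg_end() is an
illustrative name. An entry with a NULL address is assumed to mean "unused".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory segment descriptor. */
struct seg {
	void *addr;	/* start virtual address, NULL if the slot is unused */
	size_t len;	/* length of the segment in bytes */
};

/*
 * Walk all segments and return the highest (start address + length),
 * i.e. the first address past the last mapped segment.
 * Returns 0 when no segment is in use.
 */
static uintptr_t highest_seg_end(const struct seg *segs, size_t n)
{
	uintptr_t max_end = 0;

	for (size_t i = 0; i < n; i++) {
		if (segs[i].addr == NULL)
			continue;	/* skip unused slots */

		uintptr_t end = (uintptr_t)segs[i].addr + segs[i].len;

		if (end > max_end)
			max_end = end;
	}
	return max_end;
}
```

Note that the result is only the end of the highest segment; as pointed out
elsewhere in this thread, the segments need not be contiguous, so there may
be holes below that address.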
>  
> From: XU Liang [mailto:liang.xu at cinfotech.cn]
> Sent: Wednesday, November 5, 2014 3:49 PM
> To: Burakov, Anatoly; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] eal: map uio resources after hugepages when
> the base_virtaddr is configured.
>
> I think the base_virtaddr will be carefully selected by the user when they need
> it. So maybe it's not a real problem.  :>
>
> The real reason is I can't find an easy way to get the end address of
> hugepages. Can you give me some suggestions?
>
> --
> From: Burakov, Anatoly
> Sent: 2014-11-05 (Wed) 23:10
> To: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] eal: map uio resources after hugepages when the
> base_virtaddr is configured.
>
> I have a slight problem with this patch.
>
> The base_virtaddr doesn't necessarily correspond to an address that
> everything gets mapped to. It's a "hint" of sorts, that may or may not be
> taken into account by mmap. Therefore we can't simply assume that if we
> requested a base-virtaddr, everything will get mapped at exactly that
> address. We also can't assume that hugepages will be ordered one after the
> other and neatly occupy all the contiguous virtual memory between
> base_virtaddr and base_virtaddr + internal_config.memory - there may be
> holes, for whatever reasons.
>
>
>
> Also, 
>
>
>
> Thanks,
>
> Anatoly
>
>
>
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of lxu
> Sent: Wednesday, November 5, 2014 1:25 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] eal: map uio resources after hugepages when the
> base_virtaddr is configured.
>
> ---
> lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> index 7e62266..bc7ed3a 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> @@ -289,6 +289,11 @@ pci_uio_map_resource(struct rte_pci_device *dev)
>
> str

[dpdk-dev] [PATCH v2 06/13] mbuf: add functions to get the name of an ol_flag

2014-11-26 Thread Zhang, Helin


> -Original Message-
> From: Ananyev, Konstantin
> Sent: Tuesday, November 25, 2014 9:49 PM
> To: Zhang, Helin; 'Olivier MATZ'; 'dev at dpdk.org'
> Cc: 'jigsaw at gmail.com'
> Subject: RE: [dpdk-dev] [PATCH v2 06/13] mbuf: add functions to get the name 
> of
> an ol_flag
> 
> 
> 
> > -Original Message-
> > From: Zhang, Helin
> > Sent: Tuesday, November 25, 2014 12:15 PM
> > To: Ananyev, Konstantin; 'Olivier MATZ'; 'dev at dpdk.org'
> > Cc: 'jigsaw at gmail.com'
> > Subject: RE: [dpdk-dev] [PATCH v2 06/13] mbuf: add functions to get
> > the name of an ol_flag
> >
> > Hi Konstantin
> >
> > > -Original Message-
> > > From: Ananyev, Konstantin
> > > Sent: Tuesday, November 25, 2014 6:38 PM
> > > To: 'Olivier MATZ'; 'dev at dpdk.org'
> > > Cc: 'jigsaw at gmail.com'; Zhang, Helin
> > > Subject: RE: [dpdk-dev] [PATCH v2 06/13] mbuf: add functions to get
> > > the name of an ol_flag
> > >
> > > Hi Helin,
> > >
> > > > -Original Message-
> > > > From: Ananyev, Konstantin
> > > > Sent: Wednesday, November 19, 2014 11:07 AM
> > > > To: Olivier MATZ; dev at dpdk.org
> > > > Cc: jigsaw at gmail.com; Zhang, Helin
> > > > Subject: RE: [dpdk-dev] [PATCH v2 06/13] mbuf: add functions to
> > > > get the name of an ol_flag
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > > > > Sent: Tuesday, November 18, 2014 9:30 AM
> > > > > To: Ananyev, Konstantin; dev at dpdk.org
> > > > > Cc: jigsaw at gmail.com; Zhang, Helin
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 06/13] mbuf: add functions to
> > > > > get the name of an ol_flag
> > > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > On 11/17/2014 08:00 PM, Ananyev, Konstantin wrote:
> > > > > >> +/*
> > > > > >> + * Get the name of a RX offload flag  */ const char
> > > > > >> +*rte_get_rx_ol_flag_name(uint64_t mask) {
> > > > > >> +  switch (mask) {
> > > > > >> +  case PKT_RX_VLAN_PKT: return "PKT_RX_VLAN_PKT";
> > > > > >> +  case PKT_RX_RSS_HASH: return "PKT_RX_RSS_HASH";
> > > > > >> +  case PKT_RX_FDIR: return "PKT_RX_FDIR";
> > > > > >> +  case PKT_RX_L4_CKSUM_BAD: return
> "PKT_RX_L4_CKSUM_BAD";
> > > > > >> +  case PKT_RX_IP_CKSUM_BAD: return
> "PKT_RX_IP_CKSUM_BAD";
> > > > > >> +  /* case PKT_RX_EIP_CKSUM_BAD: return
> > > > > >> +"PKT_RX_EIP_CKSUM_BAD";
> > > */
> > > > > >> +  /* case PKT_RX_OVERSIZE: return "PKT_RX_OVERSIZE"; */
> > > > > >> +  /* case PKT_RX_HBUF_OVERFLOW: return
> > > "PKT_RX_HBUF_OVERFLOW"; */
> > > > > >> +  /* case PKT_RX_RECIP_ERR: return "PKT_RX_RECIP_ERR"; */
> > > > > >> +  /* case PKT_RX_MAC_ERR: return "PKT_RX_MAC_ERR"; */
> > > > > >
> > > > > > > Didn't spot it before, wonder why do you need these 5
> > > > > > > commented out lines?
> > > > > > > In fact, why do we need these flags if they all equal to zero
> > > > > > > right now?
> > > > > > > I know these flags were not introduced by that patch, in fact
> > > > > > > as I can see it was a temporary measure, as old ol_flags were
> > > > > > > just 16 bits long:
> > > > > > > http://dpdk.org/ml/archives/dev/2014-June/003308.html
> > > > > > > So wonder should now these flags either get proper values or be
> > > > > > > removed?
> > > > >
> > > > > I would be in favor of removing them, or at least the following
> > > > > ones (I don't understand how they can help the application):
> > > > >
> > > > > - PKT_RX_OVERSIZE: Num of desc of an RX pkt oversize.
> > > > > - PKT_RX_HBUF_OVERFLOW: Header buffer overflow.
> > > > > - PKT_RX_RECIP_ERR: Hardware processing error.
> > > > > - PKT_RX_MAC_ERR: MAC error.
> > > >
> > > > Tend to agree...
> > > > Or probably collapse these 4 flags into one flag: PKT_RX_ERR or
> > > > something.
> > > > Might be still used by someone for debugging purposes.
> > > > Helin, what do you think?
> > >
> > > As there is no answer, I suppose you don't care about these flags any more.
> > > So we can just remove them, right?
> > Sorry, I think I care about it a bit. I have a lot of emails to deal with,
> > due to the whole week of training.
> > Yes, they were added there before the new mbuf was defined. Why zero? Because
> > of the lack of bits for them.
> > Unfortunately, I forgot to add them with correct values after the new mbuf
> > was introduced.
> > Thank you so much for spotting it!
> >
> > The error flags were added according to the errors defined by the FVL
> > datasheet. It could be helpful for middle layer software or
> > applications with the specific errors identified. I'd prefer to add the
> > correct values for those flags. What do you think?
> 
> 
> I am ok to have one flag for that, something like PKT_RX_HW_ERR (or something).
> Don't really understand why you need all 4 of them - the packet contains
> invalid data anyway, so there is not much use of it.
Yes, I agree with you that one bit might be enough. It seems that we had more
than one bit for errors previously.

Regards,
Helin
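
The "one bit might be enough" idea discussed above could be sketched as
follows. Everything here is hypothetical: the HW_ERR_* bits stand in for
per-descriptor hardware error indications, EX_PKT_RX_HW_ERR stands in for a
single mbuf-level flag along the lines of the PKT_RX_HW_ERR suggestion, and
none of the bit values are taken from DPDK or from any datasheet.

```c
#include <stdint.h>

/* Hypothetical hardware error bits; values are illustrative only. */
#define HW_ERR_OVERSIZE		(1u << 0)	/* too many descriptors for one packet */
#define HW_ERR_HBUF_OVERFLOW	(1u << 1)	/* header buffer overflow */
#define HW_ERR_RECIP		(1u << 2)	/* hardware processing error */
#define HW_ERR_MAC		(1u << 3)	/* MAC error */
#define HW_ERR_ANY		(HW_ERR_OVERSIZE | HW_ERR_HBUF_OVERFLOW | \
				 HW_ERR_RECIP | HW_ERR_MAC)

/* Hypothetical single offload flag reporting "some hardware error". */
#define EX_PKT_RX_HW_ERR	(1ULL << 10)

/*
 * Collapse the four specific hardware errors into one offload flag.
 * Existing bits in ol_flags are preserved; the error flag is set only
 * if at least one of the hardware error bits is present.
 */
static uint64_t collapse_rx_errors(uint32_t hw_err, uint64_t ol_flags)
{
	if (hw_err & HW_ERR_ANY)
		ol_flags |= EX_PKT_RX_HW_ERR;
	return ol_flags;
}
```

With this layout the application only tests one bit to decide whether to
drop the packet, while the driver can still log the specific hw_err value
when debugging is enabled.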

> For debugging purposes you can just add a debug log for all of them.
> Something like:
> 
> if (un

[dpdk-dev] [RFC PATCH 5/6] ixgbe: rx/tx queue stop bug fix

2014-11-26 Thread Ouyang, Changchun
Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cunming Liang
> Sent: Tuesday, November 25, 2014 10:11 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 5/6] ixgbe: rx/tx queue stop bug fix
> 
> Signed-off-by: Cunming Liang 

Acked-by: Changchun Ouyang


Thanks
Changchun