from:"Alejandro Lucero"

[dpdk-dev] [PATCH v3] nfp: report link speed using hardware info

2016-12-02 Thread Alejandro Lucero

The previously reported speed was hardcoded.

v3: remove unused macro
v2: using RTE_DIM instead of own macro

Signed-off-by: Alejandro Lucero <alejandro.luc...@netronome.com>
---
 drivers/net/nfp/nfp_net.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index c6b1587..24f3164 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -816,6 +816,17 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
struct rte_eth_link link, old;
uint32_t nn_link_status;
 
+   static const uint32_t ls_to_ethtool[] = {
+   [NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_UNKNOWN] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_1G]  = ETH_SPEED_NUM_1G,
+   [NFP_NET_CFG_STS_LINK_RATE_10G] = ETH_SPEED_NUM_10G,
+   [NFP_NET_CFG_STS_LINK_RATE_25G] = ETH_SPEED_NUM_25G,
+   [NFP_NET_CFG_STS_LINK_RATE_40G] = ETH_SPEED_NUM_40G,
+   [NFP_NET_CFG_STS_LINK_RATE_50G] = ETH_SPEED_NUM_50G,
+   [NFP_NET_CFG_STS_LINK_RATE_100G]= ETH_SPEED_NUM_100G,
+   };
+
PMD_DRV_LOG(DEBUG, "Link update\n");
 
hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -831,8 +842,21 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
link.link_status = ETH_LINK_UP;
 
link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   /* Other cards can limit the tx and rx rate per VF */
-   link.link_speed = ETH_SPEED_NUM_40G;
+
+   nn_link_status = (nn_link_status >> NFP_NET_CFG_STS_LINK_RATE_SHIFT) &
+NFP_NET_CFG_STS_LINK_RATE_MASK;
+
+   if ((NFD_CFG_MAJOR_VERSION_of(hw->ver) < 4) ||
+   ((NFD_CFG_MAJOR_VERSION_of(hw->ver) == 4) &&
+   (NFD_CFG_MINOR_VERSION_of(hw->ver) == 0)))
+   link.link_speed = ETH_SPEED_NUM_40G;
+   else {
+   if (nn_link_status == NFP_NET_CFG_STS_LINK_RATE_UNKNOWN ||
+   nn_link_status >= RTE_DIM(ls_to_ethtool))
+   link.link_speed = ETH_SPEED_NUM_NONE;
+   else
+   link.link_speed = ls_to_ethtool[nn_link_status];
+   }
 
if (old.link_status != link.link_status) {
nfp_net_dev_atomic_write_link_status(dev, &link);
-- 
1.9.1
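
The firmware-version gate in the hunk above can be sketched in isolation.
This is only an illustration under stated assumptions: the bit layout of the
version word and the FW_VER_* helper names are hypothetical (they are not the
driver's NFD_CFG_* macros), and the sketch assumes the intent is "major below
4, or exactly version 4.0" for firmware that cannot report a link rate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: major in bits 23..16, minor in bits 7..0. */
#define FW_VER_MAJOR_OF(ver) (((ver) >> 16) & 0xffu)
#define FW_VER_MINOR_OF(ver) ((ver) & 0xffu)

/*
 * Returns true when the firmware is assumed too old to report a link
 * rate: any major version below 4, or exactly version 4.0.
 */
static bool fw_lacks_link_rate(uint32_t ver)
{
	uint32_t major = FW_VER_MAJOR_OF(ver);
	uint32_t minor = FW_VER_MINOR_OF(ver);

	return major < 4 || (major == 4 && minor == 0);
}
```

A caller would fall back to a fixed speed when fw_lacks_link_rate() returns
true and decode the status register otherwise.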

[dpdk-dev] [PATCH v2] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-12-01 Thread Alejandro Lucero

On Mon, Nov 28, 2016 at 11:13 AM, Thomas Monjalon  wrote:

> 2016-11-24 17:59, Olivier Matz:
> > Hi,
> >
> > On Mon, 2016-11-21 at 09:59 +0000, Alejandro Lucero wrote:
> > > From: Bert van Leeuwen 
> > >
> > > Arrays inside rte_eth_stats have size=RTE_ETHDEV_QUEUE_STAT_CNTRS.
> > > Some devices report more queues than that and this code blindly uses
> > > the reported number of queues by the device to fill those arrays up.
> > > This patch fixes the problem using MIN between the reported number of
> > > queues and RTE_ETHDEV_QUEUE_STAT_CNTRS.
> > >
> > > Signed-off-by: Alejandro Lucero 
> > >
> >
> > Reviewed-by: Olivier Matz 
> >
> >
> > As a next step, I'm wondering if it would be possible to remove
> > this limitation. We could replace the tables in struct rte_eth_stats
> > by a pointer to an array allocated dynamically at pmd setup.
>
> Yes that's definitely the right way to handle these statistics.
>
>
Agree.


> > It would break the API, so it should be announced first. I'm thinking
> > of something like:
> >
> > struct rte_eth_generic_stats {
> > uint64_t ipackets;
> > uint64_t opackets;
> > uint64_t ibytes;
> > uint64_t obytes;
> > uint64_t imissed;
> > uint64_t ierrors;
> > uint64_t oerrors;
> > uint64_t rx_nombuf;
> > };
> >
> > struct rte_eth_stats {
> >   struct rte_eth_generic_stats port_stats;
> >   struct rte_eth_generic_stats *queue_stats;
> > };
> >
> > The queue_stats array would always be indexed by queue_id.
> > The xstats would continue to report the generic stats per-port and
> > per-queue.
> >
> > About the mapping API, either we keep it as-is, or it could
> > become a driver-specific API.
>
> Yes I agree to remove the queue statistics mapping which is very specific.
> I will send a patch with a deprecation notice to move the mapping API
> to a driver-specific API.
>
> Any objection?
>

None from my side.
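
The dynamically allocated layout proposed above could look roughly like the
following. This is a minimal sketch, not an agreed API: the struct and
function names (generic_stats, stats_dyn, stats_dyn_init, stats_dyn_free) and
the nb_queues field are hypothetical, and plain calloc() stands in for
whatever allocator a PMD would use at setup time.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the counters listed in the proposal above. */
struct generic_stats {
	uint64_t ipackets;
	uint64_t opackets;
	uint64_t ibytes;
	uint64_t obytes;
	uint64_t imissed;
	uint64_t ierrors;
	uint64_t oerrors;
	uint64_t rx_nombuf;
};

struct stats_dyn {
	struct generic_stats port_stats;
	struct generic_stats *queue_stats; /* indexed by queue_id */
	uint16_t nb_queues;                /* hypothetical: array length */
};

/* Allocate one zeroed entry per queue; returns 0 on success, -1 on failure. */
static int stats_dyn_init(struct stats_dyn *s, uint16_t nb_queues)
{
	memset(s, 0, sizeof(*s));
	if (nb_queues == 0)
		return 0;
	s->queue_stats = calloc(nb_queues, sizeof(*s->queue_stats));
	if (s->queue_stats == NULL)
		return -1;
	s->nb_queues = nb_queues;
	return 0;
}

/* Release the per-queue array and leave the struct in an empty state. */
static void stats_dyn_free(struct stats_dyn *s)
{
	free(s->queue_stats);
	s->queue_stats = NULL;
	s->nb_queues = 0;
}
```

Keeping the array length next to the pointer lets readers of the stats bound
their loops by the device's real queue count instead of a build-time constant.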

[dpdk-dev] [PATCH v2] nfp: report link speed using hardware info

2016-11-21 Thread Alejandro Lucero

On Mon, Nov 21, 2016 at 11:18 AM, Ferruh Yigit 
wrote:

> On 11/18/2016 4:06 PM, Alejandro Lucero wrote:
> > Previous reported speed was hardcoded.
> >
> > Signed-off-by: Alejandro Lucero 
> > ---
> >  drivers/net/nfp/nfp_net.c  | 28 ++--
> >  drivers/net/nfp/nfp_net_ctrl.h | 13 +
> >  2 files changed, 39 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> > index c6b1587..24f3164 100644
> > --- a/drivers/net/nfp/nfp_net.c
> > +++ b/drivers/net/nfp/nfp_net.c
> > @@ -816,6 +816,17 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
> >   struct rte_eth_link link, old;
> >   uint32_t nn_link_status;
> >
> > + static const uint32_t ls_to_ethtool[] = {
> > + [NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED] =
> ETH_SPEED_NUM_NONE,
> > + [NFP_NET_CFG_STS_LINK_RATE_UNKNOWN] =
> ETH_SPEED_NUM_NONE,
> > + [NFP_NET_CFG_STS_LINK_RATE_1G]  = ETH_SPEED_NUM_1G,
> > + [NFP_NET_CFG_STS_LINK_RATE_10G] =
> ETH_SPEED_NUM_10G,
> > + [NFP_NET_CFG_STS_LINK_RATE_25G] =
> ETH_SPEED_NUM_25G,
> > + [NFP_NET_CFG_STS_LINK_RATE_40G] =
> ETH_SPEED_NUM_40G,
> > + [NFP_NET_CFG_STS_LINK_RATE_50G] =
> ETH_SPEED_NUM_50G,
> > + [NFP_NET_CFG_STS_LINK_RATE_100G]=
> ETH_SPEED_NUM_100G,
> > + };
> > +
> >   PMD_DRV_LOG(DEBUG, "Link update\n");
> >
> >   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > @@ -831,8 +842,21 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
> >   link.link_status = ETH_LINK_UP;
> >
> >   link.link_duplex = ETH_LINK_FULL_DUPLEX;
> > - /* Other cards can limit the tx and rx rate per VF */
> > - link.link_speed = ETH_SPEED_NUM_40G;
> > +
> > + nn_link_status = (nn_link_status >> NFP_NET_CFG_STS_LINK_RATE_SHIFT)
> &
> > +  NFP_NET_CFG_STS_LINK_RATE_MASK;
> > +
> > + if ((NFD_CFG_MAJOR_VERSION_of(hw->ver) < 4) ||
> > + ((NFD_CFG_MAJOR_VERSION_of(hw->ver) == 4) &&
> > + (NFD_CFG_MINOR_VERSION_of(hw->ver) == 0)))
> > + link.link_speed = ETH_SPEED_NUM_40G;
>
> For specific firmware versions, the speed is still hardcoded to 40G. Can you
> please mention this and, if possible, the reason for it in the commit log?
>
> > + else {
> > + if (nn_link_status == NFP_NET_CFG_STS_LINK_RATE_UNKNOWN ||
>
> This check can be redundant, since
> ls_to_ethtool[NFP_NET_CFG_STS_LINK_RATE_UNKNOWN] => ETH_SPEED_NUM_NONE
>
>
This guards against any invalid value coming from firmware/hardware.


> > + nn_link_status >= RTE_DIM(ls_to_ethtool))
> > + link.link_speed = ETH_SPEED_NUM_NONE;
> > + else
> > + link.link_speed = ls_to_ethtool[nn_link_status];
> > + }
> >
> >   if (old.link_status != link.link_status) {
> >   nfp_net_dev_atomic_write_link_status(dev, &link);
> > diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_
> ctrl.h
> > index fce8251..f9aaba3 100644
> > --- a/drivers/net/nfp/nfp_net_ctrl.h
> > +++ b/drivers/net/nfp/nfp_net_ctrl.h
> > @@ -157,6 +157,19 @@
> >  #define   NFP_NET_CFG_VERSION_MINOR(x)(((x) & 0xff) <<  0)
> >  #define NFP_NET_CFG_STS 0x0034
> >  #define   NFP_NET_CFG_STS_LINK(0x1 << 0) /* Link up or down
> */
> > +/* Link rate */
> > +#define   NFP_NET_CFG_STS_LINK_RATE_SHIFT 1
> > +#define   NFP_NET_CFG_STS_LINK_RATE_MASK  0xF
> > +#define   NFP_NET_CFG_STS_LINK_RATE   \
> > +   (NFP_NET_CFG_STS_LINK_RATE_MASK << NFP_NET_CFG_STS_LINK_RATE_
> SHIFT)
>
> This macro is not used at all, just fyi.
>
>
Thanks. I think I can remove it.


> > +#define   NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED   0
> > +#define   NFP_NET_CFG_STS_LINK_RATE_UNKNOWN   1
> > +#define   NFP_NET_CFG_STS_LINK_RATE_1G    2
> > +#define   NFP_NET_CFG_STS_LINK_RATE_10G   3
> > +#define   NFP_NET_CFG_STS_LINK_RATE_25G   4
> > +#define   NFP_NET_CFG_STS_LINK_RATE_40G   5
> > +#define   NFP_NET_CFG_STS_LINK_RATE_50G   6
> > +#define   NFP_NET_CFG_STS_LINK_RATE_100G  7
> >  #define NFP_NET_CFG_CAP 0x0038
> >  #define NFP_NET_CFG_MAX_TXRINGS 0x003c
> >  #define NFP_NET_CFG_MAX_RXRINGS 0x0040
> >
>
>

[dpdk-dev] [PATCH v2] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-21 Thread Alejandro Lucero

From: Bert van Leeuwen <bert.vanleeu...@netronome.com>

Arrays inside rte_eth_stats have size=RTE_ETHDEV_QUEUE_STAT_CNTRS.
Some devices report more queues than that and this code blindly uses
the reported number of queues by the device to fill those arrays up.
This patch fixes the problem using MIN between the reported number of
queues and RTE_ETHDEV_QUEUE_STAT_CNTRS.

Signed-off-by: Alejandro Lucero 
---
 lib/librte_ether/rte_ethdev.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fde8112..4209ad0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1343,8 +1343,10 @@ get_xstats_count(uint8_t port_id)
} else
count = 0;
count += RTE_NB_STATS;
-   count += dev->data->nb_rx_queues * RTE_NB_RXQ_STATS;
-   count += dev->data->nb_tx_queues * RTE_NB_TXQ_STATS;
+   count += RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+RTE_NB_RXQ_STATS;
+   count += RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+RTE_NB_TXQ_STATS;
return count;
 }

@@ -1358,6 +1360,7 @@ rte_eth_xstats_get_names(uint8_t port_id,
int cnt_expected_entries;
int cnt_driver_entries;
uint32_t idx, id_queue;
+   uint16_t num_q;

cnt_expected_entries = get_xstats_count(port_id);
if (xstats_names == NULL || cnt_expected_entries < 0 ||
@@ -1374,7 +1377,8 @@ rte_eth_xstats_get_names(uint8_t port_id,
"%s", rte_stats_strings[idx].name);
cnt_used_entries++;
}
-   for (id_queue = 0; id_queue < dev->data->nb_rx_queues; id_queue++) {
+   num_q = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   for (id_queue = 0; id_queue < num_q; id_queue++) {
for (idx = 0; idx < RTE_NB_RXQ_STATS; idx++) {
snprintf(xstats_names[cnt_used_entries].name,
sizeof(xstats_names[0].name),
@@ -1384,7 +1388,8 @@ rte_eth_xstats_get_names(uint8_t port_id,
}

}
-   for (id_queue = 0; id_queue < dev->data->nb_tx_queues; id_queue++) {
+   num_q = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   for (id_queue = 0; id_queue < num_q; id_queue++) {
for (idx = 0; idx < RTE_NB_TXQ_STATS; idx++) {
snprintf(xstats_names[cnt_used_entries].name,
sizeof(xstats_names[0].name),
@@ -1420,14 +1425,18 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
unsigned count = 0, i, q;
signed xcount = 0;
uint64_t val, *stats_ptr;
+   uint16_t nb_rxqs, nb_txqs;

RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];

+   nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   nb_txqs = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+
/* Return generic statistics */
-   count = RTE_NB_STATS + (dev->data->nb_rx_queues * RTE_NB_RXQ_STATS) +
-   (dev->data->nb_tx_queues * RTE_NB_TXQ_STATS);
+   count = RTE_NB_STATS + (nb_rxqs * RTE_NB_RXQ_STATS) +
+   (nb_txqs * RTE_NB_TXQ_STATS);

/* implemented by the driver */
if (dev->dev_ops->xstats_get != NULL) {
@@ -1458,7 +1467,7 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
}

/* per-rxq stats */
-   for (q = 0; q < dev->data->nb_rx_queues; q++) {
+   for (q = 0; q < nb_rxqs; q++) {
for (i = 0; i < RTE_NB_RXQ_STATS; i++) {
stats_ptr = RTE_PTR_ADD(&eth_stats,
rte_rxq_stats_strings[i].offset +
@@ -1469,7 +1478,7 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
}

/* per-txq stats */
-   for (q = 0; q < dev->data->nb_tx_queues; q++) {
+   for (q = 0; q < nb_txqs; q++) {
for (i = 0; i < RTE_NB_TXQ_STATS; i++) {
stats_ptr = RTE_PTR_ADD(&eth_stats,
rte_txq_stats_strings[i].offset +
-- 
1.9.1
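
The clamping idea in the patch above can be shown on its own. This is a
sketch under assumptions: QUEUE_STAT_CNTRS stands in for
RTE_ETHDEV_QUEUE_STAT_CNTRS (16 is the default mentioned elsewhere in this
thread), the NB_* counts are illustrative values rather than the library's,
and the function names are hypothetical.

```c
#include <stdint.h>

#define QUEUE_STAT_CNTRS 16 /* stand-in for RTE_ETHDEV_QUEUE_STAT_CNTRS */
#define NB_STATS         8  /* illustrative: generic per-port counters */
#define NB_RXQ_STATS     3  /* illustrative: counters per RX queue */
#define NB_TXQ_STATS     2  /* illustrative: counters per TX queue */

#define MIN_OF(a, b) ((a) < (b) ? (a) : (b))

/* Number of generic xstats entries, never counting queues past the cap. */
static unsigned int xstats_generic_count(uint16_t nb_rx_queues,
					 uint16_t nb_tx_queues)
{
	unsigned int nb_rxqs = MIN_OF(nb_rx_queues, QUEUE_STAT_CNTRS);
	unsigned int nb_txqs = MIN_OF(nb_tx_queues, QUEUE_STAT_CNTRS);

	return NB_STATS + nb_rxqs * NB_RXQ_STATS + nb_txqs * NB_TXQ_STATS;
}

/*
 * Copy per-queue counters into a fixed-size array. The device may report
 * more queues than the array holds, so the loop bound is clamped.
 * Returns the number of entries written.
 */
static unsigned int copy_rx_counters(uint64_t dst[QUEUE_STAT_CNTRS],
				     const uint64_t *src,
				     uint16_t nb_rx_queues)
{
	unsigned int n = MIN_OF(nb_rx_queues, QUEUE_STAT_CNTRS);
	unsigned int q;

	for (q = 0; q < n; q++)
		dst[q] = src[q];
	return n;
}
```

With 128 reported queues, the unclamped loop would write 128 entries into a
16-entry array; the clamped version stops at 16.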

[dpdk-dev] [PATCH v2] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

On Fri, Nov 18, 2016 at 4:29 PM, Thomas Monjalon 
wrote:

> 2016-11-18 16:06, Alejandro Lucero:
> > Previous reported speed was hardcoded.
> >
> > Signed-off-by: Alejandro Lucero 
> > ---
> >  drivers/net/nfp/nfp_net.c  | 28 ++--
> >  drivers/net/nfp/nfp_net_ctrl.h | 13 +
> >  2 files changed, 39 insertions(+), 2 deletions(-)
>
> You should update the doc in the same patch:
> doc/guides/nics/features/nfp.ini
> It will be the first feature as the file appears to be empty.
> So you will need another patch to fill other existing features.
>

Yes. I'm currently working on updating that file properly.
May I delay this doc change and include it with that other patch?
It would be a bit odd to have just one feature listed there.

>
> I have an unrelated question: why nfp is disabled in the default build?
>
>
Because the NFP PMD only works if the Netronome BSP is installed on the
system. We do not support the PF with the PMD, so it requires a Linux PF
driver, which comes with the BSP.

The compilation has no dependencies, but we previously had our own UIO driver
(we now use igb_uio). So basically, we wanted people to be aware of this
dependency and to enable this option explicitly.

I know what you are surely going to ask about DPDK in Linux distributions,
and that this is a bad idea there. The fact is, we have people using the NFP
PMD as part of a product, so installing that product implies (automatically)
installing the BSP and a specific DPDK version with the NFP PMD enabled. But
yes, maybe we should change this and add some sort of BSP check inside
the PMD.

So, thanks for the heads up. I will think about this.

[dpdk-dev] [PATCH v2] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

Previous reported speed was hardcoded.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c  | 28 ++--
 drivers/net/nfp/nfp_net_ctrl.h | 13 +
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index c6b1587..24f3164 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -816,6 +816,17 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
struct rte_eth_link link, old;
uint32_t nn_link_status;

+   static const uint32_t ls_to_ethtool[] = {
+   [NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_UNKNOWN] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_1G]  = ETH_SPEED_NUM_1G,
+   [NFP_NET_CFG_STS_LINK_RATE_10G] = ETH_SPEED_NUM_10G,
+   [NFP_NET_CFG_STS_LINK_RATE_25G] = ETH_SPEED_NUM_25G,
+   [NFP_NET_CFG_STS_LINK_RATE_40G] = ETH_SPEED_NUM_40G,
+   [NFP_NET_CFG_STS_LINK_RATE_50G] = ETH_SPEED_NUM_50G,
+   [NFP_NET_CFG_STS_LINK_RATE_100G]= ETH_SPEED_NUM_100G,
+   };
+
PMD_DRV_LOG(DEBUG, "Link update\n");

hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -831,8 +842,21 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
link.link_status = ETH_LINK_UP;

link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   /* Other cards can limit the tx and rx rate per VF */
-   link.link_speed = ETH_SPEED_NUM_40G;
+
+   nn_link_status = (nn_link_status >> NFP_NET_CFG_STS_LINK_RATE_SHIFT) &
+NFP_NET_CFG_STS_LINK_RATE_MASK;
+
+   if ((NFD_CFG_MAJOR_VERSION_of(hw->ver) < 4) ||
+   ((NFD_CFG_MAJOR_VERSION_of(hw->ver) == 4) &&
+   (NFD_CFG_MINOR_VERSION_of(hw->ver) == 0)))
+   link.link_speed = ETH_SPEED_NUM_40G;
+   else {
+   if (nn_link_status == NFP_NET_CFG_STS_LINK_RATE_UNKNOWN ||
+   nn_link_status >= RTE_DIM(ls_to_ethtool))
+   link.link_speed = ETH_SPEED_NUM_NONE;
+   else
+   link.link_speed = ls_to_ethtool[nn_link_status];
+   }

if (old.link_status != link.link_status) {
nfp_net_dev_atomic_write_link_status(dev, &link);
diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_ctrl.h
index fce8251..f9aaba3 100644
--- a/drivers/net/nfp/nfp_net_ctrl.h
+++ b/drivers/net/nfp/nfp_net_ctrl.h
@@ -157,6 +157,19 @@
 #define   NFP_NET_CFG_VERSION_MINOR(x)    (((x) & 0xff) <<  0)
 #define NFP_NET_CFG_STS 0x0034
 #define   NFP_NET_CFG_STS_LINK            (0x1 << 0) /* Link up or down */
+/* Link rate */
+#define   NFP_NET_CFG_STS_LINK_RATE_SHIFT 1
+#define   NFP_NET_CFG_STS_LINK_RATE_MASK  0xF
+#define   NFP_NET_CFG_STS_LINK_RATE   \
+ (NFP_NET_CFG_STS_LINK_RATE_MASK << NFP_NET_CFG_STS_LINK_RATE_SHIFT)
+#define   NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED   0
+#define   NFP_NET_CFG_STS_LINK_RATE_UNKNOWN   1
+#define   NFP_NET_CFG_STS_LINK_RATE_1G    2
+#define   NFP_NET_CFG_STS_LINK_RATE_10G   3
+#define   NFP_NET_CFG_STS_LINK_RATE_25G   4
+#define   NFP_NET_CFG_STS_LINK_RATE_40G   5
+#define   NFP_NET_CFG_STS_LINK_RATE_50G   6
+#define   NFP_NET_CFG_STS_LINK_RATE_100G  7
 #define NFP_NET_CFG_CAP 0x0038
 #define NFP_NET_CFG_MAX_TXRINGS 0x003c
 #define NFP_NET_CFG_MAX_RXRINGS 0x0040
-- 
1.9.1
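
The decoding path in the patch above (shift and mask the status word, then
look the rate code up in a bounds-checked table) can be sketched standalone.
The shift, mask and rate codes are taken from the nfp_net_ctrl.h hunk; the
shortened names, the DIM macro and the speeds expressed in Mbps are
assumptions made for illustration, with 0 standing for "no speed".

```c
#include <stddef.h>
#include <stdint.h>

#define LINK_RATE_SHIFT 1
#define LINK_RATE_MASK  0xF

enum {
	RATE_UNSUPPORTED = 0,
	RATE_UNKNOWN     = 1,
	RATE_1G          = 2,
	RATE_10G         = 3,
	RATE_25G         = 4,
	RATE_40G         = 5,
	RATE_50G         = 6,
	RATE_100G        = 7,
};

#define DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Rate code -> speed in Mbps (illustrative units); 0 means no speed. */
static const uint32_t rate_to_mbps[] = {
	[RATE_UNSUPPORTED] = 0,
	[RATE_UNKNOWN]     = 0,
	[RATE_1G]          = 1000,
	[RATE_10G]         = 10000,
	[RATE_25G]         = 25000,
	[RATE_40G]         = 40000,
	[RATE_50G]         = 50000,
	[RATE_100G]        = 100000,
};

/* Extract the 4-bit rate field and map it, rejecting out-of-range codes. */
static uint32_t decode_link_speed(uint32_t sts)
{
	uint32_t rate = (sts >> LINK_RATE_SHIFT) & LINK_RATE_MASK;

	if (rate >= DIM(rate_to_mbps))
		return 0;
	return rate_to_mbps[rate];
}
```

The mask allows codes 0 to 15 while the table only covers 0 to 7, so the
range check is what keeps an unexpected value from indexing past the table.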

[dpdk-dev] Fwd: |WARNING| [PATCH] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

On Fri, Nov 18, 2016 at 3:31 PM, Alejandro Lucero <
alejandro.lucero at netronome.com> wrote:

>
>
> On Fri, Nov 18, 2016 at 3:24 PM, Ferruh Yigit 
> wrote:
>
>> On 11/18/2016 3:10 PM, Alejandro Lucero wrote:
>> > Hi Thomas,
>> >
>> > I got this email when sending a patch some minutes ago.
>> >
>> > The point is I trusted script/checkpatches.sh which did not report those
>> > warnings.
>> > Am I doing anything wrong when using checkpatches.sh?
>>
>> I am also getting same warnings as below, this can be related to the
>> checkpatch.pl version.
>>
>> I have: Version: 0.32
>> (./scripts/checkpatch.pl --version)
>>
>>
> Uhmm, I got same one.
>
>

OK. It seems I suffered from temporary blindness. I thought the automatic
report was about warnings, but it is about checks. However, I got just one of
the check messages. This is the output with -v, including the OPTIONS used:

### [PATCH] nfp: report link speed using hardware info


OPTIONS: --no-tree --max-line-length=80 --show-types
--ignore=LINUX_VERSION_CODE,FILE_PATH_CHANGES,VOLATILE,PREFER_PACKED,PREFER_ALIGNED,PREFER_PRINTF,PREFER_KERNEL_TYPES,BIT_MACRO,CONST_STRUCT,SPLIT_STRING,LINE_SPACING,PARENTHESIS_ALIGNMENT,NETWORKING_BLOCK_COMMENT_STYLE,NEW_TYPEDEFS,COMPARISON_TO_NULL

CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#60: FILE: drivers/net/nfp/nfp_net.c:856:
+ else {
+

total: 0 errors, 0 warnings, 1 checks, 68 lines checked

0/1 valid patch

> >
>> >
>> > -- Forwarded message --
>> > From: 
>> > Date: Fri, Nov 18, 2016 at 3:04 PM
>> > Subject: |WARNING| [PATCH] nfp: report link speed using hardware info
>> > To: test-report at dpdk.org
>> > Cc: Alejandro Lucero 
>> >
>> >
>> > Test-Label: checkpatch
>> > Test-Status: WARNING
>> > http://dpdk.org/patch/17091
>> >
>> > _coding style issues_
>> >
>> >
>> > CHECK:MACRO_ARG_REUSE: Macro argument reuse 'arr' - possible
>> side-effects?
>> > #53: FILE: drivers/net/nfp/nfp_net.c:806:
>> > +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
>> >
>> > CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
>> > #91: FILE: drivers/net/nfp/nfp_net.c:856:
>> > +   else {
>> > +
>> >
>> > total: 0 errors, 0 warnings, 2 checks, 68 lines checked
>> >
>>
>>
>

[dpdk-dev] Fwd: |WARNING| [PATCH] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

On Fri, Nov 18, 2016 at 3:26 PM, Ferruh Yigit 
wrote:

> On 11/18/2016 3:10 PM, Alejandro Lucero wrote:
> > Hi Thomas,
> >
> > I got this email when sending a patch some minutes ago.
> >
> > The point is I trusted script/checkpatches.sh which did not report those
> > warnings.
> > Am I doing anything wrong when using checkpatches.sh?
> >
> >
> > -- Forwarded message --
> > From: 
> > Date: Fri, Nov 18, 2016 at 3:04 PM
> > Subject: |WARNING| [PATCH] nfp: report link speed using hardware info
> > To: test-report at dpdk.org
> > Cc: Alejandro Lucero 
> >
> >
> > Test-Label: checkpatch
> > Test-Status: WARNING
> > http://dpdk.org/patch/17091
> >
> > _coding style issues_
> >
> >
> > CHECK:MACRO_ARG_REUSE: Macro argument reuse 'arr' - possible
> side-effects?
> > #53: FILE: drivers/net/nfp/nfp_net.c:806:
> > +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
>
> btw, you can benefit from RTE_DIM:
>
> lib/librte_eal/common/include/rte_common.h:352:
> #define  RTE_DIM(a)  (sizeof (a) / sizeof ((a)[0]))
>
>
Thanks!

I will use it in the next patch version.
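
As a side note on the MACRO_ARG_REUSE check quoted in this message: an
element-count macro of this shape uses its argument twice, but both uses sit
inside sizeof, whose operand is not evaluated for ordinary (non
variable-length) arrays. A small sketch, using a local DIM macro and made-up
tables rather than the DPDK header:

```c
#include <stddef.h>

/* Same shape as the macro quoted above; defined locally for the sketch. */
#define DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Made-up table with six entries. */
static const int speeds[] = {1, 10, 25, 40, 50, 100};

/*
 * With designated initializers the array length is the highest index
 * plus one, so this table has eight entries even though one is named.
 */
static const unsigned int sparse[] = { [7] = 100 };

static size_t nb_speeds(void)
{
	return DIM(speeds);
}

static size_t nb_sparse(void)
{
	return DIM(sparse);
}

/*
 * Caveat: DIM() only works on real arrays. Applied to a pointer (for
 * example an array passed as a function parameter) it compiles but
 * yields sizeof(pointer) / sizeof(element), which is not the length.
 */
```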


> >
> > CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
> > #91: FILE: drivers/net/nfp/nfp_net.c:856:
> > +   else {
> > +
> >
> > total: 0 errors, 0 warnings, 2 checks, 68 lines checked
> >
>
>

[dpdk-dev] Fwd: |WARNING| [PATCH] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

On Fri, Nov 18, 2016 at 3:24 PM, Ferruh Yigit 
wrote:

> On 11/18/2016 3:10 PM, Alejandro Lucero wrote:
> > Hi Thomas,
> >
> > I got this email when sending a patch some minutes ago.
> >
> > The point is I trusted script/checkpatches.sh which did not report those
> > warnings.
> > Am I doing anything wrong when using checkpatches.sh?
>
> I am also getting same warnings as below, this can be related to the
> checkpatch.pl version.
>
> I have: Version: 0.32
> (./scripts/checkpatch.pl --version)
>
>
Hmm, I have the same one.


> >
> >
> > -- Forwarded message --
> > From: 
> > Date: Fri, Nov 18, 2016 at 3:04 PM
> > Subject: |WARNING| [PATCH] nfp: report link speed using hardware info
> > To: test-report at dpdk.org
> > Cc: Alejandro Lucero 
> >
> >
> > Test-Label: checkpatch
> > Test-Status: WARNING
> > http://dpdk.org/patch/17091
> >
> > _coding style issues_
> >
> >
> > CHECK:MACRO_ARG_REUSE: Macro argument reuse 'arr' - possible
> side-effects?
> > #53: FILE: drivers/net/nfp/nfp_net.c:806:
> > +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
> >
> > CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
> > #91: FILE: drivers/net/nfp/nfp_net.c:856:
> > +   else {
> > +
> >
> > total: 0 errors, 0 warnings, 2 checks, 68 lines checked
> >
>
>

[dpdk-dev] Fwd: |WARNING| [PATCH] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

Hi Thomas,

I got this email when sending a patch some minutes ago.

The point is that I trusted scripts/checkpatches.sh, which did not report
those warnings.
Am I doing anything wrong when using checkpatches.sh?


-- Forwarded message --
From: <checkpa...@dpdk.org>
Date: Fri, Nov 18, 2016 at 3:04 PM
Subject: |WARNING| [PATCH] nfp: report link speed using hardware info
To: test-report at dpdk.org
Cc: Alejandro Lucero 


Test-Label: checkpatch
Test-Status: WARNING
http://dpdk.org/patch/17091

_coding style issues_


CHECK:MACRO_ARG_REUSE: Macro argument reuse 'arr' - possible side-effects?
#53: FILE: drivers/net/nfp/nfp_net.c:806:
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#91: FILE: drivers/net/nfp/nfp_net.c:856:
+   else {
+

total: 0 errors, 0 warnings, 2 checks, 68 lines checked

[dpdk-dev] [PATCH] nfp: report link speed using hardware info

2016-11-18 Thread Alejandro Lucero

Previous reported speed was hardcoded.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c  | 31 +--
 drivers/net/nfp/nfp_net_ctrl.h | 13 +
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index c6b1587..d5ec0ff 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -803,6 +803,7 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
hw->ctrl = new_ctrl;
 }

+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
 /*
  * return 0 means link status changed, -1 means not changed
  *
@@ -816,6 +817,18 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
struct rte_eth_link link, old;
uint32_t nn_link_status;

+   static const uint32_t ls_to_ethtool[] = {
+   [NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_UNKNOWN] = ETH_SPEED_NUM_NONE,
+   [NFP_NET_CFG_STS_LINK_RATE_1G]  = ETH_SPEED_NUM_1G,
+   [NFP_NET_CFG_STS_LINK_RATE_10G] = ETH_SPEED_NUM_10G,
+   [NFP_NET_CFG_STS_LINK_RATE_25G] = ETH_SPEED_NUM_25G,
+   [NFP_NET_CFG_STS_LINK_RATE_40G] = ETH_SPEED_NUM_40G,
+   [NFP_NET_CFG_STS_LINK_RATE_50G] = ETH_SPEED_NUM_50G,
+   [NFP_NET_CFG_STS_LINK_RATE_100G]= ETH_SPEED_NUM_100G,
+   };
+
+
PMD_DRV_LOG(DEBUG, "Link update\n");

hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -831,8 +844,22 @@ static void nfp_net_read_mac(struct nfp_net_hw *hw)
link.link_status = ETH_LINK_UP;

link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   /* Other cards can limit the tx and rx rate per VF */
-   link.link_speed = ETH_SPEED_NUM_40G;
+
+   nn_link_status = (nn_link_status >> NFP_NET_CFG_STS_LINK_RATE_SHIFT) &
+NFP_NET_CFG_STS_LINK_RATE_MASK;
+
+   if ((NFD_CFG_MAJOR_VERSION_of(hw->ver) < 4) ||
+   ((NFD_CFG_MAJOR_VERSION_of(hw->ver) == 4) &&
+   (NFD_CFG_MINOR_VERSION_of(hw->ver) == 0)))
+   link.link_speed = ETH_SPEED_NUM_40G;
+   else {
+
+   if (nn_link_status == NFP_NET_CFG_STS_LINK_RATE_UNKNOWN ||
+   nn_link_status >= ARRAY_SIZE(ls_to_ethtool))
+   link.link_speed = ETH_SPEED_NUM_NONE;
+   else
+   link.link_speed = ls_to_ethtool[nn_link_status];
+   }

if (old.link_status != link.link_status) {
nfp_net_dev_atomic_write_link_status(dev, &link);
diff --git a/drivers/net/nfp/nfp_net_ctrl.h b/drivers/net/nfp/nfp_net_ctrl.h
index fce8251..f9aaba3 100644
--- a/drivers/net/nfp/nfp_net_ctrl.h
+++ b/drivers/net/nfp/nfp_net_ctrl.h
@@ -157,6 +157,19 @@
 #define   NFP_NET_CFG_VERSION_MINOR(x)    (((x) & 0xff) <<  0)
 #define NFP_NET_CFG_STS 0x0034
 #define   NFP_NET_CFG_STS_LINK            (0x1 << 0) /* Link up or down */
+/* Link rate */
+#define   NFP_NET_CFG_STS_LINK_RATE_SHIFT 1
+#define   NFP_NET_CFG_STS_LINK_RATE_MASK  0xF
+#define   NFP_NET_CFG_STS_LINK_RATE   \
+ (NFP_NET_CFG_STS_LINK_RATE_MASK << NFP_NET_CFG_STS_LINK_RATE_SHIFT)
+#define   NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED   0
+#define   NFP_NET_CFG_STS_LINK_RATE_UNKNOWN   1
+#define   NFP_NET_CFG_STS_LINK_RATE_1G    2
+#define   NFP_NET_CFG_STS_LINK_RATE_10G   3
+#define   NFP_NET_CFG_STS_LINK_RATE_25G   4
+#define   NFP_NET_CFG_STS_LINK_RATE_40G   5
+#define   NFP_NET_CFG_STS_LINK_RATE_50G   6
+#define   NFP_NET_CFG_STS_LINK_RATE_100G  7
 #define NFP_NET_CFG_CAP 0x0038
 #define NFP_NET_CFG_MAX_TXRINGS 0x003c
 #define NFP_NET_CFG_MAX_RXRINGS 0x0040
-- 
1.9.1

[dpdk-dev] [PATCH] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-11 Thread Alejandro Lucero

On Fri, Nov 11, 2016 at 9:29 AM, Thomas Monjalon 
wrote:

> 2016-11-11 09:16, Alejandro Lucero:
> > Thomas,
> >
> > We are wondering if you realize this patch fixes a bug with current
> ethdev
> > code as a device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS.
> >
> > Maybe the commit message is giving the wrong impression and as you
> > commented, it should just focus on the bug it fixes and to leave for
> > another email thread the discussion of how to solve the
> > RTE_ETHDEV_QUEUE_STAT_CNTRS
> > problem.
> >
> > Should we remove this from patchwork and to send another patch that way?
>
> Yes please. It was my first comment, we don't understand the exact issue
> you are fixing.
>

OK


> And I have a bad feeling it could break something else (really just a
> feeling).
> It is not the kind of patch we can apply the last day of a release.
> That's why I think it should wait 17.02.
>
>
Fine.


> Of course you can try to convince me and others to apply it as a last
> minute
> patch. But why are you sending a patch on the generic API in the last days?
>
>
We just found it a couple of days ago.


> Last argument: it is not fixing a regression of 16.11, so it is not so
> urgent.
>

[dpdk-dev] [PATCH] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-11 Thread Alejandro Lucero

Thomas,

We are wondering whether you realize that this patch fixes a bug in the
current ethdev code, since a device can have more than
RTE_ETHDEV_QUEUE_STAT_CNTRS queues.

Maybe the commit message is giving the wrong impression and, as you
commented, it should focus only on the bug it fixes, leaving the discussion
of how to solve the RTE_ETHDEV_QUEUE_STAT_CNTRS problem for another email
thread.

Should we remove this from patchwork and send another patch along those lines?


On Thu, Nov 10, 2016 at 4:04 PM, Alejandro Lucero <
alejandro.lucero at netronome.com> wrote:

>
>
> On Thu, Nov 10, 2016 at 4:01 PM, Thomas Monjalon <
> thomas.monjalon at 6wind.com> wrote:
>
>> 2016-11-10 15:43, Alejandro Lucero:
>> > On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <
>> thomas.monjalon at 6wind.com>
>> > wrote:
>> >
>> > > 2016-11-10 14:00, Alejandro Lucero:
>> > > > From: Bert van Leeuwen 
>> > > >
>> > > > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
>> > > > is used inside struct rte_eth_stats. Ideally, DPDK should be built
>> with
>> > > > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
>> > > > can support, 65536, as uint16_t is used for keeping those values for
>> > > > RX and TX. But of course, having such big arrays inside struct
>> > > rte_eth_stats
>> > > > is not a good idea.
>> > >
>> > > RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
>> > > They have limited number of registers to store the stats per queue.
>> > >
>> > > > Current default value is 16, which could likely be changed to 32 or
>> 64
>> > > > without too much opposition. And maybe it would be a good idea to
>> modify
>> > > > struct rte_eth_stats for allowing dynamically allocated arrays and
>> maybe
>> > > > some extra fields for keeping the array sizes.
>> > >
>> > > Yes
>> > > and? what is your issue exactly? with which device?
>> > > Please explain the idea brought by your patch.
>> > >
>> >
>> > Netronome NFP devices support 128 queues and future version will support
>> > 1024.
>> >
>> > A particular VF, our PMD just supports VFs, could get as much as 128.
>> > Although that is not likely, that could be an option for some client.
>> >
>> > Clients want to use a DPDK coming with a distribution, so changing the
>> > RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
>> > option.
>> >
>> > We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
>> > covering current and future requirements for our cards, but maybe having
>> > such big arrays inside struct rte_eth_stats is something people do not
>> want
>> > to have.
>> >
>> > A solution could be to create such arrays dynamically based on the
>> device
>> > to get the stats from. For example, call to rte_eth_dev_configure could
>> > have ax extra field for allocating a rte_eth_stats struct, which will be
>> > based on nb_rx_q and nb_tx_q params already given to that function.
>> >
>> > Maybe the first thing to know is what people think about just
>> incrementing
>> > RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.
>> >
>> > So Thomas, what do you think about this?
>>
>> I think this patch is doing something else :)
>>
>>
> Sure. But the problem the patch solves is pointing to this, IMHO, bigger
> issue.
>
>
>> I'm not sure what is better between big arrays and variable size.
>> I think you must explain these 2 options in another thread,
>> because I'm not sure you will have enough attention in a thread starting
>> with
>> "check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS".
>>
>
> Agree. I'll do that then.
>
> Thanks
>

[dpdk-dev] [PATCH] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-10 Thread Alejandro Lucero

On Thu, Nov 10, 2016 at 4:01 PM, Thomas Monjalon 
wrote:

> 2016-11-10 15:43, Alejandro Lucero:
> > On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon <
> thomas.monjalon at 6wind.com>
> > wrote:
> >
> > > 2016-11-10 14:00, Alejandro Lucero:
> > > > From: Bert van Leeuwen 
> > > >
> > > > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> > > > is used inside struct rte_eth_stats. Ideally, DPDK should be built
> with
> > > > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> > > > can support, 65536, as uint16_t is used for keeping those values for
> > > > RX and TX. But of course, having such big arrays inside struct
> > > rte_eth_stats
> > > > is not a good idea.
> > >
> > > RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
> > > They have limited number of registers to store the stats per queue.
> > >
> > > > Current default value is 16, which could likely be changed to 32 or
> 64
> > > > without too much opposition. And maybe it would be a good idea to
> modify
> > > > struct rte_eth_stats for allowing dynamically allocated arrays and
> maybe
> > > > some extra fields for keeping the array sizes.
> > >
> > > Yes
> > > and? what is your issue exactly? with which device?
> > > Please explain the idea brought by your patch.
> > >
> >
> > Netronome NFP devices support 128 queues and future version will support
> > 1024.
> >
> > A particular VF, our PMD just supports VFs, could get as much as 128.
> > Although that is not likely, that could be an option for some client.
> >
> > Clients want to use a DPDK coming with a distribution, so changing the
> > RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
> > option.
> >
> > We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
> > covering current and future requirements for our cards, but maybe having
> > such big arrays inside struct rte_eth_stats is something people do not
> want
> > to have.
> >
> > A solution could be to create such arrays dynamically based on the device
> > to get the stats from. For example, call to rte_eth_dev_configure could
> > have an extra field for allocating a rte_eth_stats struct, which will be
> > based on nb_rx_q and nb_tx_q params already given to that function.
> >
> > Maybe the first thing to know is what people think about just
> incrementing
> > RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.
> >
> > So Thomas, what do you think about this?
>
> I think this patch is doing something else :)
>
>
Sure. But the problem the patch solves is pointing to this, IMHO, bigger
issue.


> I'm not sure what is better between big arrays and variable size.
> I think you must explain these 2 options in another thread,
> because I'm not sure you will have enough attention in a thread starting
> with
> "check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS".
>

Agree. I'll do that then.

Thanks

[dpdk-dev] [PATCH] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-10 Thread Alejandro Lucero

On Thu, Nov 10, 2016 at 2:42 PM, Thomas Monjalon 
wrote:

> 2016-11-10 14:00, Alejandro Lucero:
> > From: Bert van Leeuwen 
> >
> > A device can have more than RTE_ETHDEV_QUEUE_STAT_CNTRS queues which
> > is used inside struct rte_eth_stats. Ideally, DPDK should be built with
> > RTE_ETHDEV_QUEUE_STAT_CNTRS to the maximum number of queues a device
> > can support, 65536, as uint16_t is used for keeping those values for
> > RX and TX. But of course, having such big arrays inside struct
> rte_eth_stats
> > is not a good idea.
>
> RTE_ETHDEV_QUEUE_STAT_CNTRS come from a limitation in Intel devices.
> They have limited number of registers to store the stats per queue.
>
> > Current default value is 16, which could likely be changed to 32 or 64
> > without too much opposition. And maybe it would be a good idea to modify
> > struct rte_eth_stats for allowing dynamically allocated arrays and maybe
> > some extra fields for keeping the array sizes.
>
> Yes
> and? what is your issue exactly? with which device?
> Please explain the idea brought by your patch.
>

Netronome NFP devices support 128 queues, and future versions will support
1024.

A particular VF (our PMD only supports VFs) could get as many as 128.
Although that is not likely, that could be an option for some client.

Clients want to use a DPDK coming with a distribution, so changing the
RTE_ETHDEV_QUEUE_STAT_CNTRS depending on the present devices is not an
option.

We would be happy if RTE_ETHDEV_QUEUE_STAT_CNTRS could be set to 1024,
covering current and future requirements for our cards, but maybe having
such big arrays inside struct rte_eth_stats is something people do not want
to have.

A solution could be to create such arrays dynamically based on the device
to get the stats from. For example, call to rte_eth_dev_configure could
have an extra field for allocating a rte_eth_stats struct, which will be
based on nb_rx_q and nb_tx_q params already given to that function.

Maybe the first thing to know is what people think about just incrementing
RTE_ETHDEV_QUEUE_STAT_CNTRS to 1024.

So Thomas, what do you think about this?
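
The dynamic-array option described above can be pictured with a minimal
sketch. It assumes a hypothetical container (struct dyn_queue_stats and its
helpers are illustrative names, not part of the ethdev API) whose per-queue
counters are sized from the queue counts given at configure time rather than
from a build-time constant:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stats container: per-queue arrays sized at configure time. */
struct dyn_queue_stats {
	uint16_t nb_rx_q;      /* number of entries in q_ipackets */
	uint16_t nb_tx_q;      /* number of entries in q_opackets */
	uint64_t *q_ipackets;  /* one RX packet counter per RX queue */
	uint64_t *q_opackets;  /* one TX packet counter per TX queue */
};

/* Allocate zeroed counters for the given queue counts; NULL on failure. */
static struct dyn_queue_stats *
dyn_queue_stats_alloc(uint16_t nb_rx_q, uint16_t nb_tx_q)
{
	struct dyn_queue_stats *st = calloc(1, sizeof(*st));

	if (st == NULL)
		return NULL;
	/* calloc(0, ...) may return NULL, so always request at least one entry. */
	st->q_ipackets = calloc(nb_rx_q ? nb_rx_q : 1, sizeof(uint64_t));
	st->q_opackets = calloc(nb_tx_q ? nb_tx_q : 1, sizeof(uint64_t));
	if (st->q_ipackets == NULL || st->q_opackets == NULL) {
		free(st->q_ipackets);
		free(st->q_opackets);
		free(st);
		return NULL;
	}
	st->nb_rx_q = nb_rx_q;
	st->nb_tx_q = nb_tx_q;
	return st;
}

/* Release the container and both counter arrays. */
static void
dyn_queue_stats_free(struct dyn_queue_stats *st)
{
	if (st == NULL)
		return;
	free(st->q_ipackets);
	free(st->q_opackets);
	free(st);
}
```

With this shape the memory cost follows the device actually configured: a
16-queue port pays for 16 counters per direction, a 1024-queue one for 1024.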

[dpdk-dev] [PATCH] ethdev: check number of queues less than RTE_ETHDEV_QUEUE_STAT_CNTRS

2016-11-10 Thread Alejandro Lucero

From: Bert van Leeuwen <bert.vanleeu...@netronome.com>

A device can have more queues than RTE_ETHDEV_QUEUE_STAT_CNTRS, which
sizes the per-queue arrays inside struct rte_eth_stats. Ideally, DPDK
should be built with RTE_ETHDEV_QUEUE_STAT_CNTRS set to the maximum number
of queues a device can support, 65536, as uint16_t is used for keeping
those values for RX and TX. But of course, having such big arrays inside
struct rte_eth_stats is not a good idea.

Current default value is 16, which could likely be changed to 32 or 64
without too much opposition. And maybe it would be a good idea to modify
struct rte_eth_stats for allowing dynamically allocated arrays and maybe
some extra fields for keeping the array sizes.

Signed-off-by: Alejandro Lucero 
---
 lib/librte_ether/rte_ethdev.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fde8112..4209ad0 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1343,8 +1343,10 @@ get_xstats_count(uint8_t port_id)
} else
count = 0;
count += RTE_NB_STATS;
-   count += dev->data->nb_rx_queues * RTE_NB_RXQ_STATS;
-   count += dev->data->nb_tx_queues * RTE_NB_TXQ_STATS;
+   count += RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+RTE_NB_RXQ_STATS;
+   count += RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
+RTE_NB_TXQ_STATS;
return count;
 }

@@ -1358,6 +1360,7 @@ rte_eth_xstats_get_names(uint8_t port_id,
int cnt_expected_entries;
int cnt_driver_entries;
uint32_t idx, id_queue;
+   uint16_t num_q;

cnt_expected_entries = get_xstats_count(port_id);
if (xstats_names == NULL || cnt_expected_entries < 0 ||
@@ -1374,7 +1377,8 @@ rte_eth_xstats_get_names(uint8_t port_id,
"%s", rte_stats_strings[idx].name);
cnt_used_entries++;
}
-   for (id_queue = 0; id_queue < dev->data->nb_rx_queues; id_queue++) {
+   num_q = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   for (id_queue = 0; id_queue < num_q; id_queue++) {
for (idx = 0; idx < RTE_NB_RXQ_STATS; idx++) {
snprintf(xstats_names[cnt_used_entries].name,
sizeof(xstats_names[0].name),
@@ -1384,7 +1388,8 @@ rte_eth_xstats_get_names(uint8_t port_id,
}

}
-   for (id_queue = 0; id_queue < dev->data->nb_tx_queues; id_queue++) {
+   num_q = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   for (id_queue = 0; id_queue < num_q; id_queue++) {
for (idx = 0; idx < RTE_NB_TXQ_STATS; idx++) {
snprintf(xstats_names[cnt_used_entries].name,
sizeof(xstats_names[0].name),
@@ -1420,14 +1425,18 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
unsigned count = 0, i, q;
signed xcount = 0;
uint64_t val, *stats_ptr;
+   uint16_t nb_rxqs, nb_txqs;

RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];

+   nb_rxqs = RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+   nb_txqs = RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS);
+
/* Return generic statistics */
-   count = RTE_NB_STATS + (dev->data->nb_rx_queues * RTE_NB_RXQ_STATS) +
-   (dev->data->nb_tx_queues * RTE_NB_TXQ_STATS);
+   count = RTE_NB_STATS + (nb_rxqs * RTE_NB_RXQ_STATS) +
+   (nb_txqs * RTE_NB_TXQ_STATS);

/* implemented by the driver */
if (dev->dev_ops->xstats_get != NULL) {
@@ -1458,7 +1467,7 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
}

/* per-rxq stats */
-   for (q = 0; q < dev->data->nb_rx_queues; q++) {
+   for (q = 0; q < nb_rxqs; q++) {
for (i = 0; i < RTE_NB_RXQ_STATS; i++) {
stats_ptr = RTE_PTR_ADD(&eth_stats,
rte_rxq_stats_strings[i].offset +
@@ -1469,7 +1478,7 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstat *xstats,
}

/* per-txq stats */
-   for (q = 0; q < dev->data->nb_tx_queues; q++) {
+   for (q = 0; q < nb_txqs; q++) {
for (i = 0; i < RTE_NB_TXQ_STATS; i++) {
stats_ptr = RTE_PTR_ADD(&eth_stats,
rte_txq_stats_strings[i].offset +
-- 
1.9.1
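
The effect of the clamping in this patch can be shown with a small
standalone sketch. The constants below are illustrative stand-ins for the
ethdev macros (their values are assumptions, not taken from the headers),
and basic_xstats_count is a hypothetical helper, not an ethdev function:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for RTE_ETHDEV_QUEUE_STAT_CNTRS, RTE_NB_STATS,
 * RTE_NB_RXQ_STATS and RTE_NB_TXQ_STATS; the values are assumptions. */
#define QUEUE_STAT_CNTRS 16
#define NB_STATS          8
#define NB_RXQ_STATS      3
#define NB_TXQ_STATS      2

#define MIN_OF(a, b) ((a) < (b) ? (a) : (b))

/* Number of basic xstats entries once the per-queue part is clamped to the
 * size of the per-queue arrays. */
static unsigned int
basic_xstats_count(uint16_t nb_rx_queues, uint16_t nb_tx_queues)
{
	uint16_t nb_rxqs = MIN_OF(nb_rx_queues, QUEUE_STAT_CNTRS);
	uint16_t nb_txqs = MIN_OF(nb_tx_queues, QUEUE_STAT_CNTRS);

	return NB_STATS + nb_rxqs * NB_RXQ_STATS + nb_txqs * NB_TXQ_STATS;
}
```

A port with 128 queues per direction then reports the same number of basic
entries as one with 16, and the loops never index past the fixed-size arrays.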

[dpdk-dev] Fwd: mbuf changes

2016-11-10 Thread Alejandro Lucero

I forgot to include dev at dpdk.org in my response.

My comment at the end of this email.

On Wed, Oct 26, 2016 at 10:28 AM, Alejandro Lucero <
alejandro.lucero at netronome.com> wrote:

>
>
> On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <
> bruce.richardson at intel.com> wrote:
>
>> On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
>> > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
>> > > On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
>> > > >
>> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <
>> mb at smartsharesystems.com> wrote:
>> > > > >
>> > > > > First of all: Thanks for a great DPDK Userspace 2016!
>> > > > >
>> > > > >
>> > > > >
>> > > > > Continuing the Userspace discussion about Olivier Matz's proposed
>> mbuf changes...
>> > >
>> > > Thanks for keeping the discussion going!
>> > > > >
>> > > > >
>> > > > >
>> > > > > 1.
>> > > > >
>> > > > > Stephen Hemminger had a noteworthy general comment about keeping
>> metadata for the NIC in the appropriate section of the mbuf: Metadata
>> generated by the NIC's RX handler belongs in the first cache line, and
>> metadata required by the NIC's TX handler belongs in the second cache line.
>> This also means that touching the second cache line on ingress should be
>> avoided if possible; and Bruce Richardson mentioned that for this reason
>> m->next was zeroed on free().
>> > > > >
>> > > Thinking about it, I suspect there are more fields we can reset on
>> free
>> > > to save time on alloc. Refcnt, as discussed below is one of them, but
>> so
>> > > too could be the nb_segs field and possibly others.
>> > >
>> > > > >
>> > > > >
>> > > > > 2.
>> > > > >
>> > > > > There seemed to be consensus that the size of m->refcnt should
>> match the size of m->port because a packet could be duplicated on all
>> physical ports for L3 multicast and L2 flooding.
>> > > > >
>> > > > > Furthermore, although a single physical machine (i.e. a single
>> server) with 255 physical ports probably doesn't exist, it might contain
>> more than 255 virtual machines with a virtual port each, so it makes sense
>> extending these mbuf fields from 8 to 16 bits.
>> > > >
>> > > > I thought we also talked about removing the m->port from the mbuf
>> as it is not really needed.
>> > > >
>> > > Yes, this was mentioned, and also the option of moving the port value
>> to
>> > > the second cacheline, but it appears that NXP are using the port value
>> > > in their NIC drivers for passing in metadata, so we'd need their
>> > > agreement on any move (or removal).
>> >
>> > I am not sure where NXP's NIC came into picture on this, but now that
>> it is
>> > highlighted, this field is required for libevent implementation [1].
>> >
>> > A scheduler sending an event, which can be a packet, would only have
>> > information of a flow_id. From this matching it back to a port, without
>> > mbuf->port, would be very difficult (costly). There may be way around
>> this
>> > but at least in current proposal I think port would be important to
>> have -
>> > even if in second cache line.
>> >
>> > But, off the top of my head, as of now it is not being used for any
>> specific
>> > purpose in NXP's PMD implementation.
>> >
>> > Even the SoC patches don't necessarily rely on it except using it
>> because it
>> > is available.
>> >
>> > @Bruce: where did you get the NXP context here from?
>> >
>> Oh, I'm just mis-remembering. :-( It was someone else who was looking for
>> this - Netronome, perhaps?
>>
>> CC'ing Alejandro in the hope I'm remembering correctly second time
>> round!
>>
>>
> Yes. Thanks Bruce!
>
> So Netronome uses the port field and, as I commented on the user meeting,
> we are happy with the field going from 8 to 16 bits.
>
> In our case, this is something some clients have demanded, and if I'm not
> wrong (I'll double check this asap), the port value is for knowing where
> the packet is coming from. Think about a switch in the NIC, with ports
> linked to VFs/VMs, and one or more physical ports. That port value is not
> related to DPDK ports but to the switch ports. Code in the host (DPDK or
> not) can receive packets from the wire or from VFs through the NIC. This is
> also true for packets received by VMs, but I guess the port value is just
> of interest to host code.
>
>
>

I asked about this functionality internally and it seems we do not need this
anymore. In fact, I will soon remove the metadata port handling from our
PMD.



> /Bruce
>>
>
>
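
The argument for widening m->refcnt together with m->port, quoted in point 2
above, can be checked with a small sketch. It assumes one reference is taken
per output port when a packet is flooded; the helper names are illustrative
and nothing here is mbuf library code:

```c
#include <assert.h>
#include <stdint.h>

/* Take one reference per output port on an 8-bit counter.
 * The counter wraps modulo 256 once more than 255 references exist. */
static uint8_t
refs_after_flood_u8(unsigned int nb_ports)
{
	uint8_t refcnt = 0;
	unsigned int i;

	for (i = 0; i < nb_ports; i++)
		refcnt++;
	return refcnt;
}

/* Same flood on a 16-bit counter, which holds up to 65535 references. */
static uint16_t
refs_after_flood_u16(unsigned int nb_ports)
{
	uint16_t refcnt = 0;
	unsigned int i;

	for (i = 0; i < nb_ports; i++)
		refcnt++;
	return refcnt;
}
```

With 300 virtual ports the 8-bit counter ends at 44 instead of 300, so the
buffer would be released while references are still outstanding.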

[dpdk-dev] mbuf changes

2016-11-09 Thread Alejandro Lucero

On Wed, Oct 26, 2016 at 10:28 AM, Alejandro Lucero <
alejandro.lucero at netronome.com> wrote:

>
>
> On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <
> bruce.richardson at intel.com> wrote:
>
>> On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
>> > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
>> > > On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
>> > > >
>> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <
>> mb at smartsharesystems.com> wrote:
>> > > > >
>> > > > > First of all: Thanks for a great DPDK Userspace 2016!
>> > > > >
>> > > > >
>> > > > >
>> > > > > Continuing the Userspace discussion about Olivier Matz's proposed
>> mbuf changes...
>> > >
>> > > Thanks for keeping the discussion going!
>> > > > >
>> > > > >
>> > > > >
>> > > > > 1.
>> > > > >
>> > > > > Stephen Hemminger had a noteworthy general comment about keeping
>> metadata for the NIC in the appropriate section of the mbuf: Metadata
>> generated by the NIC's RX handler belongs in the first cache line, and
>> metadata required by the NIC's TX handler belongs in the second cache line.
>> This also means that touching the second cache line on ingress should be
>> avoided if possible; and Bruce Richardson mentioned that for this reason
>> m->next was zeroed on free().
>> > > > >
>> > > Thinking about it, I suspect there are more fields we can reset on
>> free
>> > > to save time on alloc. Refcnt, as discussed below is one of them, but
>> so
>> > > too could be the nb_segs field and possibly others.
>> > >
>> > > > >
>> > > > >
>> > > > > 2.
>> > > > >
>> > > > > There seemed to be consensus that the size of m->refcnt should
>> match the size of m->port because a packet could be duplicated on all
>> physical ports for L3 multicast and L2 flooding.
>> > > > >
>> > > > > Furthermore, although a single physical machine (i.e. a single
>> server) with 255 physical ports probably doesn't exist, it might contain
>> more than 255 virtual machines with a virtual port each, so it makes sense
>> extending these mbuf fields from 8 to 16 bits.
>> > > >
>> > > > I thought we also talked about removing the m->port from the mbuf
>> as it is not really needed.
>> > > >
>> > > Yes, this was mentioned, and also the option of moving the port value
>> to
>> > > the second cacheline, but it appears that NXP are using the port value
>> > > in their NIC drivers for passing in metadata, so we'd need their
>> > > agreement on any move (or removal).
>> >
>> > I am not sure where NXP's NIC came into picture on this, but now that
>> it is
>> > highlighted, this field is required for libevent implementation [1].
>> >
>> > A scheduler sending an event, which can be a packet, would only have
>> > information of a flow_id. From this matching it back to a port, without
>> > mbuf->port, would be very difficult (costly). There may be way around
>> this
>> > but at least in current proposal I think port would be important to
>> have -
>> > even if in second cache line.
>> >
>> > But, off the top of my head, as of now it is not being used for any
>> specific
>> > purpose in NXP's PMD implementation.
>> >
>> > Even the SoC patches don't necessarily rely on it except using it
>> because it
>> > is available.
>> >
>> > @Bruce: where did you get the NXP context here from?
>> >
>> Oh, I'm just mis-remembering. :-( It was someone else who was looking for
>> this - Netronome, perhaps?
>>
>> CC'ing Alejandro in the hope I'm remembering correctly second time
>> round!
>>
>>
> Yes. Thanks Bruce!
>
> So Netronome uses the port field and, as I commented on the user meeting,
> we are happy with the field going from 8 to 16 bits.
>
> In our case, this is something some clients have demanded, and if I'm not
> wrong (I'll double check this asap), the port value is for knowing where
> the packet is coming from. Think about a switch in the NIC, with ports
> linked to VFs/VMs, and one or more physical ports. That port value is not
> related to DPDK ports but to the switch ports. Code in the host (DPDK or
> not) can receive packets from the wire or from VFs through the NIC. This is
> also true for packets received by VMs, but I guess the port value is just
> of interest to host code.
>
>
>

I asked about this functionality internally and it seems we do not need this
anymore. In fact, I will soon remove the metadata port handling from our
PMD.



> /Bruce
>>
>
>

[dpdk-dev] mbuf changes

2016-10-26 Thread Alejandro Lucero

On Tue, Oct 25, 2016 at 2:05 PM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Oct 25, 2016 at 05:24:28PM +0530, Shreyansh Jain wrote:
> > On Monday 24 October 2016 09:55 PM, Bruce Richardson wrote:
> > > On Mon, Oct 24, 2016 at 04:11:33PM +, Wiles, Keith wrote:
> > > >
> > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup <
> mb at smartsharesystems.com> wrote:
> > > > >
> > > > > First of all: Thanks for a great DPDK Userspace 2016!
> > > > >
> > > > >
> > > > >
> > > > > Continuing the Userspace discussion about Olivier Matz's proposed
> mbuf changes...
> > >
> > > Thanks for keeping the discussion going!
> > > > >
> > > > >
> > > > >
> > > > > 1.
> > > > >
> > > > > Stephen Hemminger had a noteworthy general comment about keeping
> metadata for the NIC in the appropriate section of the mbuf: Metadata
> generated by the NIC's RX handler belongs in the first cache line, and
> metadata required by the NIC's TX handler belongs in the second cache line.
> This also means that touching the second cache line on ingress should be
> avoided if possible; and Bruce Richardson mentioned that for this reason
> m->next was zeroed on free().
> > > > >
> > > Thinking about it, I suspect there are more fields we can reset on free
> > > to save time on alloc. Refcnt, as discussed below is one of them, but
> so
> > > too could be the nb_segs field and possibly others.
> > >
> > > > >
> > > > >
> > > > > 2.
> > > > >
> > > > > There seemed to be consensus that the size of m->refcnt should
> match the size of m->port because a packet could be duplicated on all
> physical ports for L3 multicast and L2 flooding.
> > > > >
> > > > > Furthermore, although a single physical machine (i.e. a single
> server) with 255 physical ports probably doesn't exist, it might contain
> more than 255 virtual machines with a virtual port each, so it makes sense
> extending these mbuf fields from 8 to 16 bits.
> > > >
> > > > I thought we also talked about removing the m->port from the mbuf as
> it is not really needed.
> > > >
> > > Yes, this was mentioned, and also the option of moving the port value
> to
> > > the second cacheline, but it appears that NXP are using the port value
> > > in their NIC drivers for passing in metadata, so we'd need their
> > > agreement on any move (or removal).
> >
> > I am not sure where NXP's NIC came into picture on this, but now that it
> is
> > highlighted, this field is required for libevent implementation [1].
> >
> > A scheduler sending an event, which can be a packet, would only have
> > information of a flow_id. From this matching it back to a port, without
> > mbuf->port, would be very difficult (costly). There may be way around
> this
> > but at least in current proposal I think port would be important to have
> -
> > even if in second cache line.
> >
> > But, off the top of my head, as of now it is not being used for any
> specific
> > purpose in NXP's PMD implementation.
> >
> > Even the SoC patches don't necessarily rely on it except using it
> because it
> > is available.
> >
> > @Bruce: where did you get the NXP context here from?
> >
> Oh, I'm just mis-remembering. :-( It was someone else who was looking for
> this - Netronome, perhaps?
>
> CC'ing Alejandro in the hope I'm remembering correctly second time
> round!
>
>
Yes. Thanks Bruce!

So Netronome uses the port field and, as I commented on the user meeting,
we are happy with the field going from 8 to 16 bits.

In our case, this is something some clients have demanded, and if I'm not
wrong (I'll double check this asap), the port value is for knowing where
the packet is coming from. Think about a switch in the NIC, with ports
linked to VFs/VMs, and one or more physical ports. That port value is not
related to DPDK ports but to the switch ports. Code in the host (DPDK or
not) can receive packets from the wire or from VFs through the NIC. This is
also true for packets received by VMs, but I guess the port value is just
of interest to host code.



> /Bruce
>

[dpdk-dev] [PATCH v2] nfp: unregister interrupt callback when closing

2016-09-16 Thread Alejandro Lucero

With an app using the hotplug feature, unplugging a device without
unregistering the callback makes the interrupt handling unstable.

Fixes: 6c53f87b3497 ("nfp: add link status interrupt")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index d79f0a1..f78eb82 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -733,6 +733,11 @@ nfp_net_close(struct rte_eth_dev *dev)
rte_intr_disable(&dev->pci_dev->intr_handle);
nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);

+   /* unregister callback func from eal lib */
+   rte_intr_callback_unregister(&dev->pci_dev->intr_handle,
+nfp_net_dev_interrupt_handler,
+(void *)dev);
+
/*
 * The ixgbe PMD driver disables the pcie master on the
 * device. The i40e does not...
-- 
1.9.1
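
Why a callback left registered is a problem can be seen in a toy model. This
is only a sketch with a hypothetical fixed-size table (cb_register,
cb_unregister and cb_dispatch are illustrative names, not the EAL interrupt
API): the dispatcher keeps calling whatever is still in the table, so an
entry whose argument points at a closed device must be removed first.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CBS 4

typedef void (*intr_cb_t)(void *arg);

struct cb_entry {
	intr_cb_t fn;
	void *arg;
};

static struct cb_entry cb_table[MAX_CBS];

/* Store (fn, arg) in the first free slot; 0 on success, -1 if full. */
static int
cb_register(intr_cb_t fn, void *arg)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < MAX_CBS; i++) {
		if (cb_table[i].fn == NULL) {
			cb_table[i].fn = fn;
			cb_table[i].arg = arg;
			return 0;
		}
	}
	return -1;
}

/* Remove every entry matching (fn, arg); returns how many were removed. */
static int
cb_unregister(intr_cb_t fn, void *arg)
{
	int removed = 0;
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < MAX_CBS; i++) {
		if (cb_table[i].fn == fn && cb_table[i].arg == arg) {
			cb_table[i].fn = NULL;
			cb_table[i].arg = NULL;
			removed++;
		}
	}
	return removed;
}

/* Simulate an interrupt: call every registered callback. */
static int
cb_dispatch(void)
{
	int called = 0;
	size_t i;

	for (i = 0; i < MAX_CBS; i++) {
		if (cb_table[i].fn != NULL) {
			cb_table[i].fn(cb_table[i].arg);
			called++;
		}
	}
	return called;
}

/* Example handler: arg points at a hit counter owned by the "device". */
static void
count_handler(void *arg)
{
	int *hits = arg;

	(*hits)++;
}
```

Once the entry is unregistered, later dispatches no longer touch the
device's state, which is what the close path needs before the device
structure goes away.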

[dpdk-dev] [PATCH v2] nfp: fixing bug when copying MAC address

2016-09-16 Thread Alejandro Lucero

Fixes: defb9a5dd156 ("nfp: introduce driver initialization")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 1948a12..d79f0a1 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2421,8 +2421,8 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
eth_random_addr(&hw->mac_addr[0]);

/* Copying mac address to DPDK eth_dev struct */
-   ether_addr_copy(&eth_dev->data->mac_addrs[0],
-   (struct ether_addr *)hw->mac_addr);
+   ether_addr_copy((struct ether_addr *)hw->mac_addr,
+   &eth_dev->data->mac_addrs[0]);

PMD_INIT_LOG(INFO, "port %d VendorID=0x%x DeviceID=0x%x "
 "mac=%02x:%02x:%02x:%02x:%02x:%02x",
-- 
1.9.1
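
The bug fixed here is an argument-order one: the copy helper takes the
source first and the destination second, and the old call had them swapped,
so the freshly chosen address was overwritten instead of being published.
A minimal sketch of the intended direction, using a hypothetical mac_copy
with the same (from, to) ordering:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct mac_addr {
	uint8_t bytes[6];
};

/* Source first, destination second. Marking the source const lets the
 * compiler complain if a read-only address is passed as the destination. */
static void
mac_copy(const struct mac_addr *from, struct mac_addr *to)
{
	assert(from != NULL && to != NULL);
	memcpy(to, from, sizeof(*to));
}
```

Calling mac_copy(&hw_mac, &dev_mac) leaves hw_mac untouched and makes
dev_mac equal to it, which is the direction the fixed call uses.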

[dpdk-dev] [PATCH v2] nfp: using random MAC address if not configured

2016-09-16 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 82e3e4e..1948a12 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -607,18 +607,8 @@ nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
 static void
 nfp_net_params_setup(struct nfp_net_hw *hw)
 {
-   uint32_t *mac_address;
-
nn_cfg_writel(hw, NFP_NET_CFG_MTU, hw->mtu);
nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, hw->flbufsz);
-
-   /* A MAC address is 8 bytes long */
-   mac_address = (uint32_t *)(hw->mac_addr);
-
-   nn_cfg_writel(hw, NFP_NET_CFG_MACADDR,
- rte_cpu_to_be_32(*mac_address));
-   nn_cfg_writel(hw, NFP_NET_CFG_MACADDR + 4,
- rte_cpu_to_be_32(*(mac_address + 4)));
 }

 static void
@@ -627,6 +617,17 @@ nfp_net_cfg_queue_setup(struct nfp_net_hw *hw)
hw->qcp_cfg = hw->tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
 }

+static void nfp_net_read_mac(struct nfp_net_hw *hw)
+{
+   uint32_t tmp;
+
+   tmp = rte_be_to_cpu_32(nn_cfg_readl(hw, NFP_NET_CFG_MACADDR));
+   memcpy(&hw->mac_addr[0], &tmp, sizeof(struct ether_addr));
+
+   tmp = rte_be_to_cpu_32(nn_cfg_readl(hw, NFP_NET_CFG_MACADDR + 4));
+   memcpy(&hw->mac_addr[4], &tmp, 2);
+}
+
 static int
 nfp_net_start(struct rte_eth_dev *dev)
 {
@@ -2413,8 +2414,11 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
return -ENOMEM;
}

-   /* Using random mac addresses for VFs */
-   eth_random_addr(&hw->mac_addr[0]);
+   nfp_net_read_mac(hw);
+
+   if (!is_valid_assigned_ether_addr((struct ether_addr *)&hw->mac_addr))
+   /* Using random mac addresses for VFs */
+   eth_random_addr(&hw->mac_addr[0]);

/* Copying mac address to DPDK eth_dev struct */
ether_addr_copy(&eth_dev->data->mac_addrs[0],
-- 
1.9.1
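
The flow this patch introduces (read the configured MAC, keep it if it is a
valid assigned address, otherwise fall back to a random one) can be sketched
without any driver code. The register layout below is an assumption made for
illustration only (first word holds bytes 0..3, the upper half of the second
word holds bytes 4..5), and all function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAC_LEN 6

/* Unpack a MAC from two 32-bit register values (assumed layout: hi holds
 * bytes 0..3, the top 16 bits of lo hold bytes 4..5). Shifts are used so
 * the result does not depend on host endianness. */
static void
mac_from_regs(uint32_t hi, uint32_t lo, uint8_t mac[MAC_LEN])
{
	mac[0] = (uint8_t)(hi >> 24);
	mac[1] = (uint8_t)(hi >> 16);
	mac[2] = (uint8_t)(hi >> 8);
	mac[3] = (uint8_t)hi;
	mac[4] = (uint8_t)(lo >> 24);
	mac[5] = (uint8_t)(lo >> 16);
}

/* Valid assigned address: unicast (bit 0 of byte 0 clear) and not all zero. */
static int
mac_is_valid_assigned(const uint8_t mac[MAC_LEN])
{
	uint8_t acc = 0;
	int i;

	for (i = 0; i < MAC_LEN; i++)
		acc |= mac[i];
	return acc != 0 && (mac[0] & 0x01) == 0;
}

/* Random unicast address with the locally administered bit set. */
static void
mac_random(uint8_t mac[MAC_LEN])
{
	int i;

	for (i = 0; i < MAC_LEN; i++)
		mac[i] = (uint8_t)rand();
	mac[0] &= 0xfe; /* clear the multicast bit */
	mac[0] |= 0x02; /* set the locally administered bit */
}

/* Use the configured address when valid, otherwise a random one. */
static void
mac_init(uint32_t hi, uint32_t lo, uint8_t mac[MAC_LEN])
{
	mac_from_regs(hi, lo, mac);
	if (!mac_is_valid_assigned(mac))
		mac_random(mac);
}
```

An all-zero register pair therefore yields a random locally administered
address, while a configured address is passed through unchanged.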

[dpdk-dev] [PATCH] nfp: using random mac address if not a configured mac

2016-09-16 Thread Alejandro Lucero

Thank you for the feedback.

I'll send the fixed patches today.



On Tue, Sep 13, 2016 at 8:30 PM, Thomas Monjalon 
wrote:

> 2016-09-13 18:10, Ferruh Yigit:
> > Hi Alejandro,
> >
> > On 8/16/2016 4:15 PM, Alejandro Lucero wrote:
> > > Signed-off-by: Alejandro Lucero 
> > > ---
> >
> > There are following checkpatch warnings, also check-git-log complains:
> >
> > Headline too long:
> > nfp: unregister interrupt callback function when closing device
>
> Just a tip to keep headlines short: you can often remove some common words.
> Here we still understand the idea without "function" and "device".
> nfp: unregister interrupt callback when closing
>

[dpdk-dev] [PATCH 2/2] virtio: support IOMMU platform

2016-09-04 Thread Alejandro Lucero

I know Red Hat is working on a vIOMMU, so I guess this work is related to
that effort, but it is a surprise to see virtio using an IOMMU. I thought an
IOMMU only made sense when using SR-IOV. My second guess is that using an
IOMMU with virtio is a matter of security but, on the other hand, virtio +
IOMMU could imply serious performance degradation when multiple VMs are in
use. I'm talking about IOMMU contention, more precisely IOTLB contention.
This performance issue is complex to describe or even analyze, as several
factors have an impact on it. For example, 1GB hugepages can avoid most of
it, and so can keeping the TX & RX rings no bigger than 256. So, my
question: is Red Hat aware of this potential IOMMU contention, which can
limit scalability?
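
A rough way to see why page size matters here: each distinct page a device
touches needs its own IOTLB translation, so the number of pages covering the
DMA memory bounds how many translations compete for IOTLB entries. The
helper below is only back-of-the-envelope arithmetic; pages_needed is an
illustrative name and real IOTLB behavior depends on the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages of page_size bytes needed to cover a region of the
 * given size (rounded up). */
static uint64_t
pages_needed(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (bytes + page_size - 1) / page_size;
}
```

For 1GB of packet buffers this gives 262144 translations with 4KB pages,
512 with 2MB pages and a single one with a 1GB hugepage.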

On Fri, Sep 2, 2016 at 6:26 PM, Michael S. Tsirkin  wrote:

> On Fri, Sep 02, 2016 at 03:04:56PM +0200, Thomas Monjalon wrote:
> > 2016-09-02 14:37, Jason Wang:
> > > Virtio pmd doesn't support VFIO in the past since devices bypass IOMMU
> > > completely. But recently, the work of making virtio device work with
> > > IOMMU is near to complete.
> >
> > Good news!
> > What are the requirements for Qemu and Linux version numbers please?
>
> I expect QEMU 2.8 and Linux 4.8 to have the support.
>

[dpdk-dev] [PATCH] nfp: unregister interrupt callback function when closing device

2016-08-16 Thread Alejandro Lucero

With an app using the hotplug feature, unplugging a device without
unregistering the callback makes the interrupt handling unstable.

Fixes: 6c53f87b3497 ("nfp: add link status interrupt")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 229c8e6..94c64db 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -733,6 +733,11 @@ nfp_net_close(struct rte_eth_dev *dev)
rte_intr_disable(&dev->pci_dev->intr_handle);
nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);

+   /* unregister callback func from eal lib */
+   rte_intr_callback_unregister(&dev->pci_dev->intr_handle,
+nfp_net_dev_interrupt_handler,
+(void *)dev);
+
/*
 * The ixgbe PMD driver disables the pcie master on the
 * device. The i40e does not...
-- 
1.9.1

[dpdk-dev] [PATCH] nfp: fixing bug when copying mac address

2016-08-16 Thread Alejandro Lucero

Fixes: defb9a5dd156 ("nfp: introduce driver initialization")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 45d122d..229c8e6 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2421,8 +2421,8 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
eth_random_addr(&hw->mac_addr[0]);

/* Copying mac address to DPDK eth_dev struct */
-   ether_addr_copy(&eth_dev->data->mac_addrs[0],
-   (struct ether_addr *)hw->mac_addr);
+   ether_addr_copy((struct ether_addr *)hw->mac_addr,
+   &eth_dev->data->mac_addrs[0]);

PMD_INIT_LOG(INFO, "port %d VendorID=0x%x DeviceID=0x%x "
 "mac=%02x:%02x:%02x:%02x:%02x:%02x",
-- 
1.9.1

[dpdk-dev] [PATCH] nfp: using random mac address if not a configured mac

2016-08-16 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 82e3e4e..45d122d 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -607,18 +607,8 @@ nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
 static void
 nfp_net_params_setup(struct nfp_net_hw *hw)
 {
-   uint32_t *mac_address;
-
nn_cfg_writel(hw, NFP_NET_CFG_MTU, hw->mtu);
nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, hw->flbufsz);
-
-   /* A MAC address is 8 bytes long */
-   mac_address = (uint32_t *)(hw->mac_addr);
-
-   nn_cfg_writel(hw, NFP_NET_CFG_MACADDR,
- rte_cpu_to_be_32(*mac_address));
-   nn_cfg_writel(hw, NFP_NET_CFG_MACADDR + 4,
- rte_cpu_to_be_32(*(mac_address + 4)));
 }

 static void
@@ -627,6 +617,17 @@ nfp_net_cfg_queue_setup(struct nfp_net_hw *hw)
hw->qcp_cfg = hw->tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
 }

+static void nfp_net_read_mac(struct nfp_net_hw *hw) {
+
+   uint32_t tmp;
+
+   tmp = rte_be_to_cpu_32(nn_cfg_readl(hw, NFP_NET_CFG_MACADDR));
+   memcpy(&hw->mac_addr[0], &tmp, sizeof(struct ether_addr));
+
+   tmp = rte_be_to_cpu_32(nn_cfg_readl(hw, NFP_NET_CFG_MACADDR + 4));
+   memcpy(&hw->mac_addr[4], &tmp, 2);
+}
+
 static int
 nfp_net_start(struct rte_eth_dev *dev)
 {
@@ -2413,8 +2414,11 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
return -ENOMEM;
}

-   /* Using random mac addresses for VFs */
-   eth_random_addr(&hw->mac_addr[0]);
+   nfp_net_read_mac(hw);
+
+   if (!is_valid_assigned_ether_addr((struct ether_addr *)&hw->mac_addr[0]))
+   /* Using random mac addresses for VFs */
+   eth_random_addr(&hw->mac_addr[0]);

/* Copying mac address to DPDK eth_dev struct */
ether_addr_copy(&eth_dev->data->mac_addrs[0],
-- 
1.9.1

[dpdk-dev] memory allocation requirements

2016-05-18 Thread Alejandro Lucero

On Wed, Apr 13, 2016 at 5:03 PM, Thomas Monjalon 
wrote:

> After looking at the patches for container support, it appears that
> some changes are needed in the memory management:
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/32786/focus=32788
>
> I think it is time to collect what are the needs and expectations of
> the DPDK memory allocator. The goal is to satisfy every needs while
> cleaning the API.
> Here is a first try to start the discussion.
>
> The memory allocator has 2 classes of API in DPDK.
> First the user/application allows or requires DPDK to take over some
> memory resources of the system. The characteristics can be:
> - numa node
> - page size
> - swappable or not
> - contiguous (cannot be guaranteed) or not
> - physical address (as root only)
> Then the drivers or other libraries use the memory through
> - rte_malloc
> - rte_memzone
> - rte_mempool
> I think we can integrate the characteristics of the requested memory
> in rte_malloc. Then rte_memzone would be only a named rte_malloc.
> The rte_mempool still focus on collection of objects with cache.
>
> If a rework happens, maybe that the build options CONFIG_RTE_LIBRTE_IVSHMEM
> and CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS can be removed.
> The Xen support should also be better integrated.
>
> Currently, the first class of API is directly implemented as command line
> parameters. Please let's think of C functions first.
> The EAL parameters should simply wrap some API functions and let the
> applications tune the memory initialization with a well documented API.
>
> Probably that I forget some needs, e.g. for the secondary processes.
> Please comment.
>

Just to mention that the VFIO IOMMU mapping should be restricted to those
physically contiguous memsegs which rte_pktmbuf_pool_create will allocate
from, along with the hugepages backing driver/device descriptor rings.
Mapping all the memsegs is not a performance issue, but I think restricting
the mapping is the right thing to do.

Maybe some memseg flag like "DMA_CAPABLE" or similar should be used for
IOMMU mapping.

Another question is how to avoid using mbufs from non-"DMA_CAPABLE" segments
with a device. I'm thinking about a DPDK app using a virtio network driver
and a device-backed PMD at the same time, which could be a way of getting the
best of both worlds (intra-host and inter-host VM communications).
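
The flag idea could look roughly like the sketch below. Everything in it is
hypothetical: no such flag exists in the memseg structure discussed in this
thread, and SEG_DMA_CAPABLE, struct seg, iommu_map_dma_segs and record_map
are names invented for illustration. The callback stands in for whatever
VFIO mapping call would really be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_DMA_CAPABLE (1u << 0)	/* hypothetical per-segment flag */

struct seg {
	uint64_t phys_addr;
	uint64_t len;
	uint32_t flags;
};

typedef int (*map_fn)(uint64_t phys_addr, uint64_t len);

/* Stand-in mapper: only records how many bytes would have been mapped. */
static uint64_t mapped_bytes;

static int record_map(uint64_t phys_addr, uint64_t len)
{
	(void)phys_addr;
	mapped_bytes += len;
	return 0;
}

/*
 * Map only the segments flagged as DMA capable.
 * Returns the number of segments mapped, or -1 on the first failure.
 */
static int iommu_map_dma_segs(const struct seg *segs, size_t n, map_fn map)
{
	int mapped = 0;

	assert(segs != NULL && map != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!(segs[i].flags & SEG_DMA_CAPABLE))
			continue;	/* never handed to a device, skip it */
		if (map(segs[i].phys_addr, segs[i].len) != 0)
			return -1;
		mapped++;
	}
	return mapped;
}
```

The same flag test could be applied when choosing a mempool for a
device-backed PMD, so that mbufs from unflagged segments never reach it.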

[dpdk-dev] [dpdk-dev, 2/3] eth_dev: add support for device dma mask

2016-05-13 Thread Alejandro Lucero

On Thu, May 12, 2016 at 4:41 PM, Jan Viktorin 
wrote:

> Hi,
>
> Just a note, please, when replying inline, do not prepend ">" before your
> new text. I could not find your replies.
>
>
My gmail interface does not show that prepend character... It seems I have
to leave a white line before my replies.



> See below...
>
> On Thu, 12 May 2016 16:03:14 +0100
> Alejandro Lucero  wrote:
>
> > Hi Jan
> >
> > On Thu, May 12, 2016 at 3:52 PM, Jan Viktorin 
> > wrote:
> >
> > > Hello Alejandro,
> > >
> > > On Thu, 12 May 2016 15:33:59 +0100
> > > "Alejandro.Lucero"  wrote:
> > >
> > > > - New dma_mask field in rte_eth_dev_data.
> > > >  - If PMD sets device dma_mask, call to check hugepages within
> > >
> > > I think that one of the purposes of DPDK is to support DMA transfers
> > > in userspace. In other words, I can see no reason to support dma_mask
> > > at the ethdev level only.
> > >
> > > The limitation is a device limitation so I can not see a better place
> for
> > adding the device dma mask.
>
> That's what I've meant. It is a _device_ limitation. The ethdev is a
> wrapper
> around the rte_pci_device. The ethdev just extends semantics of the
> generic device.
> However, all DPDK devices are expected to be DMA-capable.
>
> If you get a pointer to the ethdev, you get a pointer to the
> rte_pci_device as well
> (1 more level of dereference but we are not on the fast path here, so it's
> unimportant).
>
> Consider the cryptodev. If cryptodev has some DMA mask requirements we can
> support it
> in the generic place, i.e. rte_pci_device and not rte_ethdev because the
> cryptodev
> is not an ethdev.
>
>
Ok. I was wrongly assuming we had just ethdevs, with the ethdev being the
generic and rte_pci_device being a type of ethdev.

I can add the dma mask to the rte_pci_dev. The extra level of dereference
is not a problem as long as we do not use that dma mask for a more complex
allocation API (more about this later).

If I understand it right, work is in progress for adding a rte_device. I
can not see a problem with adding dma mask to the rte_device struct either.



> >
> >
> > > We should consider adding this to the generic struct rte_device
> > > (currently rte_pci_device). Thomas?
> > >
> > > I guess it could be a non-pci device with such a limitation. I thought
> > rte_ethdev is more generic.
>
> When it is added to the rte_pci_device (or rte_device after the planned
> changes)
> the non-PCI devices get this for free... Otherwise I don't understand the
> point
> here.
>
> >
> >
> > > >supported range.
> > >
> > > I think, the '-' is unnecessary at the beginning of line. As for me
> > > I prefer a fluent text describing the purpose. The '-' is useful for
> > > a real list of notes.
> > >
> > > >
> > > > Signed-off-by: Alejandro Lucero 
> > > >
> > > > ---
> > > > lib/librte_ether/rte_ethdev.c | 7 +++
> > > >  lib/librte_ether/rte_ethdev.h | 1 +
> > > >  2 files changed, 8 insertions(+)
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c
> > > > index a31018e..c0de88a 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -280,9 +280,16 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
> > > >
> > > >   /* Invoke PMD device initialization function */
> > > >   diag = (*eth_drv->eth_dev_init)(eth_dev);
> > > > + if (diag)
> > > > + goto err;
> > > > +
> > > > + if (eth_dev->data->dma_mask)
> > > > + diag =
> > > rte_eal_hugepage_check_address_mask(eth_dev->data->dma_mask);
> > > > +
> > >
> > > I don't understand what happens if the memory is out of the DMA mask.
> What
> > > can the driver
> > > do? Does it just fail?
> > >
> > > I think that this should be considered during a malloc instead. (Well,
> > > there is probably
> > > no suitable API for this at the moment.)
> > >
> > > hugepage memory allocation is done before device initialization. I see
> > easier to leave the normal hugepage code as it is now and add a later
> call
> > if a device requires it.
>
> True. I didn't meant to change hugepages allocation but to change
> allocation
> of memor

[dpdk-dev] [dpdk-dev,3/3] nfp: set device dma mask

2016-05-12 Thread Alejandro Lucero

On Thu, May 12, 2016 at 4:03 PM, Jan Viktorin 
wrote:

> On Thu, 12 May 2016 15:34:00 +0100
> "Alejandro.Lucero"  wrote:
>
> > - Just hugepages within the supported range will be available.
>
> Again the hyphen is redundant here.
>
> By the way, this text does not describe the change well. If I understood
> the whole patch set (I am not quite sure now), the initialization would
> fail if there are hugepages out of the given DMA mask. Am I wrong?
>
>
You are right.


> I'd expect something like "NFP supports DMA address in range ...".
>
>
That is a good idea. I was thinking of adding memseg dump info as well,
which would help to understand this issue and others related to memory
allocation.


> >
> > Signed-off-by: Alejandro Lucero 
> >
> > ---
> > drivers/net/nfp/nfp_net.c | 11 +++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> > index ea5a2a3..e0e444a 100644
> > --- a/drivers/net/nfp/nfp_net.c
> > +++ b/drivers/net/nfp/nfp_net.c
> > @@ -115,6 +115,14 @@ enum nfp_qcp_ptr {
> >   NFP_QCP_WRITE_PTR
> >  };
> >
> > +#ifndef DMA_64BIT_MASK
> > +#define DMA_64BIT_MASK  0xffffffffffffffffULL
> > +#endif
> > +
> > +#ifndef DMA_BIT_MASK
> > +#define DMA_BIT_MASK(n) (((n) == 64) ? DMA_64BIT_MASK : ((1ULL<<(n))-1))
> > +#endif
>
> This is quite a generic code, I'd put it into the EAL. Probably, it should
> be renamed to something like RTE_DMA_BIT_MASK.


OK.


>
>
> +
> >  /*
> >   * nfp_qcp_ptr_add - Add the value to the selected pointer of a queue
> >   * @q: Base address for queue structure
> > @@ -2441,6 +2449,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
> >   /* Recording current stats counters values */
> >   nfp_net_stats_reset(eth_dev);
> >
> > + /* Setting dma_mask */
> > + eth_dev->data->dma_mask = DMA_BIT_MASK(40);
>
> Can we read this from /sys/bus/pci/devices/*/dma_mask_bits? I am not sure
> whether is this generic enough but I can see dma_mask_bits for the PCI
> devices on my PC.
>
>
The kernel sets a default dma mask when scanning devices (at least for PCI
devices). It is the device driver that knows about specific DMA addressing
limitations. For example, this is done with UIO (igb_uio), and using sysfs
would be fine there (but then you should add support for specifying a dma
mask in igb_uio as a module param), but this is not true for VFIO.



> Regards
> Jan
>
> > +
> >   return 0;
> >  }
> >
>
>
>
>
> --
>Jan Viktorin  E-mail: Viktorin at RehiveTech.com
>System Architect  Web:www.RehiveTech.com
>RehiveTech
>Brno, Czech Republic
>

[dpdk-dev] [dpdk-dev, 2/3] eth_dev: add support for device dma mask

2016-05-12 Thread Alejandro Lucero

Hi Jan

On Thu, May 12, 2016 at 3:52 PM, Jan Viktorin 
wrote:

> Hello Alejandro,
>
> On Thu, 12 May 2016 15:33:59 +0100
> "Alejandro.Lucero"  wrote:
>
> > - New dma_mask field in rte_eth_dev_data.
> >  - If PMD sets device dma_mask, call to check hugepages within
>
> I think that one of the purposes of DPDK is to support DMA transfers
> in userspace. In other words, I can see no reason to support dma_mask
> at the ethdev level only.
>
> The limitation is a device limitation so I can not see a better place for
adding the device dma mask.


> We should consider adding this to the generic struct rte_device
> (currently rte_pci_device). Thomas?
>
> I guess it could be a non-pci device with such a limitation. I thought
rte_ethdev is more generic.


> >supported range.
>
> I think, the '-' is unnecessary at the beginning of line. As for me
> I prefer a fluent text describing the purpose. The '-' is useful for
> a real list of notes.
>
> >
> > Signed-off-by: Alejandro Lucero 
> >
> > ---
> > lib/librte_ether/rte_ethdev.c | 7 +++
> >  lib/librte_ether/rte_ethdev.h | 1 +
> >  2 files changed, 8 insertions(+)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> b/lib/librte_ether/rte_ethdev.c
> > index a31018e..c0de88a 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -280,9 +280,16 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
> >
> >   /* Invoke PMD device initialization function */
> >   diag = (*eth_drv->eth_dev_init)(eth_dev);
> > + if (diag)
> > + goto err;
> > +
> > + if (eth_dev->data->dma_mask)
> > + diag =
> rte_eal_hugepage_check_address_mask(eth_dev->data->dma_mask);
> > +
>
> I don't understand what happens if the memory is out of the DMA mask. What
> can the driver
> do? Does it just fail?
>
> I think that this should be considered during a malloc instead. (Well,
> there is probably
> no suitable API for this at the moment.)
>
> hugepage memory allocation is done before device initialization. I see
easier to leave the normal hugepage code as it is now and add a later call
if a device requires it.

The only reasonable thing to do is to fail as the amount of required memory
can not be (safely) allocated.


> Regards
> Jan
>
> >   if (diag == 0)
> >   return 0;
> >
> > +err:
> >   RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(vendor_id=0x%u
> device_id=0x%x) failed\n",
> >   pci_drv->name,
> >   (unsigned) pci_dev->id.vendor_id,
> > diff --git a/lib/librte_ether/rte_ethdev.h
> b/lib/librte_ether/rte_ethdev.h
> > index 2757510..34daa92 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1675,6 +1675,7 @@ struct rte_eth_dev_data {
> >   enum rte_kernel_driver kdrv;/**< Kernel driver passthrough */
> >   int numa_node;  /**< NUMA node connection */
> >   const char *drv_name;   /**< Driver name */
> > + uint64_t dma_mask; /** device supported address space range */
> >  };
> >
> >  /** Device supports hotplug detach */
>
>
>
> --
>Jan Viktorin  E-mail: Viktorin at RehiveTech.com
>System Architect  Web:www.RehiveTech.com
>RehiveTech
>Brno, Czech Republic
>

[dpdk-dev] [PATCH 3/3] nfp: set device dma mask

2016-05-12 Thread Alejandro Lucero

 - Just hugepages within the supported range will be available.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index ea5a2a3..e0e444a 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -115,6 +115,14 @@ enum nfp_qcp_ptr {
NFP_QCP_WRITE_PTR
 };

+#ifndef DMA_64BIT_MASK
+#define DMA_64BIT_MASK  0xffffffffffffffffULL
+#endif
+
+#ifndef DMA_BIT_MASK
+#define DMA_BIT_MASK(n) (((n) == 64) ? DMA_64BIT_MASK : ((1ULL<<(n))-1))
+#endif
+
 /*
  * nfp_qcp_ptr_add - Add the value to the selected pointer of a queue
  * @q: Base address for queue structure
@@ -2441,6 +2449,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
/* Recording current stats counters values */
nfp_net_stats_reset(eth_dev);

+   /* Setting dma_mask */
+   eth_dev->data->dma_mask = DMA_BIT_MASK(40);
+
return 0;
 }

-- 
1.9.1
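
For reference, a small standalone sketch of what the mask macro evaluates to
and how an address can be tested against it. The macro follows the one in the
patch, with the 64-bit constant written out in full; addr_fits_mask is a
helper name made up here and is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_64BIT_MASK 0xffffffffffffffffULL

/*
 * n == 64 is special-cased because shifting a 64-bit value by 64
 * positions is undefined behaviour in C.
 */
#define DMA_BIT_MASK(n) (((n) == 64) ? DMA_64BIT_MASK : ((1ULL << (n)) - 1))

/* True when addr has no bits set above the low 'bits' bits. */
static bool addr_fits_mask(uint64_t addr, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return (addr & ~DMA_BIT_MASK(bits)) == 0;
}
```

With a 40-bit mask the highest address the device can reach is 0xffffffffff,
that is, just under 1 TB.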

[dpdk-dev] [PATCH 2/3] eth_dev: add support for device dma mask

2016-05-12 Thread Alejandro Lucero

 - New dma_mask field in rte_eth_dev_data.
 - If PMD sets device dma_mask, call to check hugepages within
   supported range.

Signed-off-by: Alejandro Lucero 
---
 lib/librte_ether/rte_ethdev.c | 7 +++
 lib/librte_ether/rte_ethdev.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a31018e..c0de88a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -280,9 +280,16 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,

/* Invoke PMD device initialization function */
diag = (*eth_drv->eth_dev_init)(eth_dev);
+   if (diag)
+   goto err;
+
+   if (eth_dev->data->dma_mask)
+   diag = rte_eal_hugepage_check_address_mask(eth_dev->data->dma_mask);
+
if (diag == 0)
return 0;

+err:
RTE_PMD_DEBUG_TRACE("driver %s: eth_dev_init(vendor_id=0x%u device_id=0x%x) failed\n",
pci_drv->name,
(unsigned) pci_dev->id.vendor_id,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..34daa92 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1675,6 +1675,7 @@ struct rte_eth_dev_data {
enum rte_kernel_driver kdrv;/**< Kernel driver passthrough */
int numa_node;  /**< NUMA node connection */
const char *drv_name;   /**< Driver name */
+   uint64_t dma_mask; /** device supported address space range */
 };

 /** Device supports hotplug detach */
-- 
1.9.1

[dpdk-dev] [PATCH 1/3] eal/linux: add function for checking hugepages within device supported address range

2016-05-12 Thread Alejandro Lucero

 - This is needed for avoiding problems with devices not being able to address
   all the physical available memory.

Signed-off-by: Alejandro Lucero 
---
 lib/librte_eal/common/include/rte_memory.h |  6 ++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 27 +++
 2 files changed, 33 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index f8dbece..67b0b28 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -256,6 +256,12 @@ rte_mem_phy2mch(uint32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
 }
 #endif

+/**
+ * Check hugepages are within the supported
+ * device address space range.
+ */
+int rte_eal_hugepage_check_address_mask(uint64_t dma_mask);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..2cd046d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1037,6 +1037,33 @@ calc_num_pages_per_socket(uint64_t * memory,
 }

 /*
+ * Some devices have addressing limitations. A PMD will indirectly call this
+ * function raising an error if any hugepage is out of address range supported.
+ * As hugepages are ordered by physical address, there is nothing to do as
+ * any other hugepage available will be out of range as well.
+ */
+int
+rte_eal_hugepage_check_address_mask(uint64_t dma_mask)
+{
+   const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+   phys_addr_t physaddr;
+   int i =0;
+
+   while (i < RTE_MAX_MEMSEG && mcfg->memseg[i].len > 0) {
+   physaddr = mcfg->memseg[i].phys_addr + mcfg->memseg[i].len;
+   RTE_LOG(DEBUG, EAL, "Checking page with address %"PRIx64" and device"
+   " mask 0x%"PRIx64"\n", physaddr, dma_mask);
+   if (physaddr & ~dma_mask) {
+   RTE_LOG(ERR, EAL, "Allocated hugepages are out of device address"
+   " range.");
+   return -1;
+   }
+   i++;
+   }
+   return 0;
+}
+
+/*
  * Prepare physical memory mapping: fill configuration structure with
  * these infos, return 0 on success.
  *  1. map N huge pages in separate files in hugetlbfs
-- 
1.9.1
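
The check can be tried outside DPDK with a small standalone sketch. struct
seg and segs_within_mask are stand-ins invented for the example, not EAL
types. One deliberate difference from the patch: the patch tests
phys_addr + len, which is one byte past the end of the segment, so a segment
ending exactly at the top of the addressable range would be rejected. The
sketch tests the last byte that actually belongs to the segment. Whether the
stricter test in the patch is intended is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct seg {
	uint64_t phys_addr;
	uint64_t len;
};

/*
 * Return 0 if every segment lies inside dma_mask, -1 otherwise.
 * A zero-length entry terminates the list, as in the patch.
 * Assumes a mask of the form 2^n - 1, so checking the last byte
 * of a segment also covers its first byte.
 */
static int segs_within_mask(const struct seg *segs, size_t max,
			    uint64_t dma_mask)
{
	assert(segs != NULL);
	for (size_t i = 0; i < max && segs[i].len > 0; i++) {
		uint64_t last = segs[i].phys_addr + segs[i].len - 1;

		if (last & ~dma_mask)
			return -1;
	}
	return 0;
}
```

A PMD with a 40-bit limit would pass 0xffffffffff as dma_mask and fail
initialization on a -1 return.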

[dpdk-dev] [PATCH 0/3] add support for devices with addressing limitations

2016-05-12 Thread Alejandro Lucero

A kernel driver uses a dma mask specifying the memory address range supported
by the device for DMA operations. With DPDK there is currently no way of doing
the same thing, which can lead to problems with devices that are not able to
address all the available physical memory.

This patchset adds support for a PMD setting a device dma mask. If the dma
mask is set, a check is made that the allocated hugepages are within the range
supported by the device.

The first patch adds the checking function. If a hugepage (memseg) is outside
the range supported by the device, an error is raised. There is nothing else
we can do: hugepages are ordered by physical address before allocation, so any
other available (unallocated) hugepage will be out of range as well.

The second patch adds a call to the checking function if the device dma mask
is set during PMD initialization. Depending on how hugepages are created and
how many there are, the check could slow down initialization. If a device has
no addressing limitations, the check is not done.

The third patch adds support for setting the dma mask in the NFP PMD. The
current NFP card supports just 40 bits. Future versions will support 64 bits.

Alejandro Lucero (3):
  eal/linux: add function for checking hugepages within device supported
address range
  eth_dev: add support for device dma mask
  nfp: set device dma mask

 drivers/net/nfp/nfp_net.c  | 11 +++
 lib/librte_eal/common/include/rte_memory.h |  6 ++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 27 +++
 lib/librte_ether/rte_ethdev.c  |  7 +++
 lib/librte_ether/rte_ethdev.h  |  1 +
 5 files changed, 52 insertions(+)

-- 
1.9.1

[dpdk-dev] [PATCH v1] igu_uio: fix IOMMU domain issue

2016-05-11 Thread Alejandro Lucero

On Tue, May 10, 2016 at 4:59 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Tue, 10 May 2016 19:21:41 +0800
> Zhe Tao  wrote:
>
> > Problem:
> > The following  operations will cause the igb_uio based DPDK
> > operation failed.
> > --Any device assignment through the kvm_assign_device interface,
> > this can be the pci-assign method in QEMU
> > --VFIO group attachment operation(attach to the container)
> > this can happens in  vfio-pci assignment in QEMU
>
>
> If you have an IOMMU why not use VFIO instead, it is better.
>

It is not about VFIO against UIO but about how iommu domains are created
and destroyed by the (old) kernel when iommu=pt. So even with VFIO you can
have problems.

We have had problems like this and others because our device (NFP) can only
address up to 40 bits of address space. Old kernels used in LTS distributions
like Ubuntu have buggy iommu support, and you need to do things like this
mapping inside the driver to work around the problems. By the way, using
SRIOV just adds more problems. It is not safe to use iommu=pt with 3.13.x
Ubuntu kernels.

It would be a good thing for the original patch to identify those kernels
where the problem was detected. Of course, there could be more kernels with
the same problem, but finding them is more work.

[dpdk-dev] [PATCH] nfp: avoiding concurrency when hardware reconfig

2016-05-03 Thread Alejandro Lucero

Hi Bruce,

Sorry about this. I sent a v2 for this patch but not in the same thread:

http://www.dpdk.org/ml/archives/dev/2016-April/037996.html

On Tue, May 3, 2016 at 12:01 PM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Tue, Apr 26, 2016 at 01:14:15PM +0100, Alejandro Lucero wrote:
> > Some apps calling some functions from different threads at the
> > same time could lead to reconfig problems. Reconfig mechanism is
> > based on a hardware queue where incrementing a counter signals the
> > firmware to do the reconfig. If there are two increments before the
> > first one has been processed the firmware will stop and a device
> > reset is necessary.
> >
> > Signed-off-by: Alejandro Lucero 
> > ---
> >  drivers/net/nfp/nfp_net.c | 8 
> >  drivers/net/nfp/nfp_net_pmd.h | 1 +
> >  2 files changed, 9 insertions(+)
> >
> > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> > index bc0a3d8..ba0ee04 100644
> > --- a/drivers/net/nfp/nfp_net.c
> > +++ b/drivers/net/nfp/nfp_net.c
> > @@ -58,6 +58,7 @@
> >  #include "nfp_net_pmd.h"
> >  #include "nfp_net_logs.h"
> >  #include "nfp_net_ctrl.h"
> > +#include <rte_spinlock.h>
>
> Hi Alejandro,
>
> I think this header addition is in the wrong place in the code. When I
> apply
> this patch to next-net and try a recompile I get the error:
>
>   CC nfp_net.o
>   In file included from
> /home/bruce/next-net/dpdk-next-net/drivers/net/nfp/nfp_net.c:58:0:
>   /home/bruce/next-net/dpdk-next-net/drivers/net/nfp/nfp_net_pmd.h:409:2:
> error: unknown type name 'rte_spinlock_t'
> rte_spinlock_t reconfig_lock;
> ^
>
> You either need to put the spinlock include before the nfp_net_pmd.h
> include
> or, perhaps better, put the spinlock include inside the nfp_net_pmd header
> file
> since that is where the spinlock variable is being defined.
>
> /Bruce
>
>

[dpdk-dev] [PATCH] nfp: add flag for enabling device hotplug

2016-04-26 Thread Alejandro Lucero

RTE_PCI_DRV_DETACHABLE is required for detaching a device
during execution.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 1259d2c..ea5a2a3 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2466,7 +2466,8 @@ static struct eth_driver rte_nfp_net_pmd = {
{
.name = "rte_nfp_net_pmd",
.id_table = pci_id_nfp_net_map,
-   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = nfp_net_init,
.dev_private_size = sizeof(struct nfp_net_adapter),
-- 
1.9.1

[dpdk-dev] [PATCH] nfp: fixing a bug when gather

2016-04-26 Thread Alejandro Lucero

mbufs were not properly released when they were chained.

Fixes: b812daadad0d ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 559ebe6..1259d2c 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -323,7 +323,7 @@ nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)

for (i = 0; i < txq->tx_count; i++) {
if (txq->txbufs[i].mbuf) {
-   rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+   rte_pktmbuf_free(txq->txbufs[i].mbuf);
txq->txbufs[i].mbuf = NULL;
}
}
@@ -1976,11 +1976,16 @@ nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 */
pkt_size = pkt->pkt_len;

-   while (pkt_size) {
-   /* Releasing mbuf which was prefetched above */
-   if (*lmbuf)
-   rte_pktmbuf_free_seg(*lmbuf);
+   /* Releasing mbuf which was prefetched above */
+   if (*lmbuf)
+   rte_pktmbuf_free(*lmbuf);
+   /*
+* Linking mbuf with descriptor for being released
+* next time descriptor is used
+*/
+   *lmbuf = pkt;

+   while (pkt_size) {
dma_size = pkt->data_len;
dma_addr = rte_mbuf_data_dma_addr(pkt);
PMD_TX_LOG(DEBUG, "Working with mbuf at dma address:"
@@ -1994,12 +1999,6 @@ nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
ASSERT(free_descs > 0);
free_descs--;

-   /*
-* Linking mbuf with descriptor for being released
-* next time descriptor is used
-*/
-   *lmbuf = pkt;
-
txq->wr_p++;
txq->tail++;
if (unlikely(txq->tail == txq->tx_count)) /* wrapping?*/
-- 
1.9.1
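
The difference between the two free calls can be shown with a toy model of a
chained buffer. This is not the mbuf implementation: struct buf,
buf_free_seg and buf_free_chain are simplified stand-ins for rte_mbuf,
rte_pktmbuf_free_seg() and rte_pktmbuf_free(), and a counter replaces the
mempool so that a leak is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct buf {
	struct buf *next;	/* next segment of the same packet, or NULL */
};

static int live_bufs;		/* segments allocated and not yet freed */

/* Build a packet made of nsegs chained segments. */
static struct buf *buf_alloc_chain(int nsegs)
{
	struct buf *head = NULL;

	assert(nsegs > 0);
	while (nsegs-- > 0) {
		struct buf *b = malloc(sizeof(*b));

		if (b == NULL)
			abort();
		b->next = head;
		head = b;
		live_bufs++;
	}
	return head;
}

/* Free exactly one segment; the rest of the chain is left untouched. */
static void buf_free_seg(struct buf *b)
{
	free(b);
	live_bufs--;
}

/* Free every segment of the packet. */
static void buf_free_chain(struct buf *b)
{
	while (b != NULL) {
		struct buf *next = b->next;

		buf_free_seg(b);
		b = next;
	}
}
```

Calling buf_free_seg() on the head of a three-segment packet leaves two
segments allocated, which is the kind of leak the commit message describes
for chained mbufs.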

[dpdk-dev] [PATCH v2] nfp: avoiding concurrency when hardware reconfig

2016-04-26 Thread Alejandro Lucero

Some apps calling some functions from different threads at the
same time could lead to reconfig problems. Reconfig mechanism is
based on a hardware queue where incrementing a counter signals the
firmware to do the reconfig. If there are two increments before the
 first one has been processed the firmware will stop and a device
reset is necessary.

 - v2: header file to the right place

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 8 
 drivers/net/nfp/nfp_net_pmd.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index bc0a3d8..559ebe6 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -54,6 +54,7 @@
 #include 
 #include 
 #include 
+#include <rte_spinlock.h>

 #include "nfp_net_pmd.h"
 #include "nfp_net_logs.h"
@@ -407,6 +408,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, uint32_t update)
PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
ctrl, update);

+   rte_spinlock_lock(&hw->reconfig_lock);
+
nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);

@@ -414,6 +417,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, uint32_t update)

err = __nfp_net_reconfig(hw, update);

+   rte_spinlock_unlock(&hw->reconfig_lock);
+
if (!err)
return 0;

@@ -2399,6 +2404,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
 hw->max_rx_queues, hw->max_tx_queues);

+   /* Initializing spinlock for reconfigs */
+   rte_spinlock_init(&hw->reconfig_lock);
+
/* Allocating memory for mac addr */
eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
if (eth_dev->data->mac_addrs == NULL) {
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index 232ce5c..c180972 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -406,6 +406,7 @@ struct nfp_net_hw {
int stride_tx;

uint8_t *qcp_cfg;
+   rte_spinlock_t reconfig_lock;

uint32_t max_tx_queues;
uint32_t max_rx_queues;
-- 
1.9.1

[dpdk-dev] [PATCH] nfp: avoiding concurrency when hardware reconfig

2016-04-26 Thread Alejandro Lucero

Some apps calling some functions from different threads at the
same time could lead to reconfig problems. Reconfig mechanism is
based on a hardware queue where incrementing a counter signals the
firmware to do the reconfig. If there are two increments before the
first one has been processed the firmware will stop and a device
reset is necessary.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 8 
 drivers/net/nfp/nfp_net_pmd.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index bc0a3d8..ba0ee04 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -58,6 +58,7 @@
 #include "nfp_net_pmd.h"
 #include "nfp_net_logs.h"
 #include "nfp_net_ctrl.h"
+#include <rte_spinlock.h>

 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
@@ -407,6 +408,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, uint32_t update)
PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
ctrl, update);

+   rte_spinlock_lock(&hw->reconfig_lock);
+
nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);

@@ -414,6 +417,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, uint32_t update)

err = __nfp_net_reconfig(hw, update);

+   rte_spinlock_unlock(&hw->reconfig_lock);
+
if (!err)
return 0;

@@ -2399,6 +2404,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
 hw->max_rx_queues, hw->max_tx_queues);

+   /* Initializing spinlock for reconfigs */
+   rte_spinlock_init(&hw->reconfig_lock);
+
/* Allocating memory for mac addr */
eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
if (eth_dev->data->mac_addrs == NULL) {
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index 232ce5c..c180972 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -406,6 +406,7 @@ struct nfp_net_hw {
int stride_tx;

uint8_t *qcp_cfg;
+   rte_spinlock_t reconfig_lock;

uint32_t max_tx_queues;
uint32_t max_rx_queues;
-- 
1.9.1

[dpdk-dev] [PATCH] nfp: modifying guide about using uio modules

2016-04-26 Thread Alejandro Lucero

 - Removing dependency on nfp_uio kernel module. The igb_uio
   kernel modules can be used instead.

Fixes: 80bc1752f16e ("nfp: add guide")

Signed-off-by: Alejandro Lucero 
---
 doc/guides/nics/nfp.rst | 47 ---
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index dfc3683..e4ebc71 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -61,9 +61,8 @@ instructions.

 DPDK runs in userspace and PMDs uses the Linux kernel UIO interface to
 allow access to physical devices from userspace. The NFP PMD requires
-a separate UIO driver, **nfp_uio**, to perform correct
-initialization. This driver is part of Netronome's BSP and it is
-equivalent to Intel's igb_uio driver.
+the **igb_uio** UIO driver, available with DPDK, to perform correct
+initialization.

 Building the software
 -
@@ -201,27 +200,18 @@ Using the NFP PMD is not different to using other PMDs. Usual steps are:

The module should now be listed by the lsmod command.

-#. **To install the nfp_uio kernel module (manually):** This module supports
-   NFP-6xxx devices through the UIO interface.
-
-   This module is part of Netronome's BSP and it should be available when the
-   BSP is installed.
+#. **To install the igb_uio kernel module (manually):** This module is part
+   of DPDK sources and configured by default (CONFIG_RTE_EAL_IGB_UIO=y).

.. code-block:: console

-  modprobe nfp_uio.ko
+  modprobe igb_uio.ko

The module should now be listed by the lsmod command.

-   Depending on which NFP modules are loaded, nfp_uio may be automatically
-   bound to the NFP PCI devices by the system. Otherwise the binding needs
-   to be done explicitly. This is the case when nfp_netvf, the Linux kernel
-   driver for NFP VFs, was loaded when VFs were created. As described later
-   in this document this configuration may also be performed using scripts
-   provided by the Netronome's BSP.
-
-   First the device needs to be unbound, for example from the nfp_netvf
-   driver:
+   Depending on which NFP modules are loaded, it could be necessary to
+   detach NFP devices from the nfp_netvf module. If this is the case the
+   device needs to be unbound, for example:

.. code-block:: console

@@ -232,30 +222,25 @@ Using the NFP PMD is not different to using other PMDs. Usual steps are:
The output of lspci should now show that :03:08.0 is not bound to
any driver.

-   The next step is to add the NFP PCI ID to the NFP UIO driver:
+   The next step is to add the NFP PCI ID to the IGB UIO driver:

.. code-block:: console

-  echo 19ee 6003 > /sys/bus/pci/drivers/nfp_uio/new_id
+  echo 19ee 6003 > /sys/bus/pci/drivers/igb_uio/new_id

-   And then to bind the device to the nfp_uio driver:
+   And then to bind the device to the igb_uio driver:

.. code-block:: console

-  echo :03:08.0 > /sys/bus/pci/drivers/nfp_uio/bind
+  echo :03:08.0 > /sys/bus/pci/drivers/igb_uio/bind

   lspci -d19ee: -k

-   lspci should show that device bound to nfp_uio driver.
-
-#. **Using tools from Netronome?s BSP to install and bind modules:** DPDK 
provides
-   scripts which are useful for installing the UIO modules and for binding the
-   right device to those modules avoiding doing so manually. However, these 
scripts
-   have not support for Netronome?s UIO driver. Along with drivers, the BSP 
installs
-   those DPDK scripts slightly modified with support for Netronome?s UIO 
driver.
+   lspci should show that device bound to igb_uio driver.

-   Those specific scripts can be found in Netronome's BSP installation directory.
-   Refer to BSP documentation for more information.
+#. **Using scripts to install and bind modules:** DPDK provides scripts which are
+   useful for installing the UIO modules and for binding the right device to those
+   modules avoiding doing so manually:

* **setup.sh**
* **dpdk_nic_bind.py**
-- 
1.9.1

[dpdk-dev] [PATCH v13 5/8] ethdev: add speed capabilities

2016-03-29 Thread Alejandro Lucero

For nfp.c, speed_capa should be ETH_LINK_SPEED_40G instead of
ETH_LINK_SPEED_50G.
By the way, the change in patch 4 sets the right link speed using the new
constants.

Regards
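
For readers unfamiliar with the speed_capa field discussed here, the following is a minimal sketch of how a capability bitmap of this kind can be tested. The DEMO_* bit values and the helper name are hypothetical and chosen only for illustration; the real ETH_LINK_SPEED_* constants are defined in rte_ethdev.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real
 * ETH_LINK_SPEED_* values come from rte_ethdev.h. */
#define DEMO_LINK_SPEED_10G (1u << 0)
#define DEMO_LINK_SPEED_40G (1u << 1)
#define DEMO_LINK_SPEED_50G (1u << 2)

/* Return 1 if the capability bitmap advertises the given speed flag. */
static int demo_supports(uint32_t speed_capa, uint32_t speed_flag)
{
	/* The flag is expected to be exactly one bit. */
	assert(speed_flag != 0 && (speed_flag & (speed_flag - 1)) == 0);
	return (speed_capa & speed_flag) != 0;
}
```

With a bitmap holding only DEMO_LINK_SPEED_40G, a query for 40G succeeds and a query for 50G fails, which is why advertising the wrong constant misleads applications.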

On Sat, Mar 26, 2016 at 1:27 AM, Marc Sune  wrote:

> The speed capabilities of a device can be retrieved with
> rte_eth_dev_info_get().
>
> The new field speed_capa is initialized in the drivers without
> taking care of device characteristics in this patch.
> When the capabilities of a driver are accurate, the table in
> overview.rst must be filled.
>
> Signed-off-by: Marc Sune 
> ---
>  doc/guides/nics/overview.rst   |  1 +
>  doc/guides/rel_notes/release_16_04.rst |  8 
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  1 +
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  1 +
>  drivers/net/e1000/em_ethdev.c  |  4 
>  drivers/net/e1000/igb_ethdev.c |  4 
>  drivers/net/ena/ena_ethdev.c   |  9 +
>  drivers/net/fm10k/fm10k_ethdev.c   |  4 
>  drivers/net/i40e/i40e_ethdev.c |  8 
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  8 
>  drivers/net/mlx4/mlx4.c|  6 ++
>  drivers/net/mlx5/mlx5_ethdev.c |  8 
>  drivers/net/nfp/nfp_net.c  |  2 ++
>  lib/librte_ether/rte_ethdev.h  | 21 +
>  14 files changed, 85 insertions(+)
>
> diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
> index 542479a..62f1868 100644
> --- a/doc/guides/nics/overview.rst
> +++ b/doc/guides/nics/overview.rst
> @@ -86,6 +86,7 @@ Most of these differences are summarized below.
>e   e   e   e   e
>e
>c   c   c   c   c
>c
>  = = = = = = = = = = = = = = = = = = = = = = = = =
> = = = = = = = =
> +   speed capabilities
> link status  X   X X
>  X X
> link status eventX X
>X
> queue status event
>X
> diff --git a/doc/guides/rel_notes/release_16_04.rst
> b/doc/guides/rel_notes/release_16_04.rst
> index 79d76e1..9e7b0b7 100644
> --- a/doc/guides/rel_notes/release_16_04.rst
> +++ b/doc/guides/rel_notes/release_16_04.rst
> @@ -47,6 +47,11 @@ This section should contain new features added in this
> release. Sample format:
>A new function ``rte_pktmbuf_alloc_bulk()`` has been added to allow the
> user
>to allocate a bulk of mbufs.
>
> +* **Added device link speed capabilities.**
> +
> +  The structure ``rte_eth_dev_info`` has now a ``speed_capa`` bitmap,
> which
> +  allows the application to know the supported speeds of each device.
> +
>  * **Added new poll-mode driver for Amazon Elastic Network Adapters
> (ENA).**
>
>The driver operates variety of ENA adapters through feature negotiation
> @@ -456,6 +461,9 @@ This section should contain API changes. Sample format:
>All drivers are now counting the missed packets only once, i.e. drivers
> will
>not increment ierrors anymore for missed packets.
>
> +* The ethdev structure ``rte_eth_dev_info`` was changed to support device
> +  speed capabilities.
> +
>  * The functions ``rte_eth_dev_udp_tunnel_add`` and
> ``rte_eth_dev_udp_tunnel_delete``
>have been renamed into ``rte_eth_dev_udp_tunnel_port_add`` and
>``rte_eth_dev_udp_tunnel_port_delete``.
> diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c
> b/drivers/net/bnx2x/bnx2x_ethdev.c
> index a3c6c01..897081f 100644
> --- a/drivers/net/bnx2x/bnx2x_ethdev.c
> +++ b/drivers/net/bnx2x/bnx2x_ethdev.c
> @@ -327,6 +327,7 @@ bnx2x_dev_infos_get(struct rte_eth_dev *dev,
> __rte_unused struct rte_eth_dev_inf
> dev_info->min_rx_bufsize = BNX2X_MIN_RX_BUF_SIZE;
> dev_info->max_rx_pktlen  = BNX2X_MAX_RX_PKT_LEN;
> dev_info->max_mac_addrs  = BNX2X_MAX_MAC_ADDRS;
> +   dev_info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_20G;
>  }
>
>  static void
> diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c
> b/drivers/net/cxgbe/cxgbe_ethdev.c
> index 8845c76..bb134e5 100644
> --- a/drivers/net/cxgbe/cxgbe_ethdev.c
> +++ b/drivers/net/cxgbe/cxgbe_ethdev.c
> @@ -171,6 +171,7 @@ static void cxgbe_dev_info_get(struct rte_eth_dev
> *eth_dev,
>
> device_info->rx_desc_lim = cxgbe_desc_lim;
> device_info->tx_desc_lim = cxgbe_desc_lim;
> +   device_info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_40G;
>  }
>
>  static void cxgbe_dev_promiscuous_enable(struct rte_eth_dev *eth_dev)
> diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
> index 473d77f..d5f8c7f 100644
> --- a/drivers/net/e1000/em_ethdev.c
> +++ b/drivers/net/e1000/em_ethdev.c
> @@ -1054,6 +1054,10 @@ eth_em_infos_get(struct rte_eth_dev *dev, struct
> rte_eth_dev_info *dev_info)
> .nb_min = E1000_MIN_RING_DESC,
> .nb_align = EM_TXD_ALIGN,
> };
> +
> +   dev_info->speed_capa = ETH_LINK_SPEED_10M_HD | ETH_LINK_SPEED_10M |
> +

[dpdk-dev] [PATCH] nfp: copy pci info from pci to ethdev

2016-03-29 Thread Alejandro Lucero

Hi guys,

Sorry for the delay but I was on an Easter break.

That patch is OK for me. In fact, I had one patch ready for upstreaming
with this change needed for supporting hotplug. I was waiting for some
feedback from one internal project needing this hotplug functionality
before submitting.

Regards

On Fri, Mar 25, 2016 at 12:31 PM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

> On Wed, Mar 23, 2016 at 08:51:36AM -0700, Stephen Hemminger wrote:
> > The NFP driver (unlike other PCI devices) was not copying the pci info
> > from the pci_dev to the eth_dev.  This would make the driver_name be
> > null (and other unset fields) when application uses dev_info_get.
> >
> > This was found by code review; do not have the hardware.
> >
> > Signed-off-by: Stephen Hemminger 
> > ---
> Alejandro,
>
> any review or ack on this patch for nfp driver?
>
> Regards,
> /Bruce
>

[dpdk-dev] [PATCH] nfp: fix tx queue reset

2016-03-14 Thread Alejandro Lucero

When using start-stop functionality the per queue fields need to
be properly reset.

Fixes: b812daadad0d ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 9c4f218..e1e014f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -358,6 +358,7 @@ nfp_net_reset_tx_queue(struct nfp_net_txq *txq)
txq->wr_p = 0;
txq->rd_p = 0;
txq->tail = 0;
+   txq->qcp_rd_p = 0;
 }

 static int
-- 
1.7.9.5
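
To illustrate why every ring index has to be cleared across a stop/start cycle, here is a minimal, self-contained sketch. The struct demo_txq and the helper names are hypothetical stand-ins, not the driver's real struct nfp_net_txq; only the four field names come from the patch. The sketch also assumes that the hardware read pointer restarts at zero and that the ring size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the driver's TX queue state.
 * Only the field names are taken from the patch above. */
struct demo_txq {
	uint32_t wr_p;     /* software write pointer */
	uint32_t rd_p;     /* software read pointer */
	uint32_t tail;     /* next descriptor to clean */
	uint32_t qcp_rd_p; /* cached copy of the hardware read pointer */
};

/* Reset every per-queue index, including the cached hardware pointer. */
static void demo_reset_tx_queue(struct demo_txq *txq)
{
	assert(txq != NULL);
	txq->wr_p = 0;
	txq->rd_p = 0;
	txq->tail = 0;
	txq->qcp_rd_p = 0;
}

/* Descriptors completed since the last poll, assuming the hardware
 * read pointer wraps at ring_size (a power of two). */
static uint32_t demo_completed(const struct demo_txq *txq, uint32_t hw_rd_p,
			       uint32_t ring_size)
{
	return (hw_rd_p - txq->qcp_rd_p) & (ring_size - 1);
}
```

Under these assumptions, a cached value of 5 left over from before the restart makes demo_completed() report 59 completions on a 64-entry ring even though nothing has been sent yet; after demo_reset_tx_queue() it reports 0.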

[dpdk-dev] [PATCH] nfp: fix how tx checksum is advertised to firmware

2016-03-03 Thread Alejandro Lucero

Even with tx checksum offload available, do not set the flag by default.

Fixes: b812daadad0d ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 0e3705e..6078e9f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1543,7 +1543,8 @@ nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
break;
}

-   txd->flags |= PCIE_DESC_TX_CSUM;
+   if (ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK))
+   txd->flags |= PCIE_DESC_TX_CSUM;
 }

 /* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
-- 
1.7.9.5
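
The behaviour introduced by this change can be sketched in isolation as follows. This is only an illustration: the DEMO_* flag values and struct demo_tx_desc are hypothetical, while the real PKT_TX_* and PCIE_DESC_TX_CSUM definitions live in the DPDK and NFP headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_TX_IP_CKSUM  (1ULL << 54)
#define DEMO_TX_L4_MASK   (3ULL << 52)
#define DEMO_DESC_TX_CSUM 0x01u

/* Hypothetical, reduced TX descriptor. */
struct demo_tx_desc {
	uint8_t flags;
};

/* Ask the firmware for checksum offload only when the packet's offload
 * flags request an IP or L4 checksum. */
static void demo_tx_cksum(struct demo_tx_desc *txd, uint64_t ol_flags)
{
	assert(txd != NULL);
	if (ol_flags & (DEMO_TX_IP_CKSUM | DEMO_TX_L4_MASK))
		txd->flags |= DEMO_DESC_TX_CSUM;
}
```

A packet with no offload flags leaves the descriptor untouched, so the firmware does not recompute checksums the application never asked for.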

[dpdk-dev] [PATCH v2] nfp: fix variable type in tx checksum offload

2016-03-03 Thread Alejandro Lucero

The mbuf ol_flags field was changed to uint64_t with DPDK version 1.8.

Fixes: b812daadad0d ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index fd4dd39..0e3705e 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1522,7 +1522,7 @@ static inline void
 nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
 struct rte_mbuf *mb)
 {
-   uint16_t ol_flags;
+   uint64_t ol_flags;
struct nfp_net_hw *hw = txq->hw;

if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
-- 
1.7.9.5
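
The effect of the narrower type can be shown with a short sketch. The flag value below is hypothetical (any bit above bit 15 demonstrates the problem), and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical offload flag living in the upper bits of a 64-bit field. */
#define DEMO_TX_IP_CKSUM (1ULL << 54)

/* Buggy variant: copying into a 16-bit variable drops bits 16..63. */
static int demo_requested_narrow(uint64_t mbuf_ol_flags)
{
	uint16_t ol_flags = (uint16_t)mbuf_ol_flags;

	return (ol_flags & DEMO_TX_IP_CKSUM) != 0;
}

/* Fixed variant: the local variable is as wide as the mbuf field. */
static int demo_requested_wide(uint64_t mbuf_ol_flags)
{
	uint64_t ol_flags = mbuf_ol_flags;

	return (ol_flags & DEMO_TX_IP_CKSUM) != 0;
}
```

With the 16-bit local variable the request is silently lost, so the checksum offload test can never succeed for flags stored in the upper bits.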

[dpdk-dev] [PATCH] nfp: tx checksum offload fixes

2016-03-03 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index fd4dd39..6078e9f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1522,7 +1522,7 @@ static inline void
 nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
 struct rte_mbuf *mb)
 {
-   uint16_t ol_flags;
+   uint64_t ol_flags;
struct nfp_net_hw *hw = txq->hw;

if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
@@ -1543,7 +1543,8 @@ nfp_net_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_tx_desc *txd,
break;
}

-   txd->flags |= PCIE_DESC_TX_CSUM;
+   if (ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK))
+   txd->flags |= PCIE_DESC_TX_CSUM;
 }

 /* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
-- 
1.7.9.5

[dpdk-dev] thoughts on DPDK after a few days of reading sources

2016-02-11 Thread Alejandro Lucero

Hi Seth,

I do not know whether you and Ubuntu are aware of the kernel VFIO no-iommu
mode, which DPDK will use in the future (getting rid of the UIO drivers).

This requires distributions to enable that kernel VFIO mode, which is not
enabled by default because it is a security issue.

It would be good to know Ubuntu's position on this issue and whether there
is any date or plan for supporting it.

Thanks

On Thu, Feb 11, 2016 at 7:58 AM, Thomas Monjalon 
wrote:

> Hi,
>
> 2016-02-10 19:05, Seth Arnold:
> > I've taken some notes while reading the sources; I'm sharing them in the
> > hopes that it's useful: on the one hand my fresh eyes may spot things
> that
> > you've overlooked, on the other hand your familiarity with the code means
> > that you're better suited to judge what I've found.
>
> Thanks for taking time and sharing, it's very valuable.
>
> > - shellcheck reports extensive cases of forgotten quotes to prevent word
> >   splitting or globbing, potentially unused variables, error-prone printf
> >   formatting. The scripts that are going to be used at runtime should be
> >   fixed:
> >   - ./debian/dpdk-init
> >   - ./debian/dpdk.init
>
> These files are not in the tree. Should they?
>
> > - ./drivers/net/cxgbe/cxgbe_ethdev.c eth_cxgbe_dev_init() memory leak in
> >   out_free_adapter: that doesn't free adapter
> > - ./drivers/net/virtio/virtio_ethdev.c virtio_set_multiple_queues() calls
> >   virtio_send_command(), which performs:
> >   memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct
> virtio_pmd_ctrl));
> >   This copies a potentially huge amount of uninitialized data into ->addr
> >   because the struct virtio_pmd_ctrl ctrl was not zeroed before being
> >   passed. How much of this data leaves the system? Does this require a
> >   CVE?
>
> We are not used to open a CVE.
>
> [...]
> >   It's nearly impossible to solve issues without error reporting. Good
> >   error reporting saves admins time and money.
>
> Until now, the errors were reported on the list and most often fixed
> quickly.
> While I agree we need a more formal process (a bug tracker), I think we
> must
> be noticed of new bugs on the mailing list.
> Since nobody was against the bugzilla proposal, a deployment will be
> planned.
> http://dpdk.org/ml/archives/dev/2015-August/023012.html
>

[dpdk-dev] [PATCH] nfp: fix non-x86 build

2016-02-08 Thread Alejandro Lucero

On Sat, Feb 6, 2016 at 9:51 PM, Thomas Monjalon 
wrote:

> The file sys/io.h was included but it can be unavailable in some
> non-x86 toolchains.
> As others system includes in the file nfp_net.c, it seems useless,
> so the easy fix is to remove them.
>
> Signed-off-by: Thomas Monjalon 
> ---
>  drivers/net/nfp/nfp_net.c | 11 ---
>  1 file changed, 11 deletions(-)
>
> diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
> index bc2089f..283269e 100644
> --- a/drivers/net/nfp/nfp_net.c
> +++ b/drivers/net/nfp/nfp_net.c
> @@ -39,18 +39,7 @@
>   * Netronome vNIC DPDK Poll-Mode Driver: Main entry point
>   */
>
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
>  #include 
> -#include 
>
>  #include 
>  #include 
> --
> 2.7.0
>
>
This is fine for me.

Thanks

[dpdk-dev] VFIO no-iommu

2015-12-15 Thread Alejandro Lucero

Hi,

I know a bit about the VFIO implementation, have been debugging IOMMU (Intel)
problems, know how QEMU/KVM works with legacy or VFIO-attached devices, and
I'm the maintainer of a DPDK PMD recently accepted upstream which requires
our particular UIO driver (not maintained upstream). So I guess I could help
with this effort and test the code with our card. Of course, I cannot work
on this full time but I will be happy to contribute.


On Fri, Dec 11, 2015 at 11:20 PM, Jan Viktorin 
wrote:

> Hello,
>
> I am not involved in the vfio very much, however, I was watching some
> vfio-related code in last few weeks. It looks promising to me and
> IMHO it seems to the best way to bring a support of integrated Ethernet
> MACs into DPDK (related to many SoCs). Unfortunately, the ARMv7 SoCs (I
> know) lacks of an IOMMU... The only protection there is the TrustZone
> technology but I have no idea of its support in the kernel. It's also
> far from being a replacement of an IOMMU. When using FPGAs, it is
> possible to put an IOMMU engine there (I've got such a prototype
> somewhere in my VHDL library) but nobody will probably do use because
> of saving on-chip resources.
>
> The X-Gene SoC (ARM 64) contains 2x 10 Gbps EMACs on the chip. I have no
> idea about IOMMUs there. Thus, this platform can probably benefit of
> such driver as well. The question is whether there is some interest to
> have this kind of support in DPDK.
>
> Thus, I'd like to have the vfio/no-iommu to support the ARMv7 (otherwise
> it would be effectively dead in DPDK). Unfortunately, it's not my
> primary job at the moment.
>
> Regards
> Jan
>
> Note: as far as I know, it is discouraged to refer to lkml.org as
> it is often very slow - my case today :).
>
> On Fri, 11 Dec 2015 17:28:43 +0100
> Thomas Monjalon  wrote:
>
> > Recently there were some discussions to have an upstream replacement
> > for our igb_uio module.
> > Several solutions were discussed (new uio driver, uio_pci_generic, vfio):
> >   https://lkml.org/lkml/2015/10/16/700
> >
> > Alex Williamson (maintainer of VFIO driver), submitted a solution
> > and was waiting some feedback. Unfortunately, nobody caught it and
> > he has reverted his work:
> >
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ae5515d
> >
> > It is an important challenge to remove our out-of-tree modules and
> > especially igb_uio. It is a long way to have a standard solution
> integrated
> > in every distributions.
> > The current cooking Linux kernel is 4.4 and will have a long term
> maintenance:
> >   https://kernel.org/releases.html
> > So it is a pity to miss this opportunity.
> >
> > Stephen has fixed a bug to use the IOMMU group zero:
> >   http://dpdk.org/browse/dpdk/commit/?id=22215f141b1
> >
> > Is there someone interested to work on VFIO no-iommu and provide
> > some feedbacks?
> > We also need to prepare a documentation patch to explain its usage
> > compared to the standard VFIO mode.
> >
> > Thanks
>
>

[dpdk-dev] [PATCH v10 8/8] nfp: adding nic guide

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 MAINTAINERS   |1 +
 doc/guides/nics/index.rst |1 +
 doc/guides/nics/nfp.rst   |  265 +
 3 files changed, 267 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index a23de04..b5db75f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -338,6 +338,7 @@ F: drivers/crypto/qat/
 Netronome nfp
 M: Alejandro Lucero 
 F: drivers/net/nfp/
+F: doc/guides/nics/nfp.rst

 Packet processing
 -
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 0a0b724..7bf2938 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -46,6 +46,7 @@ Network Interface Controller Drivers
 intel_vf
 mlx4
 mlx5
+nfp
 szedata2
 virtio
 vmxnet3
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
new file mode 100644
index 000..55ba64d
--- /dev/null
+++ b/doc/guides/nics/nfp.rst
@@ -0,0 +1,265 @@
+..  BSD LICENSE
+Copyright(c) 2015 Netronome Systems, Inc. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+NFP poll mode driver library
+============================
+
+Netronome's sixth generation of flow processors pack 216 programmable
+cores and over 100 hardware accelerators that uniquely combine packet,
+flow, security and content processing in a single device that scales
+up to 400 Gbps.
+
+This document explains how to use DPDK with the Netronome Poll Mode
+Driver (PMD) supporting Netronome's Network Flow Processor 6xxx
+(NFP-6xxx).
+
+Currently the driver supports virtual functions (VFs) only.
+
+Dependencies
+------------
+
+Before using Netronome's DPDK PMD, some NFP-6xxx configuration,
+which is not related to DPDK, is required. The system requires
+installation of **Netronome's BSP (Board Support Package)** which includes
+Linux drivers, programs and libraries.
+
+If you have an NFP-6xxx device you should already have the code and
+documentation for doing this configuration. Contact
+**support at netronome.com** to obtain the latest available firmware.
+
+The NFP Linux kernel drivers (including the required PF driver for the
+NFP) are available on Github at
+**https://github.com/Netronome/nfp-drv-kmods** along with build
+instructions.
+
+DPDK runs in userspace and PMDs use the Linux kernel UIO interface to
+allow access to physical devices from userspace. The NFP PMD requires
+a separate UIO driver, **nfp_uio**, to perform correct
+initialization. This driver is part of Netronome's BSP and it is
+equivalent to Intel's igb_uio driver.
+
+Building the software
+---------------------
+
+Netronome's PMD code is provided in the **drivers/net/nfp** directory.
+Because of Netronome's BSP dependencies, the driver is disabled by default
+in the DPDK build using the **common_linuxapp configuration** file. To enable
+the driver, or if you use another configuration file and want to have NFP
+support, this variable is needed:
+
+- **CONFIG_RTE_LIBRTE_NFP_PMD=y**
+
+Once DPDK is built all the DPDK apps and examples include support for
+the NFP PMD.
+
+
+System configuration
+--------------------
+
+Using the NFP PMD is not different to using other PMDs. Usual steps are:
+
+#. **Configure hugepages:** All major Linux distributions have the hugepages
+   functionality enabled by default. By default this allows the system uses for
+   working with transparent

[dpdk-dev] [PATCH v10 7/8] nfp: link status change interrupt support

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  123 +
 1 file changed, 123 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index ff9a8d6..bc2089f 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,6 +73,9 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static void nfp_net_dev_interrupt_handler(struct rte_intr_handle *handle,
+ void *param);
+static void nfp_net_dev_interrupt_delayed_handler(void *param);
 static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 static void nfp_net_infos_get(struct rte_eth_dev *dev,
  struct rte_eth_dev_info *dev_info);
@@ -731,6 +734,7 @@ nfp_net_close(struct rte_eth_dev *dev)

nfp_net_stop(dev);

+   rte_intr_disable(&dev->pci_dev->intr_handle);
nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);

/*
@@ -1115,6 +1119,114 @@ nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
return count;
 }

+static void
+nfp_net_dev_link_status_print(struct rte_eth_dev *dev)
+{
+   struct rte_eth_link link;
+
+   memset(&link, 0, sizeof(link));
+   nfp_net_dev_atomic_read_link_status(dev, &link);
+   if (link.link_status)
+   RTE_LOG(INFO, PMD, "Port %d: Link Up - speed %u Mbps - %s\n",
+   (int)(dev->data->port_id), (unsigned)link.link_speed,
+   link.link_duplex == ETH_LINK_FULL_DUPLEX
+   ? "full-duplex" : "half-duplex");
+   else
+   RTE_LOG(INFO, PMD, " Port %d: Link Down\n",
+   (int)(dev->data->port_id));
+
+   RTE_LOG(INFO, PMD, "PCI Address: %04d:%02d:%02d:%d\n",
+   dev->pci_dev->addr.domain, dev->pci_dev->addr.bus,
+   dev->pci_dev->addr.devid, dev->pci_dev->addr.function);
+}
+
+/* Interrupt configuration and handling */
+
+/*
+ * nfp_net_irq_unmask - Unmask an interrupt
+ *
+ * If MSI-X auto-masking is enabled clear the mask bit, otherwise
+ * clear the ICR for the entry.
+ */
+static void
+nfp_net_irq_unmask(struct rte_eth_dev *dev)
+{
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
+   /* If MSI-X auto-masking is used, clear the entry */
+   rte_wmb();
+   rte_intr_enable(&dev->pci_dev->intr_handle);
+   } else {
+   /* Make sure all updates are written before un-masking */
+   rte_wmb();
+   nn_cfg_writeb(hw, NFP_NET_CFG_ICR(NFP_NET_IRQ_LSC_IDX),
+ NFP_NET_CFG_ICR_UNMASKED);
+   }
+}
+
+static void
+nfp_net_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+   int64_t timeout;
+   struct rte_eth_link link;
+   struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+   PMD_DRV_LOG(DEBUG, "We got a LSC interrupt!!!\n");
+
+   /* get the link status */
+   memset(&link, 0, sizeof(link));
+   nfp_net_dev_atomic_read_link_status(dev, &link);
+
+   nfp_net_link_update(dev, 0);
+
+   /* likely to up */
+   if (!link.link_status) {
+   /* handle it 1 sec later, wait it being stable */
+   timeout = NFP_NET_LINK_UP_CHECK_TIMEOUT;
+   /* likely to down */
+   } else {
+   /* handle it 4 sec later, wait it being stable */
+   timeout = NFP_NET_LINK_DOWN_CHECK_TIMEOUT;
+   }
+
+   if (rte_eal_alarm_set(timeout * 1000,
+ nfp_net_dev_interrupt_delayed_handler,
+ (void *)dev) < 0) {
+   RTE_LOG(ERR, PMD, "Error setting alarm");
+   /* Unmasking */
+   nfp_net_irq_unmask(dev);
+   }
+}
+
+/*
+ * Interrupt handler which shall be registered for alarm callback for delayed
+ * handling specific interrupt to wait for the stable nic state. As the NIC
+ * interrupt state is not stable for nfp after link is just down, it needs
+ * to wait 4 seconds to get the stable status.
+ *
+ * @param handle   Pointer to interrupt handle.
+ * @param param    The address of parameter (struct rte_eth_dev *)
+ *
+ * @return  void
+ */
+static void
+nfp_net_dev_interrupt_delayed_handler(void *param)
+{
+   struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+   nfp_net_link_update(dev, 0);
+   _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC);
+
+   nfp_net_dev_link_status_print(dev);
+
+   /* Unmasking */
+   nfp_net_irq_unmask(dev);
+}
+
 static int
 nfp_net_dev_mtu_set(struct rte_et

[dpdk-dev] [PATCH v10 6/8] nfp: adding extra functionality

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  191 +
 1 file changed, 191 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 7c82e96..ff9a8d6 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,8 +73,13 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
+static void nfp_net_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
 static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
+static void nfp_net_promisc_enable(struct rte_eth_dev *dev);
+static void nfp_net_promisc_disable(struct rte_eth_dev *dev);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
   uint16_t queue_idx);
@@ -734,6 +739,65 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+static void
+nfp_net_promisc_enable(struct rte_eth_dev *dev)
+{
+   uint32_t new_ctrl, update = 0;
+   struct nfp_net_hw *hw;
+
+   PMD_DRV_LOG(DEBUG, "Promiscuous mode enable\n");
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->cap & NFP_NET_CFG_CTRL_PROMISC)) {
+   PMD_INIT_LOG(INFO, "Promiscuous mode not supported\n");
+   return;
+   }
+
+   if (hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) {
+   PMD_DRV_LOG(INFO, "Promiscuous mode already enabled\n");
+   return;
+   }
+
+   new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_PROMISC;
+   update = NFP_NET_CFG_UPDATE_GEN;
+
+   /*
+* DPDK sets promiscuous mode on just after this call assuming
+* it can not fail ...
+*/
+   if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+   return;
+
+   hw->ctrl = new_ctrl;
+}
+
+static void
+nfp_net_promisc_disable(struct rte_eth_dev *dev)
+{
+   uint32_t new_ctrl, update = 0;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if ((hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) == 0) {
+   PMD_DRV_LOG(INFO, "Promiscuous mode already disabled\n");
+   return;
+   }
+
+   new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_PROMISC;
+   update = NFP_NET_CFG_UPDATE_GEN;
+
+   /*
+* DPDK sets promiscuous mode off just before this call
+* assuming it can not fail ...
+*/
+   if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+   return;
+
+   hw->ctrl = new_ctrl;
+}
+
 /*
  * return 0 means link status changed, -1 means not changed
  *
@@ -948,6 +1012,65 @@ nfp_net_stats_reset(struct rte_eth_dev *dev)
nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
 }

+static void
+nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   dev_info->driver_name = dev->driver->pci_drv.name;
+   dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
+   dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
+   dev_info->min_rx_bufsize = ETHER_MIN_MTU;
+   dev_info->max_rx_pktlen = hw->mtu;
+   /* Next should change when PF support is implemented */
+   dev_info->max_mac_addrs = 1;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN)
+   dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM)
+   dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_IPV4_CKSUM |
+DEV_RX_OFFLOAD_UDP_CKSUM |
+DEV_RX_OFFLOAD_TCP_CKSUM;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_TXCSUM)
+   dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_IPV4_CKSUM |
+DEV_TX_OFFLOAD_UDP_CKSUM |
+DEV_TX_OFFLOAD_TCP_CKSUM;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+   .pthresh = DEFAULT_RX_PTHRESH,
+   .hthresh = DEFAULT_RX_HTHRESH,
+   .wthresh = DEFAULT_RX_WTHRESH,
+   },
+   .rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->de

[dpdk-dev] [PATCH v10 5/8] nfp: adding link functionality

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |   96 +
 1 file changed, 96 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 0912064..7c82e96 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,6 +74,7 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
   uint16_t queue_idx);
@@ -226,6 +227,57 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
   NFP_MEMZONE_ALIGN);
 }

+/*
+ * Atomically reads link status information from global structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to read from.
+ *   - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev,
+   struct rte_eth_link *link)
+{
+   struct rte_eth_link *dst = link;
+   struct rte_eth_link *src = &dev->data->dev_link;
+
+   if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+   *(uint64_t *)src) == 0)
+   return -1;
+
+   return 0;
+}
+
+/*
+ * Atomically writes the link status information into global
+ * structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to read from.
+ *   - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_write_link_status(struct rte_eth_dev *dev,
+struct rte_eth_link *link)
+{
+   struct rte_eth_link *dst = &dev->data->dev_link;
+   struct rte_eth_link *src = link;
+
+   if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+   *(uint64_t *)src) == 0)
+   return -1;
+
+   return 0;
+}
+
 static void
 nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
 {
@@ -682,6 +734,49 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+/*
+ * return 0 means link status changed, -1 means not changed
+ *
+ * Wait to complete is needed as it can take up to 9 seconds to get the Link
+ * status.
+ */
+static int
+nfp_net_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complete)
+{
+   struct nfp_net_hw *hw;
+   struct rte_eth_link link, old;
+   uint32_t nn_link_status;
+
+   PMD_DRV_LOG(DEBUG, "Link update\n");
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   memset(&old, 0, sizeof(old));
+   nfp_net_dev_atomic_read_link_status(dev, &old);
+
+   nn_link_status = nn_cfg_readl(hw, NFP_NET_CFG_STS);
+
+   memset(&link, 0, sizeof(struct rte_eth_link));
+
+   if (nn_link_status & NFP_NET_CFG_STS_LINK)
+   link.link_status = 1;
+
+   link.link_duplex = ETH_LINK_FULL_DUPLEX;
+   /* Other cards can limit the tx and rx rate per VF */
+   link.link_speed = ETH_LINK_SPEED_40G;
+
+   if (old.link_status != link.link_status) {
+   nfp_net_dev_atomic_write_link_status(dev, &link);
+   if (link.link_status)
+   PMD_DRV_LOG(INFO, "NIC Link is Up\n");
+   else
+   PMD_DRV_LOG(INFO, "NIC Link is Down\n");
+   return 0;
+   }
+
+   return -1;
+}
+
 static void
 nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
@@ -1895,6 +1990,7 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
.dev_start  = nfp_net_start,
.dev_stop   = nfp_net_stop,
.dev_close  = nfp_net_close,
+   .link_update= nfp_net_link_update,
.stats_get  = nfp_net_stats_get,
.stats_reset= nfp_net_stats_reset,
.reta_update= nfp_net_reta_update,
-- 
1.7.9.5
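
The two helpers in this patch cast struct rte_eth_link to uint64_t, so they
rely on the whole link state fitting in one 64-bit word that can be read or
written in a single atomic operation. Below is a minimal standalone sketch of
that idea. It is not the driver code: it uses C11 atomics instead of
rte_atomic64_cmpset, and link_state, shared_link and link_state_update are
hypothetical names chosen for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_eth_link: exactly 8 bytes. */
struct link_state {
    uint32_t speed;   /* Mbps */
    uint16_t duplex;  /* 0 = half, 1 = full */
    uint16_t status;  /* 0 = down, 1 = up */
};

_Static_assert(sizeof(struct link_state) == sizeof(uint64_t),
               "link_state must fit in one 64-bit word");

/* Shared copy kept as one atomic word (plays the role of dev_link). */
static _Atomic uint64_t shared_link;

/* Publish a new link state with a single atomic 64-bit store. */
static void link_state_write(const struct link_state *src)
{
    uint64_t word;

    memcpy(&word, src, sizeof(word));
    atomic_store(&shared_link, word);
}

/* Take a consistent snapshot with a single atomic 64-bit load. */
static void link_state_read(struct link_state *dst)
{
    uint64_t word = atomic_load(&shared_link);

    memcpy(dst, &word, sizeof(word));
}

/*
 * Returns 0 if the status changed and was published, -1 otherwise,
 * mirroring the return convention of nfp_net_link_update().
 */
static int link_state_update(uint16_t new_status, uint32_t speed)
{
    struct link_state old;
    struct link_state cur = { .speed = speed, .duplex = 1,
                              .status = new_status };

    link_state_read(&old);
    if (old.status == cur.status)
        return -1;

    link_state_write(&cur);
    return 0;
}
```

Because the state travels as one word, a reader never observes a speed from
one update combined with a status from another.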

[dpdk-dev] [PATCH v10 4/8] nfp: adding stats

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  179 +
 1 file changed, 179 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index a9be403..0912064 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -90,6 +90,9 @@ static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
  uint16_t nb_desc, unsigned int socket_id,
  const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
+static void nfp_net_stats_get(struct rte_eth_dev *dev,
+ struct rte_eth_stats *stats);
+static void nfp_net_stats_reset(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
 static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  uint16_t nb_pkts);
@@ -679,6 +682,177 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+static void
+nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+   int i;
+   struct nfp_net_hw *hw;
+   struct rte_eth_stats nfp_dev_stats;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* RTE_ETHDEV_QUEUE_STAT_CNTRS default value is 16 */
+
+   /* reading per RX ring stats */
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   nfp_dev_stats.q_ipackets[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+   nfp_dev_stats.q_ipackets[i] -=
+   hw->eth_stats_base.q_ipackets[i];
+
+   nfp_dev_stats.q_ibytes[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+
+   nfp_dev_stats.q_ibytes[i] -=
+   hw->eth_stats_base.q_ibytes[i];
+   }
+
+   /* reading per TX ring stats */
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   nfp_dev_stats.q_opackets[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+   nfp_dev_stats.q_opackets[i] -=
+   hw->eth_stats_base.q_opackets[i];
+
+   nfp_dev_stats.q_obytes[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+
+   nfp_dev_stats.q_obytes[i] -=
+   hw->eth_stats_base.q_obytes[i];
+   }
+
+   nfp_dev_stats.ipackets =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+   nfp_dev_stats.ipackets -= hw->eth_stats_base.ipackets;
+
+   nfp_dev_stats.ibytes =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+   nfp_dev_stats.ibytes -= hw->eth_stats_base.ibytes;
+
+   nfp_dev_stats.opackets =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+   nfp_dev_stats.opackets -= hw->eth_stats_base.opackets;
+
+   nfp_dev_stats.obytes =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+   nfp_dev_stats.obytes -= hw->eth_stats_base.obytes;
+
+   nfp_dev_stats.imcasts =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+   nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+   /* reading general device stats */
+   nfp_dev_stats.ierrors =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+   nfp_dev_stats.ierrors -= hw->eth_stats_base.ierrors;
+
+   nfp_dev_stats.oerrors =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+   nfp_dev_stats.oerrors -= hw->eth_stats_base.oerrors;
+
+   /* Multicast frames received */
+   nfp_dev_stats.imcasts =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+   nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+   /* RX ring mbuf allocation failures */
+   nfp_dev_stats.rx_nombuf = dev->data->rx_mbuf_alloc_failed;
+
+   nfp_dev_stats.imissed =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+
+   nfp_dev_stats.imissed -= hw->eth_stats_base.imissed;
+
+   if (stats)
+   memcpy(stats, &nfp_dev_stats, sizeof(*stats));
+}
+
+static void
+nfp_net_stats_reset(struct rte_eth_dev *dev)
+{
+   int i;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /*
+* hw->eth_stats_base records the per counter starting point.
+* Lets update it now
+*/
+
+   /* reading per RX ring stats */
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   hw->eth_s
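
The stats code above never clears the hardware counters. nfp_net_stats_reset
records the current register values in hw->eth_stats_base, and
nfp_net_stats_get reports the difference between a fresh read and that base.
A minimal sketch of the pattern, with a hypothetical soft_counter type
standing in for one register and its saved base:

```c
#include <stdint.h>

/* Hypothetical model: a free-running hardware counter plus a software base. */
struct soft_counter {
    uint64_t hw;    /* value the device register would return */
    uint64_t base;  /* snapshot taken at the last reset */
};

/* Value reported to the application: counts since the last reset. */
static uint64_t counter_get(const struct soft_counter *c)
{
    /* Unsigned subtraction stays correct even if hw wrapped past base. */
    return c->hw - c->base;
}

/* "Reset" only moves the base; the hardware register is left untouched. */
static void counter_reset(struct soft_counter *c)
{
    c->base = c->hw;
}
```

Keeping the base in software is a reasonable choice when the counters cannot
be cleared by the driver, which is an assumption here rather than something
the patch states.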

[dpdk-dev] [PATCH v10 3/8] nfp: adding rss

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  218 +
 1 file changed, 218 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 0d85fa4..a9be403 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1501,12 +1501,230 @@ xmit_end:
return i;
 }

+/* Update Redirection Table(RETA) of Receive Side Scaling of Ethernet device */
+static int
+nfp_net_reta_update(struct rte_eth_dev *dev,
+   struct rte_eth_rss_reta_entry64 *reta_conf,
+   uint16_t reta_size)
+{
+   uint32_t reta, mask;
+   int i, j;
+   int idx, shift;
+   uint32_t update;
+   struct nfp_net_hw *hw =
+   NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+   return -EINVAL;
+
+   if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+   RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+   "(%d) doesn't match the number the hardware can support "
+   "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+   return -EINVAL;
+   }
+
+   /*
+* Update Redirection Table. There are 128 8bit-entries which can be
+* managed as 32 32bit-entries
+*/
+   for (i = 0; i < reta_size; i += 4) {
+   /* Handling 4 RSS entries per loop */
+   idx = i / RTE_RETA_GROUP_SIZE;
+   shift = i % RTE_RETA_GROUP_SIZE;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+   if (!mask)
+   continue;
+
+   reta = 0;
+   /* If all 4 entries were set, don't need read RETA register */
+   if (mask != 0xF)
+   reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+
+   for (j = 0; j < 4; j++) {
+   if (!(mask & (0x1 << j)))
+   continue;
+   if (mask != 0xF)
+   /* Clearing the entry bits */
+   reta &= ~(0xFF << (8 * j));
+   reta |= reta_conf[idx].reta[shift + j] << (8 * j);
+   }
+   nn_cfg_writel(hw, NFP_NET_CFG_RSS_ITBL + i, reta);
+   }
+
+   update = NFP_NET_CFG_UPDATE_RSS;
+
+   if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+   return -EIO;
+
+   return 0;
+}
+
+/* Query Redirection Table(RETA) of Receive Side Scaling of Ethernet device. */
+static int
+nfp_net_reta_query(struct rte_eth_dev *dev,
+  struct rte_eth_rss_reta_entry64 *reta_conf,
+  uint16_t reta_size)
+{
+   uint8_t i, j, mask;
+   int idx, shift;
+   uint32_t reta;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+   return -EINVAL;
+
+   if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+   RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+   "(%d) doesn't match the number the hardware can support "
+   "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+   return -EINVAL;
+   }
+
+   /*
+* Reading Redirection Table. There are 128 8bit-entries which can be
+* managed as 32 32bit-entries
+*/
+   for (i = 0; i < reta_size; i += 4) {
+   /* Handling 4 RSS entries per loop */
+   idx = i / RTE_RETA_GROUP_SIZE;
+   shift = i % RTE_RETA_GROUP_SIZE;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+   if (!mask)
+   continue;
+
+   reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+   for (j = 0; j < 4; j++) {
+   if (!(mask & (0x1 << j)))
+   continue;
+   reta_conf[idx].reta[shift + j] =
+   (uint8_t)((reta >> (8 * j)) & 0xFF);
+   }
+   }
+   return 0;
+}
+
+static int
+nfp_net_rss_hash_update(struct rte_eth_dev *dev,
+   struct rte_eth_rss_conf *rss_conf)
+{
+   uint32_t update;
+   uint32_t cfg_rss_ctrl = 0;
+   uint8_t key;
+   uint64_t rss_hf;
+   int i;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   rss_hf = rss_conf->rss_hf;
+
+   /* Checking if RSS is enabled */
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) {
+   if (rss_hf != 0) { /* Enable RSS? */
+
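
The RETA comments in this patch describe 128 8-bit entries handled as 32
32-bit words, four entries per register access, with a 4-bit mask selecting
which entries to touch. The helpers below are a standalone sketch of that
packing only. reta_merge_word and reta_extract are hypothetical names, and
unlike the patch the shifts are done on unsigned values so that shifting into
bit 31 is well defined.

```c
#include <stdint.h>

/*
 * Merge up to four 8-bit RETA entries into one 32-bit table word.
 * Bit j of mask selects entry j; unselected bytes keep their old value.
 */
static uint32_t reta_merge_word(uint32_t old_word, const uint8_t entries[4],
                                uint8_t mask)
{
    uint32_t word = old_word;
    int j;

    for (j = 0; j < 4; j++) {
        if (!(mask & (1u << j)))
            continue;
        /* Clear the byte for entry j, then write the new value. */
        word &= ~(UINT32_C(0xFF) << (8 * j));
        word |= (uint32_t)entries[j] << (8 * j);
    }
    return word;
}

/* Extract entry j (0..3) from a 32-bit table word; each entry is a full byte. */
static uint8_t reta_extract(uint32_t word, int j)
{
    return (uint8_t)((word >> (8 * j)) & 0xFF);
}
```

When all four mask bits are set the old word is fully overwritten, which is
why the update path can skip reading the register in that case.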

[dpdk-dev] [PATCH v10 2/8] nfp: adding rx/tx functionality

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  993 +
 1 file changed, 993 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index b9240db..0d85fa4 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,8 +74,25 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
+static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
+  uint16_t queue_idx);
+static uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+static void nfp_net_rx_queue_release(void *rxq);
+static int nfp_net_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf,
+ struct rte_mempool *mp);
+static int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
+static void nfp_net_tx_queue_release(void *txq);
+static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
+static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);

 /*
  * The offset of the queue controller queues in the PCIe Target. These
@@ -186,6 +203,100 @@ nn_cfg_writeq(struct nfp_net_hw *hw, int off, uint64_t val)
nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off);
 }

+/* Creating memzone for hardware rings. */
+static const struct rte_memzone *
+ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
+ uint16_t queue_id, uint32_t ring_size, int socket_id)
+{
+   char z_name[RTE_MEMZONE_NAMESIZE];
+   const struct rte_memzone *mz;
+
+   snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+dev->driver->pci_drv.name,
+ring_name, dev->data->port_id, queue_id);
+
+   mz = rte_memzone_lookup(z_name);
+   if (mz)
+   return mz;
+
+   return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0,
+  NFP_MEMZONE_ALIGN);
+}
+
+static void
+nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
+{
+   unsigned i;
+
+   if (rxq->rxbufs == NULL)
+   return;
+
+   for (i = 0; i < rxq->rx_count; i++) {
+   if (rxq->rxbufs[i].mbuf) {
+   rte_pktmbuf_free_seg(rxq->rxbufs[i].mbuf);
+   rxq->rxbufs[i].mbuf = NULL;
+   }
+   }
+}
+
+static void
+nfp_net_rx_queue_release(void *rx_queue)
+{
+   struct nfp_net_rxq *rxq = rx_queue;
+
+   if (rxq) {
+   nfp_net_rx_queue_release_mbufs(rxq);
+   rte_free(rxq->rxbufs);
+   rte_free(rxq);
+   }
+}
+
+static void
+nfp_net_reset_rx_queue(struct nfp_net_rxq *rxq)
+{
+   nfp_net_rx_queue_release_mbufs(rxq);
+   rxq->wr_p = 0;
+   rxq->rd_p = 0;
+   rxq->nb_rx_hold = 0;
+}
+
+static void
+nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)
+{
+   unsigned i;
+
+   if (txq->txbufs == NULL)
+   return;
+
+   for (i = 0; i < txq->tx_count; i++) {
+   if (txq->txbufs[i].mbuf) {
+   rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+   txq->txbufs[i].mbuf = NULL;
+   }
+   }
+}
+
+static void
+nfp_net_tx_queue_release(void *tx_queue)
+{
+   struct nfp_net_txq *txq = tx_queue;
+
+   if (txq) {
+   nfp_net_tx_queue_release_mbufs(txq);
+   rte_free(txq->txbufs);
+   rte_free(txq);
+   }
+}
+
+static void
+nfp_net_reset_tx_queue(struct nfp_net_txq *txq)
+{
+   nfp_net_tx_queue_release_mbufs(txq);
+   txq->wr_p = 0;
+   txq->rd_p = 0;
+   txq->tail = 0;
+}
+
 static int
 __nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t update)
 {
@@ -423,6 +534,18 @@ nfp_net_disable_queues(struct rte_eth_dev *dev)
hw->ctrl = new_ctrl;
 }

+static int
+nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
+{
+   int i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (nfp_net_rx_fill_freelist(dev->data->rx_queues[i]) < 0)
+   return -1;
+   }
+   return 0;
+}
+
 static void
 nfp_net_params_setup(struct nfp_net_hw *

[dpdk-dev] [PATCH v10 1/8] nfp: basic initialization

2015-11-30 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 MAINTAINERS |3 +
 config/common_linuxapp  |6 +
 doc/guides/rel_notes/release_2_2.rst|3 +
 drivers/net/Makefile|1 +
 drivers/net/nfp/Makefile|   56 +++
 drivers/net/nfp/nfp_net.c   |  699 +++
 drivers/net/nfp/nfp_net_ctrl.h  |  324 ++
 drivers/net/nfp/nfp_net_logs.h  |   75 
 drivers/net/nfp/nfp_net_pmd.h   |  453 
 drivers/net/nfp/rte_pmd_nfp_version.map |3 +
 mk/rte.app.mk   |1 +
 11 files changed, 1624 insertions(+)
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 4478862..a23de04 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -335,6 +335,9 @@ F: drivers/crypto/aesni_mb/
 Intel QuickAssist
 F: drivers/crypto/qat/

+Netronome nfp
+M: Alejandro Lucero 
+F: drivers/net/nfp/

 Packet processing
 -----------------
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2866986..82f68c7 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -279,6 +279,12 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_DRIVER=n

 #
+# Compile burst-oriented Netronome NFP PMD driver
+#
+CONFIG_RTE_LIBRTE_NFP_PMD=n
+CONFIG_RTE_LIBRTE_NFP_DEBUG=n
+
+#
 # Compile example software rings based PMD
 #
 CONFIG_RTE_LIBRTE_PMD_RING=y
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 511d7a0..0a7c217 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -230,6 +230,9 @@ Libraries
   hardware transactional memory support, thread scaling did not work,
   due to the global ring that is shared by all cores.

+* **nfp: adding new PMD for Netronome nfp-6xxx card.**
+
+  Support for using Netronome nfp-6xxx with PCI VFs.

 Examples
 
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index cddcd57..6e4497e 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -43,6 +43,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += mpipe
+DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += null
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += pcap
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring
diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile
new file mode 100644
index 000..ef7a13d
--- /dev/null
+++ b/drivers/net/nfp/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_nfp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_nfp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp_net.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_P

[dpdk-dev] [PATCH v10 0/8] support for netronome nfp-6xxx card

2015-11-30 Thread Alejandro Lucero

This patchset adds a new PMD for Netronome nfp-6xxx card.
Just PCI Virtual Functions support.
Using this PMD requires previous Netronome BSP installation.

v10:
 - Getting rid of __u8 usage
 - Squashing last two patches in one

v9:
 - Adding flag RTE_PCI_DRV_INTR_LSC
 - Makefile changes for compilation as a shared library
 - Adding map file for linker version script info

v8:
 - removing remaining unnecessary flags to PMD Makefile

v7:
 - Adding support for link status changes interrupts.
 - removing unnecessary flags when compiling the PMD.

v6:
 - Making each patch compilable.

v5:
 - Splitting up patches per functionality.

v4:
 - Getting rid of nfp_uio. Just submitting PMD.
 - Removing LSC interrupt support.

v3:
 - Making all patches independent for applying and building.
 - changing commits messages following standard.

v2:
 - Code style changes based on checkpatch.pl and DPDK style guide.
 - Documentation changes using the right rst format.
 - Moving the documentation files to a new patch file.
 - Adding info to MAINTAINERS and release files.

Alejandro Lucero (8):
  nfp: basic initialization
  nfp: adding rx/tx functionality
  nfp: adding rss
  nfp: adding stats
  nfp: adding link functionality
  nfp: adding extra functionality
  nfp: link status change interrupt support
  nfp: adding nic guide

 MAINTAINERS |4 +
 config/common_linuxapp  |6 +
 doc/guides/nics/index.rst   |1 +
 doc/guides/nics/nfp.rst |  265 
 doc/guides/rel_notes/release_2_2.rst|3 +
 drivers/net/Makefile|1 +
 drivers/net/nfp/Makefile|   56 +
 drivers/net/nfp/nfp_net.c   | 2499 +++
 drivers/net/nfp/nfp_net_ctrl.h  |  324 
 drivers/net/nfp/nfp_net_logs.h  |   75 +
 drivers/net/nfp/nfp_net_pmd.h   |  453 ++
 drivers/net/nfp/rte_pmd_nfp_version.map |3 +
 mk/rte.app.mk   |1 +
 13 files changed, 3691 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

-- 
1.7.9.5

[dpdk-dev] [PATCH v9 1/9] nfp: basic initialization

2015-11-27 Thread Alejandro Lucero

I converted (almost) all the Linux typedefs. This one went under the radar.

We do not have anything like ixgbe/base, but maybe it makes sense.

Should I send a new patchset version for fixing this "minor nit"?

Thanks

On Thu, Nov 26, 2015 at 6:14 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Thu, 26 Nov 2015 09:49:21 +
> Alejandro Lucero  wrote:
>
> > +static inline void
> > +nfp_qcp_ptr_add(__u8 *q, enum nfp_qcp_ptr ptr, uint32_t val)
> > +{
> > + uint32_t off;
>
> Minor nit. why mix use of Linux specific basic size typedefs (__u8)
> with Posix standard values (uint32_t). The DPDK style is to use
> the Posix types except in kernel drivers or code that is coming
> from unified drivers (ie ixgbe/base)
>

[dpdk-dev] [PATCH v9 8/9] nfp: adding nic guide

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 doc/guides/nics/index.rst |1 +
 doc/guides/nics/nfp.rst   |  265 +
 2 files changed, 266 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst

diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 0a0b724..7bf2938 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -46,6 +46,7 @@ Network Interface Controller Drivers
 intel_vf
 mlx4
 mlx5
+nfp
 szedata2
 virtio
 vmxnet3
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
new file mode 100644
index 000..55ba64d
--- /dev/null
+++ b/doc/guides/nics/nfp.rst
@@ -0,0 +1,265 @@
+..  BSD LICENSE
+Copyright(c) 2015 Netronome Systems, Inc. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+NFP poll mode driver library
+============================
+
+Netronome's sixth generation of flow processors pack 216 programmable
+cores and over 100 hardware accelerators that uniquely combine packet,
+flow, security and content processing in a single device that scales
+up to 400 Gbps.
+
+This document explains how to use DPDK with the Netronome Poll Mode
+Driver (PMD) supporting Netronome's Network Flow Processor 6xxx
+(NFP-6xxx).
+
+Currently the driver supports virtual functions (VFs) only.
+
+Dependencies
+------------
+
+Before using the Netronome's DPDK PMD some NFP-6xxx configuration,
+which is not related to DPDK, is required. The system requires
+installation of **Netronome's BSP (Board Support Package)** which includes
+Linux drivers, programs and libraries.
+
+If you have a NFP-6xxx device you should already have the code and
+documentation for doing this configuration. Contact
+**support at netronome.com** to obtain the latest available firmware.
+
+The NFP Linux kernel drivers (including the required PF driver for the
+NFP) are available on Github at
+**https://github.com/Netronome/nfp-drv-kmods** along with build
+instructions.
+
+DPDK runs in userspace and PMDs use the Linux kernel UIO interface to
+allow access to physical devices from userspace. The NFP PMD requires
+a separate UIO driver, **nfp_uio**, to perform correct
+initialization. This driver is part of Netronome's BSP and it is
+equivalent to Intel's igb_uio driver.
+
+Building the software
+---------------------
+
+Netronome's PMD code is provided in the **drivers/net/nfp** directory.
+Because of Netronome's BSP dependencies, the driver is disabled by default
+in the DPDK build using the **common_linuxapp configuration** file. To enable
+the driver, or if you use another configuration file and want NFP
+support, this variable is needed:
+
+- **CONFIG_RTE_LIBRTE_NFP_PMD=y**
+
+Once DPDK is built all the DPDK apps and examples include support for
+the NFP PMD.
+
+
+System configuration
+--------------------
+
+Using the NFP PMD is no different from using other PMDs. The usual steps are:
+
+#. **Configure hugepages:** All major Linux distributions have the hugepages
+   functionality enabled by default. By default this allows the system to
+   work with transparent hugepages. But in this case some hugepages need to
+   be created/reserved for use with the DPDK through the hugetlbfs file system.
+   First the virtual file system needs to be mounted:
+
+   .. code-block:: console
+
+  mount -t hugetlbfs none /mnt/hugetlbfs
+
+   The command uses the common mount point for this f

[dpdk-dev] [PATCH v9 7/9] nfp: link status change interrupt support

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  123 +
 1 file changed, 123 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 5383f51..3763790 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,6 +73,9 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static void nfp_net_dev_interrupt_handler(struct rte_intr_handle *handle,
+ void *param);
+static void nfp_net_dev_interrupt_delayed_handler(void *param);
 static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 static void nfp_net_infos_get(struct rte_eth_dev *dev,
  struct rte_eth_dev_info *dev_info);
@@ -731,6 +734,7 @@ nfp_net_close(struct rte_eth_dev *dev)

nfp_net_stop(dev);

+   rte_intr_disable(&dev->pci_dev->intr_handle);
nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);

/*
@@ -1115,6 +1119,114 @@ nfp_net_rx_queue_count(struct rte_eth_dev *dev, uint16_t queue_idx)
return count;
 }

+static void
+nfp_net_dev_link_status_print(struct rte_eth_dev *dev)
+{
+   struct rte_eth_link link;
+
+   memset(&link, 0, sizeof(link));
+   nfp_net_dev_atomic_read_link_status(dev, &link);
+   if (link.link_status)
+   RTE_LOG(INFO, PMD, "Port %d: Link Up - speed %u Mbps - %s\n",
+   (int)(dev->data->port_id), (unsigned)link.link_speed,
+   link.link_duplex == ETH_LINK_FULL_DUPLEX
+   ? "full-duplex" : "half-duplex");
+   else
+   RTE_LOG(INFO, PMD, " Port %d: Link Down\n",
+   (int)(dev->data->port_id));
+
+   RTE_LOG(INFO, PMD, "PCI Address: %04d:%02d:%02d:%d\n",
+   dev->pci_dev->addr.domain, dev->pci_dev->addr.bus,
+   dev->pci_dev->addr.devid, dev->pci_dev->addr.function);
+}
+
+/* Interrupt configuration and handling */
+
+/*
+ * nfp_net_irq_unmask - Unmask an interrupt
+ *
+ * If MSI-X auto-masking is enabled clear the mask bit, otherwise
+ * clear the ICR for the entry.
+ */
+static void
+nfp_net_irq_unmask(struct rte_eth_dev *dev)
+{
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (hw->ctrl & NFP_NET_CFG_CTRL_MSIXAUTO) {
+   /* If MSI-X auto-masking is used, clear the entry */
+   rte_wmb();
+   rte_intr_enable(&dev->pci_dev->intr_handle);
+   } else {
+   /* Make sure all updates are written before un-masking */
+   rte_wmb();
+   nn_cfg_writeb(hw, NFP_NET_CFG_ICR(NFP_NET_IRQ_LSC_IDX),
+ NFP_NET_CFG_ICR_UNMASKED);
+   }
+}
+
+static void
+nfp_net_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
+ void *param)
+{
+   int64_t timeout;
+   struct rte_eth_link link;
+   struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+   PMD_DRV_LOG(DEBUG, "We got a LSC interrupt!!!\n");
+
+   /* get the link status */
+   memset(&link, 0, sizeof(link));
+   nfp_net_dev_atomic_read_link_status(dev, &link);
+
+   nfp_net_link_update(dev, 0);
+
+   /* likely to up */
+   if (!link.link_status) {
+   /* handle it 1 sec later, wait it being stable */
+   timeout = NFP_NET_LINK_UP_CHECK_TIMEOUT;
+   /* likely to down */
+   } else {
+   /* handle it 4 sec later, wait it being stable */
+   timeout = NFP_NET_LINK_DOWN_CHECK_TIMEOUT;
+   }
+
+   if (rte_eal_alarm_set(timeout * 1000,
+ nfp_net_dev_interrupt_delayed_handler,
+ (void *)dev) < 0) {
+   RTE_LOG(ERR, PMD, "Error setting alarm\n");
+   /* Unmasking */
+   nfp_net_irq_unmask(dev);
+   }
+}
+
+/*
+ * Interrupt handler which shall be registered for alarm callback for delayed
+ * handling specific interrupt to wait for the stable nic state. As the NIC
+ * interrupt state is not stable for nfp after link is just down, it needs
+ * to wait 4 seconds to get the stable status.
+ *
+ * @param handle   Pointer to interrupt handle.
+ * @param param    The address of parameter (struct rte_eth_dev *)
+ *
+ * @return  void
+ */
+static void
+nfp_net_dev_interrupt_delayed_handler(void *param)
+{
+   struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
+
+   nfp_net_link_update(dev, 0);
+   _rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_LSC);
+
+   nfp_net_dev_link_status_print(dev);
+
+   /* Unmasking */
+   nfp_net_irq_unmask(dev);
+}
+
 static int
 nfp_net_dev_mtu_set(struct rte_et
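
In the interrupt handler above, the re-check is scheduled with a delay that
depends on the link state before the interrupt: a link that was down is
likely coming up and is re-checked sooner, while a link that was up is likely
going down and is given longer to settle. The sketch below restates just that
selection. The millisecond values are assumptions taken from the "1 sec" and
"4 sec" comments in the patch, since the real macro definitions are not shown
in this thread, and lsc_recheck_delay_us is a hypothetical name.

```c
#include <stdint.h>

/* Assumed values in milliseconds, based on the comments in the handler. */
#define LINK_UP_CHECK_TIMEOUT_MS   1000
#define LINK_DOWN_CHECK_TIMEOUT_MS 4000

/*
 * Delay, in microseconds, before re-checking the link after an LSC
 * interrupt. link_was_up is the link status read before the update.
 */
static uint64_t lsc_recheck_delay_us(int link_was_up)
{
    int64_t timeout_ms = link_was_up ? LINK_DOWN_CHECK_TIMEOUT_MS
                                     : LINK_UP_CHECK_TIMEOUT_MS;

    /* The handler multiplies by 1000 because the alarm takes microseconds. */
    return (uint64_t)timeout_ms * 1000;
}
```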

[dpdk-dev] [PATCH v9 6/9] nfp: adding extra functionality

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  191 +
 1 file changed, 191 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 567ea26..5383f51 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -73,8 +73,13 @@
 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
+static int nfp_net_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
+static void nfp_net_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
 static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
+static void nfp_net_promisc_enable(struct rte_eth_dev *dev);
+static void nfp_net_promisc_disable(struct rte_eth_dev *dev);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
   uint16_t queue_idx);
@@ -734,6 +739,65 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+static void
+nfp_net_promisc_enable(struct rte_eth_dev *dev)
+{
+   uint32_t new_ctrl, update = 0;
+   struct nfp_net_hw *hw;
+
+   PMD_DRV_LOG(DEBUG, "Promiscuous mode enable\n");
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->cap & NFP_NET_CFG_CTRL_PROMISC)) {
+   PMD_INIT_LOG(INFO, "Promiscuous mode not supported\n");
+   return;
+   }
+
+   if (hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) {
+   PMD_DRV_LOG(INFO, "Promiscuous mode already enabled\n");
+   return;
+   }
+
+   new_ctrl = hw->ctrl | NFP_NET_CFG_CTRL_PROMISC;
+   update = NFP_NET_CFG_UPDATE_GEN;
+
+   /*
+* DPDK sets promiscuous mode on just after this call assuming
+* it can not fail ...
+*/
+   if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+   return;
+
+   hw->ctrl = new_ctrl;
+}
+
+static void
+nfp_net_promisc_disable(struct rte_eth_dev *dev)
+{
+   uint32_t new_ctrl, update = 0;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if ((hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) == 0) {
+   PMD_DRV_LOG(INFO, "Promiscuous mode already disabled\n");
+   return;
+   }
+
+   new_ctrl = hw->ctrl & ~NFP_NET_CFG_CTRL_PROMISC;
+   update = NFP_NET_CFG_UPDATE_GEN;
+
+   /*
+* DPDK sets promiscuous mode off just before this call
+* assuming it can not fail ...
+*/
+   if (nfp_net_reconfig(hw, new_ctrl, update) < 0)
+   return;
+
+   hw->ctrl = new_ctrl;
+}
+
 /*
  * return 0 means link status changed, -1 means not changed
  *
@@ -948,6 +1012,65 @@ nfp_net_stats_reset(struct rte_eth_dev *dev)
nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
 }

+static void
+nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   dev_info->driver_name = dev->driver->pci_drv.name;
+   dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues;
+   dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues;
+   dev_info->min_rx_bufsize = ETHER_MIN_MTU;
+   dev_info->max_rx_pktlen = hw->mtu;
+   /* Next should change when PF support is implemented */
+   dev_info->max_mac_addrs = 1;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RXVLAN)
+   dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_RXCSUM)
+   dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_IPV4_CKSUM |
+DEV_RX_OFFLOAD_UDP_CKSUM |
+DEV_RX_OFFLOAD_TCP_CKSUM;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_TXVLAN)
+   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT;
+
+   if (hw->cap & NFP_NET_CFG_CTRL_TXCSUM)
+   dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_IPV4_CKSUM |
+DEV_TX_OFFLOAD_UDP_CKSUM |
+DEV_TX_OFFLOAD_TCP_CKSUM;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+   .pthresh = DEFAULT_RX_PTHRESH,
+   .hthresh = DEFAULT_RX_HTHRESH,
+   .wthresh = DEFAULT_RX_WTHRESH,
+   },
+   .rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->de
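
The promiscuous-mode callbacks in the patch above follow a capability/control pattern: check the advertised capability bit, skip if the cached control word already has the bit, ask the device to reconfigure, and update the cached control word only on success. A minimal sketch of that pattern follows. It assumes a stand-in `fake_hw` struct and a `fake_reconfig` function in place of the driver's `struct nfp_net_hw` and `nfp_net_reconfig`, and the bit value is hypothetical. Unlike the driver callbacks, which return void, the sketch returns a status so the outcome can be checked.

```c
#include <stdint.h>

/* Hypothetical bit value; the real NFP_NET_CFG_CTRL_PROMISC is defined by the driver. */
#define SKETCH_CTRL_PROMISC (1u << 1)

/* Stand-in for struct nfp_net_hw (assumption, not the driver's layout). */
struct fake_hw {
	uint32_t cap;          /* features the firmware advertises */
	uint32_t ctrl;         /* cached copy of the enabled features */
	int fail_reconfig;     /* test knob: force the reconfig to fail */
};

/* Stand-in for nfp_net_reconfig(): pretends to push new_ctrl to the device. */
static int fake_reconfig(struct fake_hw *hw, uint32_t new_ctrl)
{
	(void)new_ctrl;
	return hw->fail_reconfig ? -1 : 0;
}

/* Returns 0 if promiscuous mode ends up enabled, -1 otherwise. */
static int sketch_promisc_enable(struct fake_hw *hw)
{
	uint32_t new_ctrl;

	if (!(hw->cap & SKETCH_CTRL_PROMISC))
		return -1;      /* capability not advertised */

	if (hw->ctrl & SKETCH_CTRL_PROMISC)
		return 0;       /* already enabled, nothing to do */

	new_ctrl = hw->ctrl | SKETCH_CTRL_PROMISC;
	if (fake_reconfig(hw, new_ctrl) < 0)
		return -1;      /* cached ctrl is left untouched on failure */

	hw->ctrl = new_ctrl;
	return 0;
}
```

Updating the cached control word only after a successful reconfig keeps the software view consistent with what the device actually accepted.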

[dpdk-dev] [PATCH v9 5/9] nfp: adding link functionality

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |   96 +
 1 file changed, 96 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index fc02916..567ea26 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,6 +74,7 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_link_update(struct rte_eth_dev *dev, int wait_to_complete);
 static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
 static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
   uint16_t queue_idx);
@@ -226,6 +227,57 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
   NFP_MEMZONE_ALIGN);
 }

+/*
+ * Atomically reads link status information from global structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to read from.
+ *   - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev,
+   struct rte_eth_link *link)
+{
+   struct rte_eth_link *dst = link;
+   struct rte_eth_link *src = &dev->data->dev_link;
+
+   if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+   *(uint64_t *)src) == 0)
+   return -1;
+
+   return 0;
+}
+
+/*
+ * Atomically writes the link status information into global
+ * structure rte_eth_dev.
+ *
+ * @param dev
+ *   - Pointer to the structure rte_eth_dev to read from.
+ *   - Pointer to the buffer to be saved with the link status.
+ *
+ * @return
+ *   - On success, zero.
+ *   - On failure, negative value.
+ */
+static inline int
+nfp_net_dev_atomic_write_link_status(struct rte_eth_dev *dev,
+struct rte_eth_link *link)
+{
+   struct rte_eth_link *dst = &dev->data->dev_link;
+   struct rte_eth_link *src = link;
+
+   if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
+   *(uint64_t *)src) == 0)
+   return -1;
+
+   return 0;
+}
+
 static void
 nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
 {
@@ -682,6 +734,49 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+/*
+ * return 0 means link status changed, -1 means not changed
+ *
+ * Wait to complete is needed as it can take up to 9 seconds to get the Link
+ * status.
+ */
+static int
+nfp_net_link_update(struct rte_eth_dev *dev, __rte_unused int wait_to_complete)
+{
+   struct nfp_net_hw *hw;
+   struct rte_eth_link link, old;
+   uint32_t nn_link_status;
+
+   PMD_DRV_LOG(DEBUG, "Link update\n");
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   memset(&old, 0, sizeof(old));
+   nfp_net_dev_atomic_read_link_status(dev, &old);
+
+   nn_link_status = nn_cfg_readl(hw, NFP_NET_CFG_STS);
+
+   memset(&link, 0, sizeof(struct rte_eth_link));
+
+   if (nn_link_status & NFP_NET_CFG_STS_LINK)
+   link.link_status = 1;
+
+   link.link_duplex = ETH_LINK_FULL_DUPLEX;
+   /* Other cards can limit the tx and rx rate per VF */
+   link.link_speed = ETH_LINK_SPEED_40G;
+
+   if (old.link_status != link.link_status) {
+   nfp_net_dev_atomic_write_link_status(dev, &link);
+   if (link.link_status)
+   PMD_DRV_LOG(INFO, "NIC Link is Up\n");
+   else
+   PMD_DRV_LOG(INFO, "NIC Link is Down\n");
+   return 0;
+   }
+
+   return -1;
+}
+
 static void
 nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
 {
@@ -1895,6 +1990,7 @@ static struct eth_dev_ops nfp_net_eth_dev_ops = {
.dev_start  = nfp_net_start,
.dev_stop   = nfp_net_stop,
.dev_close  = nfp_net_close,
+   .link_update= nfp_net_link_update,
.stats_get  = nfp_net_stats_get,
.stats_reset= nfp_net_stats_reset,
.reta_update= nfp_net_reta_update,
-- 
1.7.9.5
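
The link update logic in this patch reads the previous link state, builds the new one from the hardware status, and stores it only when the status changed, returning 0 for "changed" and -1 for "not changed". A minimal sketch of that flow follows. It swaps the patch's `rte_atomic64_cmpset` on a cast struct for C11 `<stdatomic.h>` operations on an explicitly packed 64-bit word. The `sketch_*` names, the field layout, and the fixed 40000 Mbps speed are assumptions made for illustration only.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for struct rte_eth_link; the field layout is an assumption. */
struct sketch_link {
	uint32_t speed;    /* Mbps */
	uint16_t duplex;   /* 1 = full duplex */
	uint16_t status;   /* 1 = link up */
};

/* Pack the three fields into one 64-bit word so it can be stored atomically. */
static uint64_t sketch_pack(struct sketch_link l)
{
	return ((uint64_t)l.speed << 32) | ((uint64_t)l.duplex << 16) | l.status;
}

static struct sketch_link sketch_unpack(uint64_t v)
{
	struct sketch_link l;

	l.speed = (uint32_t)(v >> 32);
	l.duplex = (uint16_t)(v >> 16);
	l.status = (uint16_t)v;
	return l;
}

/* Shared link word, playing the role of dev->data->dev_link. */
static _Atomic uint64_t sketch_dev_link;

/* Returns 0 if the link status changed, -1 if not (same convention as the patch). */
static int sketch_link_update(int hw_link_up)
{
	struct sketch_link old = sketch_unpack(atomic_load(&sketch_dev_link));
	struct sketch_link now = { .speed = 40000, .duplex = 1,
				   .status = hw_link_up ? 1 : 0 };

	if (old.status == now.status)
		return -1;

	atomic_store(&sketch_dev_link, sketch_pack(now));
	return 0;
}
```

Keeping the whole link state in a single 64-bit word is what lets a reader on another core see either the old state or the new one, never a mix of both.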

[dpdk-dev] [PATCH v9 4/9] nfp: adding stats

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  179 +
 1 file changed, 179 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 8451a49..fc02916 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -90,6 +90,9 @@ static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
  uint16_t nb_desc, unsigned int socket_id,
  const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
+static void nfp_net_stats_get(struct rte_eth_dev *dev,
+ struct rte_eth_stats *stats);
+static void nfp_net_stats_reset(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
 static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  uint16_t nb_pkts);
@@ -679,6 +682,177 @@ nfp_net_close(struct rte_eth_dev *dev)
 */
 }

+static void
+nfp_net_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+   int i;
+   struct nfp_net_hw *hw;
+   struct rte_eth_stats nfp_dev_stats;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* RTE_ETHDEV_QUEUE_STAT_CNTRS default value is 16 */
+
+   /* reading per RX ring stats */
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   nfp_dev_stats.q_ipackets[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i));
+
+   nfp_dev_stats.q_ipackets[i] -=
+   hw->eth_stats_base.q_ipackets[i];
+
+   nfp_dev_stats.q_ibytes[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_RXR_STATS(i) + 0x8);
+
+   nfp_dev_stats.q_ibytes[i] -=
+   hw->eth_stats_base.q_ibytes[i];
+   }
+
+   /* reading per TX ring stats */
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   nfp_dev_stats.q_opackets[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i));
+
+   nfp_dev_stats.q_opackets[i] -=
+   hw->eth_stats_base.q_opackets[i];
+
+   nfp_dev_stats.q_obytes[i] =
+   nn_cfg_readq(hw, NFP_NET_CFG_TXR_STATS(i) + 0x8);
+
+   nfp_dev_stats.q_obytes[i] -=
+   hw->eth_stats_base.q_obytes[i];
+   }
+
+   nfp_dev_stats.ipackets =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_FRAMES);
+
+   nfp_dev_stats.ipackets -= hw->eth_stats_base.ipackets;
+
+   nfp_dev_stats.ibytes =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_OCTETS);
+
+   nfp_dev_stats.ibytes -= hw->eth_stats_base.ibytes;
+
+   nfp_dev_stats.opackets =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_FRAMES);
+
+   nfp_dev_stats.opackets -= hw->eth_stats_base.opackets;
+
+   nfp_dev_stats.obytes =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_OCTETS);
+
+   nfp_dev_stats.obytes -= hw->eth_stats_base.obytes;
+
+   nfp_dev_stats.imcasts =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+   nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+   /* reading general device stats */
+   nfp_dev_stats.ierrors =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_ERRORS);
+
+   nfp_dev_stats.ierrors -= hw->eth_stats_base.ierrors;
+
+   nfp_dev_stats.oerrors =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_TX_ERRORS);
+
+   nfp_dev_stats.oerrors -= hw->eth_stats_base.oerrors;
+
+   /* Multicast frames received */
+   nfp_dev_stats.imcasts =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_MC_FRAMES);
+
+   nfp_dev_stats.imcasts -= hw->eth_stats_base.imcasts;
+
+   /* RX ring mbuf allocation failures */
+   nfp_dev_stats.rx_nombuf = dev->data->rx_mbuf_alloc_failed;
+
+   nfp_dev_stats.imissed =
+   nn_cfg_readq(hw, NFP_NET_CFG_STATS_RX_DISCARDS);
+
+   nfp_dev_stats.imissed -= hw->eth_stats_base.imissed;
+
+   if (stats)
+   memcpy(stats, &nfp_dev_stats, sizeof(*stats));
+}
+
+static void
+nfp_net_stats_reset(struct rte_eth_dev *dev)
+{
+   int i;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /*
+* hw->eth_stats_base records the per counter starting point.
+* Lets update it now
+*/
+
+   /* reading per RX ring stats */
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (i == RTE_ETHDEV_QUEUE_STAT_CNTRS)
+   break;
+
+   hw->eth_s
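
The stats code in this patch reports each counter as the raw hardware value minus a saved starting point (`hw->eth_stats_base`), and a reset refreshes that starting point. A minimal sketch of the baseline technique follows. The `sketch_counter` type and its helpers are hypothetical, and the sketch assumes the hardware counter is free-running and is not cleared by software.

```c
#include <stdint.h>

/* Stand-in for one free-running hardware counter plus its software baseline. */
struct sketch_counter {
	uint64_t hw_value;  /* what a register read would return */
	uint64_t base;      /* snapshot taken at the last reset */
};

/* Value reported to the application: growth since the last reset. */
static uint64_t sketch_stats_get(const struct sketch_counter *c)
{
	return c->hw_value - c->base;
}

/* "Reset" by remembering the current hardware value as the new baseline. */
static void sketch_stats_reset(struct sketch_counter *c)
{
	c->base = c->hw_value;
}
```

Because the subtraction is done on unsigned 64-bit values, the reported delta stays correct even if the hardware counter wraps once between a reset and a read.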

[dpdk-dev] [PATCH v9 3/9] nfp: adding rss

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  218 +
 1 file changed, 218 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 7e30774..8451a49 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -1501,12 +1501,230 @@ xmit_end:
return i;
 }

+/* Update Redirection Table(RETA) of Receive Side Scaling of Ethernet device */
+static int
+nfp_net_reta_update(struct rte_eth_dev *dev,
+   struct rte_eth_rss_reta_entry64 *reta_conf,
+   uint16_t reta_size)
+{
+   uint32_t reta, mask;
+   int i, j;
+   int idx, shift;
+   uint32_t update;
+   struct nfp_net_hw *hw =
+   NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+   return -EINVAL;
+
+   if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+   RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+   "(%d) doesn't match the number hardware can supported "
+   "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+   return -EINVAL;
+   }
+
+   /*
+* Update Redirection Table. There are 128 8bit-entries which can be
+* manage as 32 32bit-entries
+*/
+   for (i = 0; i < reta_size; i += 4) {
+   /* Handling 4 RSS entries per loop */
+   idx = i / RTE_RETA_GROUP_SIZE;
+   shift = i % RTE_RETA_GROUP_SIZE;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+   if (!mask)
+   continue;
+
+   reta = 0;
+   /* If all 4 entries were set, don't need read RETA register */
+   if (mask != 0xF)
+   reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + i);
+
+   for (j = 0; j < 4; j++) {
+   if (!(mask & (0x1 << j)))
+   continue;
+   if (mask != 0xF)
+   /* Clearing the entry bits */
+   reta &= ~(0xFF << (8 * j));
+   reta |= reta_conf[idx].reta[shift + j] << (8 * j);
+   }
+   nn_cfg_writel(hw, NFP_NET_CFG_RSS_ITBL + shift, reta);
+   }
+
+   update = NFP_NET_CFG_UPDATE_RSS;
+
+   if (nfp_net_reconfig(hw, hw->ctrl, update) < 0)
+   return -EIO;
+
+   return 0;
+}
+
+ /* Query Redirection Table(RETA) of Receive Side Scaling of Ethernet device. */
+static int
+nfp_net_reta_query(struct rte_eth_dev *dev,
+  struct rte_eth_rss_reta_entry64 *reta_conf,
+  uint16_t reta_size)
+{
+   uint8_t i, j, mask;
+   int idx, shift;
+   uint32_t reta;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS))
+   return -EINVAL;
+
+   if (reta_size != NFP_NET_CFG_RSS_ITBL_SZ) {
+   RTE_LOG(ERR, PMD, "The size of hash lookup table configured "
+   "(%d) doesn't match the number hardware can supported "
+   "(%d)\n", reta_size, NFP_NET_CFG_RSS_ITBL_SZ);
+   return -EINVAL;
+   }
+
+   /*
+* Reading Redirection Table. There are 128 8bit-entries which can be
+* manage as 32 32bit-entries
+*/
+   for (i = 0; i < reta_size; i += 4) {
+   /* Handling 4 RSS entries per loop */
+   idx = i / RTE_RETA_GROUP_SIZE;
+   shift = i % RTE_RETA_GROUP_SIZE;
+   mask = (uint8_t)((reta_conf[idx].mask >> shift) & 0xF);
+
+   if (!mask)
+   continue;
+
+   reta = nn_cfg_readl(hw, NFP_NET_CFG_RSS_ITBL + shift);
+   for (j = 0; j < 4; j++) {
+   if (!(mask & (0x1 << j)))
+   continue;
+   reta_conf->reta[shift + j] =
+   (uint8_t)((reta >> (8 * j)) & 0xF);
+   }
+   }
+   return 0;
+}
+
+static int
+nfp_net_rss_hash_update(struct rte_eth_dev *dev,
+   struct rte_eth_rss_conf *rss_conf)
+{
+   uint32_t update;
+   uint32_t cfg_rss_ctrl = 0;
+   uint8_t key;
+   uint64_t rss_hf;
+   int i;
+   struct nfp_net_hw *hw;
+
+   hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   rss_hf = rss_conf->rss_hf;
+
+   /* Checking if RSS is enabled */
+   if (!(hw->ctrl & NFP_NET_CFG_CTRL_RSS)) {
+   if (rss_hf != 0) { /* Enable RSS? */
+
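
The RETA functions in this patch treat the redirection table as 8-bit entries packed four to a 32-bit register, with a 4-bit mask selecting which of the four entries to touch. A minimal sketch of that packing follows. The `sketch_reta_*` helpers are hypothetical and operate on plain integers rather than device registers, and the extract helper masks with a full byte (0xFF), which is a choice made here for the sketch.

```c
#include <stdint.h>

/*
 * Merge up to four 8-bit table entries into one 32-bit word.
 * Bit j of mask selects entries[j]; bytes that are not selected keep
 * their previous value from word.
 */
static uint32_t sketch_reta_merge(uint32_t word, const uint8_t entries[4],
				  uint8_t mask)
{
	int j;

	for (j = 0; j < 4; j++) {
		if (!(mask & (1u << j)))
			continue;
		word &= ~(UINT32_C(0xFF) << (8 * j));   /* clear byte j */
		word |= (uint32_t)entries[j] << (8 * j); /* write byte j */
	}
	return word;
}

/* Extract entry j (0..3) from a packed 32-bit word. */
static uint8_t sketch_reta_extract(uint32_t word, int j)
{
	return (uint8_t)((word >> (8 * j)) & 0xFF);
}
```

When the mask selects all four entries the previous word does not matter, which is why a register read can be skipped in that case.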

[dpdk-dev] [PATCH v9 2/9] nfp: adding rx/tx functionality

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 drivers/net/nfp/nfp_net.c |  993 +
 1 file changed, 993 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 18067c0..7e30774 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -74,8 +74,25 @@
 static void nfp_net_close(struct rte_eth_dev *dev);
 static int nfp_net_configure(struct rte_eth_dev *dev);
 static int nfp_net_init(struct rte_eth_dev *eth_dev);
+static int nfp_net_rx_fill_freelist(struct nfp_net_rxq *rxq);
+static uint32_t nfp_net_rx_queue_count(struct rte_eth_dev *dev,
+  uint16_t queue_idx);
+static uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+static void nfp_net_rx_queue_release(void *rxq);
+static int nfp_net_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf,
+ struct rte_mempool *mp);
+static int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
+static void nfp_net_tx_queue_release(void *txq);
+static int nfp_net_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+ uint16_t nb_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf);
 static int nfp_net_start(struct rte_eth_dev *dev);
 static void nfp_net_stop(struct rte_eth_dev *dev);
+static uint16_t nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);

 /*
  * The offset of the queue controller queues in the PCIe Target. These
@@ -186,6 +203,100 @@ nn_cfg_writeq(struct nfp_net_hw *hw, int off, uint64_t val)
nn_writeq(rte_cpu_to_le_64(val), hw->ctrl_bar + off);
 }

+/* Creating memzone for hardware rings. */
+static const struct rte_memzone *
+ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
+ uint16_t queue_id, uint32_t ring_size, int socket_id)
+{
+   char z_name[RTE_MEMZONE_NAMESIZE];
+   const struct rte_memzone *mz;
+
+   snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
+dev->driver->pci_drv.name,
+ring_name, dev->data->port_id, queue_id);
+
+   mz = rte_memzone_lookup(z_name);
+   if (mz)
+   return mz;
+
+   return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0,
+  NFP_MEMZONE_ALIGN);
+}
+
+static void
+nfp_net_rx_queue_release_mbufs(struct nfp_net_rxq *rxq)
+{
+   unsigned i;
+
+   if (rxq->rxbufs == NULL)
+   return;
+
+   for (i = 0; i < rxq->rx_count; i++) {
+   if (rxq->rxbufs[i].mbuf) {
+   rte_pktmbuf_free_seg(rxq->rxbufs[i].mbuf);
+   rxq->rxbufs[i].mbuf = NULL;
+   }
+   }
+}
+
+static void
+nfp_net_rx_queue_release(void *rx_queue)
+{
+   struct nfp_net_rxq *rxq = rx_queue;
+
+   if (rxq) {
+   nfp_net_rx_queue_release_mbufs(rxq);
+   rte_free(rxq->rxbufs);
+   rte_free(rxq);
+   }
+}
+
+static void
+nfp_net_reset_rx_queue(struct nfp_net_rxq *rxq)
+{
+   nfp_net_rx_queue_release_mbufs(rxq);
+   rxq->wr_p = 0;
+   rxq->rd_p = 0;
+   rxq->nb_rx_hold = 0;
+}
+
+static void
+nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)
+{
+   unsigned i;
+
+   if (txq->txbufs == NULL)
+   return;
+
+   for (i = 0; i < txq->tx_count; i++) {
+   if (txq->txbufs[i].mbuf) {
+   rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+   txq->txbufs[i].mbuf = NULL;
+   }
+   }
+}
+
+static void
+nfp_net_tx_queue_release(void *tx_queue)
+{
+   struct nfp_net_txq *txq = tx_queue;
+
+   if (txq) {
+   nfp_net_tx_queue_release_mbufs(txq);
+   rte_free(txq->txbufs);
+   rte_free(txq);
+   }
+}
+
+static void
+nfp_net_reset_tx_queue(struct nfp_net_txq *txq)
+{
+   nfp_net_tx_queue_release_mbufs(txq);
+   txq->wr_p = 0;
+   txq->rd_p = 0;
+   txq->tail = 0;
+}
+
 static int
 __nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t update)
 {
@@ -423,6 +534,18 @@ nfp_net_disable_queues(struct rte_eth_dev *dev)
hw->ctrl = new_ctrl;
 }

+static int
+nfp_net_rx_freelist_setup(struct rte_eth_dev *dev)
+{
+   int i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   if (nfp_net_rx_fill_freelist(dev->data->rx_queues[i]) < 0)
+   return -1;
+   }
+   return 0;
+}
+
 static void
 nfp_net_params_setup(struct nfp_net_hw *

[dpdk-dev] [PATCH v9 1/9] nfp: basic initialization

2015-11-26 Thread Alejandro Lucero

Signed-off-by: Alejandro Lucero 
Signed-off-by: Rolf Neugebauer 
---
 MAINTAINERS |3 +
 config/common_linuxapp  |6 +
 doc/guides/rel_notes/release_2_2.rst|4 +
 drivers/net/Makefile|1 +
 drivers/net/nfp/Makefile|   56 +++
 drivers/net/nfp/nfp_net.c   |  699 +++
 drivers/net/nfp/nfp_net_ctrl.h  |  324 ++
 drivers/net/nfp/nfp_net_logs.h  |   75 
 drivers/net/nfp/nfp_net_pmd.h   |  453 
 drivers/net/nfp/rte_pmd_nfp_version.map |3 +
 mk/rte.app.mk   |1 +
 11 files changed, 1625 insertions(+)
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 840faeb..df5b962 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -318,6 +318,9 @@ Null PMD
 M: Tetsuya Mukawa 
 F: drivers/net/null/

+Netronome nfp
+M: Alejandro Lucero 
+F: drivers/net/nfp/

 Packet processing
 -----------------
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 4a68da4..1d77db7 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -279,6 +279,12 @@ CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_DRIVER=n

 #
+# Compile burst-oriented Netronome NFP PMD driver
+#
+CONFIG_RTE_LIBRTE_NFP_PMD=n
+CONFIG_RTE_LIBRTE_NFP_DEBUG=n
+
+#
 # Compile example software rings based PMD
 #
 CONFIG_RTE_LIBRTE_PMD_RING=y
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index 8c77768..8154db7 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -195,6 +195,10 @@ Drivers

   Fixed issue when releasing null control queue.

+* **nfp: adding new PMD for Netronome nfp-6xxx card.**
+
+  Support for using Netronome nfp-6xxx with PCI VFs.
+

 Libraries
 ~~~~~~~~~
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index cddcd57..6e4497e 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -43,6 +43,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe
 DIRS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4
 DIRS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5
 DIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += mpipe
+DIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += null
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += pcap
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += ring
diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile
new file mode 100644
index 0000000..ef7a13d
--- /dev/null
+++ b/drivers/net/nfp/Makefile
@@ -0,0 +1,56 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_nfp.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_nfp_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp_net.c
+
+# this lib depends upon:
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_eal lib/librte_ether
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_m

[dpdk-dev] [PATCH v9 0/9] support for netronome nfp-6xxx card

2015-11-26 Thread Alejandro Lucero

This patchset adds a new PMD for the Netronome nfp-6xxx card.
Only PCI Virtual Functions are supported.
Using this PMD requires the Netronome BSP to be installed beforehand.

v9:
 - Adding flag RTE_PCI_DRV_INTR_LSC
 - Makefile changes for compilation as a shared library
 - Adding map file for linker version script info

v8:
 - removing remaining unnecessary flags to PMD Makefile

v7:
 - Adding support for link status changes interrupts.
 - removing unnecessary flags when compiling the PMD.

v6:
 - Making each patch compilable.

v5:
 - Splitting up patches per functionality.

v4:
 - Getting rid of nfp_uio. Just submitting PMD.
 - Removing LSC interrupt support

v3:
 - Making all patches independent for applying and building.
 - changing commits messages following standard

v2:
 - Code style changes based on checkpatch.pl and DPDK style guide.
 - Documentation changes using the right rst format.
 - Moving the documentation files to a new patch file.
 - Adding info to MAINTAINERS and release files.

Alejandro Lucero (9):
  nfp: basic initialization
  nfp: adding rx/tx functionality
  nfp: adding rss
  nfp: adding stats
  nfp: adding link functionality
  nfp: adding extra functionality
  nfp: link status change interrupt support
  nfp: adding nic guide
  nfp: updating maintainers

 MAINTAINERS |4 +
 config/common_linuxapp  |6 +
 doc/guides/nics/index.rst   |1 +
 doc/guides/nics/nfp.rst |  265 
 doc/guides/rel_notes/release_2_2.rst|4 +
 drivers/net/Makefile|1 +
 drivers/net/nfp/Makefile|   56 +
 drivers/net/nfp/nfp_net.c   | 2499 +++
 drivers/net/nfp/nfp_net_ctrl.h  |  324 
 drivers/net/nfp/nfp_net_logs.h  |   75 +
 drivers/net/nfp/nfp_net_pmd.h   |  453 ++
 drivers/net/nfp/rte_pmd_nfp_version.map |3 +
 mk/rte.app.mk   |1 +
 13 files changed, 3692 insertions(+)
 create mode 100644 doc/guides/nics/nfp.rst
 create mode 100644 drivers/net/nfp/Makefile
 create mode 100644 drivers/net/nfp/nfp_net.c
 create mode 100644 drivers/net/nfp/nfp_net_ctrl.h
 create mode 100644 drivers/net/nfp/nfp_net_logs.h
 create mode 100644 drivers/net/nfp/nfp_net_pmd.h
 create mode 100644 drivers/net/nfp/rte_pmd_nfp_version.map

-- 
1.7.9.5

[dpdk-dev] [PATCH v8 5/9] nfp: adding link functionality

2015-11-25 Thread Alejandro Lucero

I tried to do that, but there is an issue with the inlining. I think it is
because the compiler (also) treats the inline keyword as static.

On Wed, Nov 25, 2015 at 4:29 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Wed, 25 Nov 2015 16:19:51 +
> "Alejandro.Lucero"  wrote:
>
> > +/*
> > + * Atomically reads link status information from global structure
> rte_eth_dev.
> > + *
> > + * @param dev
> > + *   - Pointer to the structure rte_eth_dev to read from.
> > + *   - Pointer to the buffer to be saved with the link status.
> > + *
> > + * @return
> > + *   - On success, zero.
> > + *   - On failure, negative value.
> > + */
> > +static inline int
> > +nfp_net_dev_atomic_read_link_status(struct rte_eth_dev *dev,
> > + struct rte_eth_link *link)
> > +{
> > + struct rte_eth_link *dst = link;
> > + struct rte_eth_link *src = &dev->data->dev_link;
> > +
> > + if (rte_atomic64_cmpset((uint64_t *)dst, *(uint64_t *)dst,
> > + *(uint64_t *)src) == 0)
> > + return -1;
> > +
> > + return 0;
> > +}
> > +
> > +/
>
> Sigh, this code has been copied and pasted to every driver.
> Why is it not part of standard rte_ethdev code.
>

[dpdk-dev] [PATCH v6 0/7] support for netronome nfp-6xxx card

2015-11-06 Thread Alejandro Lucero

Yes.

There was a bug in 1.8 affecting how BARs are used in the device, but this
should be fixed in 2.2.

On Thu, Nov 5, 2015 at 11:42 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Thu, 05 Nov 2015 11:59:59 +0100
> Vincent JARDIN  wrote:
>
> >
> > On 05/11/2015 11:43, Alejandro.Lucero wrote:
> > > From: "Alejandro.Lucero" 
> > >
> > > This patchset adds a new PMD for Netronome nfp-6xxx card.
> > > Just PCI Virtual Functions supported.
> > > Using this PMD requires previous Netronome BSP installation.
> > >
> >
> > I understand that this PMD needs a kernel driver which is not upstream
> > yet. Am I correct?
> >
> >
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet
> >
> >
> > Best regards,
> >Vincent
> >
>
> Does this driver work with VFIO?
>

[dpdk-dev] [PATCH v6 0/7] support for netronome nfp-6xxx card

2015-11-05 Thread Alejandro Lucero

Yes, this is true.

There is an internal Netronome project for upstreaming the netdev kernel
driver along with a BSP driver.
PMD support will be in the BSP.

There is a public github repo with current drivers:

https://github.com/Netronome/nfp-drv-kmods

On Thu, Nov 5, 2015 at 10:59 AM, Vincent JARDIN 
wrote:

>
> On 05/11/2015 11:43, Alejandro.Lucero wrote:
>
>> From: "Alejandro.Lucero" 
>>
>> This patchset adds a new PMD for Netronome nfp-6xxx card.
>> Just PCI Virtual Functions supported.
>> Using this PMD requires previous Netronome BSP installation.
>>
>>
> I understand that this PMD needs a kernel driver which is not upstream
> yet. Am I correct?
>
>
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet
>
>
> Best regards,
>   Vincent
>
>

[dpdk-dev] [PATCH v5 8/9] nfp: adding nic guide

2015-11-04 Thread Alejandro Lucero

Yes. It builds without the BSP for now.

Once we add PF support, the BSP will be needed. I guess this is the same as
the MLX PMDs, which need specific Mellanox libraries.

On Wed, Nov 4, 2015 at 3:03 PM, Thomas Monjalon 
wrote:

> 2015-11-02 12:25, Alejandro.Lucero:
> > +Before using the Netronome's DPDK PMD some NFP-6xxx configuration,
> > +which is not related to DPDK, is required. The system requires
> > +installation of **Netronome's BSP (Board Support Package)** which
> includes
> > +Linux drivers, programs and libraries.
>
> Do you confirm we can check the PMD build without having the BSP?
>

[dpdk-dev] [PATCH v3 2/4] nfp-uio: new uio driver for netronome nfp6000 card

2015-10-22 Thread Alejandro Lucero

Submitting just the PMD for integration makes sense. I will remove all the
references to nfp_uio.

My doubt is about the documentation. Working with the NFP PMD will not be
possible without nfp_uio. We could modify the documentation to say it is
possible to use igb_uio, but this is not the right thing to do (the pci mask
will be wrong). So, would it be acceptable to submit a new PMD without any
documentation for now? For the sake of integration, I would prefer this over
giving wrong or incomplete documentation.

Thanks

On Wed, Oct 21, 2015 at 8:40 PM, Alejandro Lucero <
alejandro.lucero at netronome.com> wrote:

>
>
> On Wed, Oct 21, 2015 at 5:03 PM, Thomas Monjalon <
> thomas.monjalon at 6wind.com> wrote:
>
>> 2015-10-21 16:57, Alejandro Lucero:
>> > I understand interest for not having another UIO driver does exist. We
>> > could maintain an external nfp_uio by now till either we get rid of it
>> or
>> > we definitely find out it is really needed. any chance to accept
>> nfp_uio by
>> > now?
>>
>> No, there are some work currently to get rid of igb_uio.
>> So there are little chances to accept nfp_uio one day.
>> Please take the first step of integrating your PMD without link interrupt.
>> Later we'll be able to discuss how to mitigate the interrupt issue.
>>
>
> Ok. I will create a new patchset version without nfp_uio.
>
> By the way, that work with igb_uio is about the patches to
> pci_uio_generic? I thought there was some reticence from the maintainer for
> adding pci bus master there.
>
>
>

[dpdk-dev] [PATCH v3 2/4] nfp-uio: new uio driver for netronome nfp6000 card

2015-10-21 Thread Alejandro Lucero

On Wed, Oct 21, 2015 at 5:03 PM, Thomas Monjalon 
wrote:

> 2015-10-21 16:57, Alejandro Lucero:
> > I understand interest for not having another UIO driver does exist. We
> > could maintain an external nfp_uio by now till either we get rid of it or
> > we definitely find out it is really needed. any chance to accept nfp_uio
> by
> > now?
>
> No, there are some work currently to get rid of igb_uio.
> So there are little chances to accept nfp_uio one day.
> Please take the first step of integrating your PMD without link interrupt.
> Later we'll be able to discuss how to mitigate the interrupt issue.
>

Ok. I will create a new patchset version without nfp_uio.

By the way, is that igb_uio work about the patches to uio_pci_generic?
I thought the maintainer was reluctant to add pci bus master support
there.

[dpdk-dev] [PATCH v3 2/4] nfp-uio: new uio driver for netronome nfp6000 card

2015-10-21 Thread Alejandro Lucero

Wow!

This is just what we (likely) need.

We could have that support in our BSP, but this is something yet to be
approved. Setting the per-VF pci bus master and pci mask should also be added
to the BSP.

The PMD currently submitted would need nfp_uio for LSC interrupt support.
As I said, this is not critical, so NFP devices could be used with that
limitation. My concern is that tests without nfp_uio have not been done,
just some minor proof-of-concept work.

I understand there is interest in not having another UIO driver. We could
maintain an external nfp_uio for now, until we either get rid of it or
definitely find out it is really needed. Is there any chance of accepting
nfp_uio for now?

Thanks

On Wed, Oct 21, 2015 at 4:25 PM, Thomas Monjalon 
wrote:

> 2015-10-21 15:39, Alejandro Lucero:
> > On Wed, Oct 21, 2015 at 6:24 AM, David Marchand
> > <david.marchand at 6wind.com> wrote:
> > > Please, can you elaborate on the need for (yet another) uio driver,
> > > rather than make igb_uio work with your hardware ?
> [...]
> > I have been looking at the possibility of getting rid of nfp_uio. The
> > fact is our PMD can work without it, both for the PF and VF (not the PMD
> > version already submitted but one under development).  The PF support
> > requires not using UIO at all, because the device is attached to the BSP
> > driver. The only problem with this approach is we do not have support for
> > interrupts, what is not critical (I can see other PMDs not having support
> > for Link Status Changes) but we do not like it as programs can register
> > callbacks for these interrupts which would not work at all.
> >
> > Interrupt support could be implemented in the BSP, doing the same UIO or
> > VFIO do, but this will require (minor) changes to DPDK for having another
> > intr_handle (not UIO, not VFIO). I do not know if other PMDs could also
> > make use of such a change but I guess that would help to accept those
> > changes.
>
> We are going to have an external handler (used for mlx5):
> http://dpdk.org/ml/archives/dev/2015-October/024678.html
> Problem solved :)
>
> Is it possible to rework your PMD without nfp-uio?
>
> Thanks
>

[dpdk-dev] rte_eth_rx_queue_count accuracy

2015-09-22 Thread Alejandro Lucero

I cannot see that code. Can you point out where it is?

Thanks

On Mon, Sep 21, 2015 at 11:41 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Fri, 18 Sep 2015 11:33:36 +0100
> Alejandro Lucero  wrote:
>
> > I have seen that the API definition says nothing about accuracy, but some
> > PMD implementations sacrifice accuracy for the sake of performance. If I
> > understand the code correctly, i40e and ixgbe check the DD bit only for
> > the first descriptor in each group of 4, and treat all four as used if
> > the first one is.
> >
> > On the other hand, they do a "heavy" calculation when the descriptor ring
> > wraps, which does not make sense (to me) if performance is the goal.
> >
> > There are PMDs that do not support this function, and I cannot see any
> > app or example using it, so I do not know how important this function is,
> > or its accuracy and performance impact. Can someone comment on this?
> >
> > Thanks
>
> I have version of this for virtio/vmxnet3
> It is useful when using the interrupt control mode.
>

[dpdk-dev] rte_eth_rx_queue_count accuracy

2015-09-18 Thread Alejandro Lucero

I have seen that the API definition says nothing about accuracy, but some
PMD implementations sacrifice accuracy for the sake of performance. If I
understand the code correctly, i40e and ixgbe check the DD bit only for the
first descriptor in each group of 4, and treat all four as used if the
first one is.

On the other hand, they do a "heavy" calculation when the descriptor ring
wraps, which does not make sense (to me) if performance is the goal.

There are PMDs that do not support this function, and I cannot see any app
or example using it, so I do not know how important this function is, or
its accuracy and performance impact. Can someone comment on this?
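
To make the trade-off concrete, here is a minimal sketch of the two counting strategies. It is not the i40e or ixgbe code: the descriptor layout, the DD_BIT value and the function names are hypothetical, chosen only to illustrate checking every descriptor versus checking one in four.

```c
#include <stdint.h>

#define DD_BIT    0x1u  /* hypothetical "descriptor done" flag */
#define SCAN_STEP 4u    /* descriptors assumed done per DD check */

/* Hypothetical, simplified RX descriptor. */
struct rx_desc {
    uint32_t status;
};

/* Exact count: test the DD bit on every descriptor starting at 'tail'. */
static uint32_t rx_count_exact(const struct rx_desc *ring,
                               uint32_t nb_desc, uint32_t tail)
{
    uint32_t n = 0;

    while (n < nb_desc && (ring[(tail + n) % nb_desc].status & DD_BIT))
        n++;
    return n;
}

/*
 * Approximate count: test the DD bit only on the first descriptor of
 * each group of SCAN_STEP and count the whole group as used.
 */
static uint32_t rx_count_approx(const struct rx_desc *ring,
                                uint32_t nb_desc, uint32_t tail)
{
    uint32_t n = 0;

    while (n < nb_desc && (ring[(tail + n) % nb_desc].status & DD_BIT))
        n += SCAN_STEP;
    return n > nb_desc ? nb_desc : n;  /* never report more than the ring */
}
```

With an 8-entry ring where the first 5 descriptors are done, the exact version reports 5 while the approximate one reports 8, so the error can be up to SCAN_STEP - 1 descriptors.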

Thanks

[dpdk-dev] [PATCH v2] vfio: Fix overflow while assigning vfio BAR region offset and size

2015-07-10 Thread Alejandro Lucero

Hi Rahul,

Go ahead. That's fine for me.

Thanks

On Fri, Jul 10, 2015 at 10:54 AM, Rahul Lakkireddy <
rahul.lakkireddy at chelsio.com> wrote:

> On Tue, Jul 07, 2015 at 10:50:23 +, Burakov, Anatoly wrote:
> > Hi Rahul,
> >
> > > However, unsigned long seems to be working fine for all builds.
> >
> > unsigned long it is then, if there aren't any other objections.
> >
> > Thanks,
> > Anatoly
>
> Hi Alejandro,
>
> Are you planning to update the original patch as per below and re-submit:
> http://dpdk.org/ml/archives/dev/2015-July/020963.html
>
> Or, I can also submit it if you want.
> Please let me know.
>
> Thanks,
> Rahul
>
>

[dpdk-dev] [PATCH v2] vfio: Fix overflow while assigning vfio BAR region offset and size

2015-07-06 Thread Alejandro Lucero

Hi all,

From the kernel VFIO maintainer:

"I suppose in the short term, mmap should not be advertised as available
on 32bit hosts.  Thanks,"

So, as VFIO support for 32bit systems is broken, DPDK should not configure
VFIO in that case.

This is the complete email sent to the kernel maintainer and his answer:

On Thu, 2015-07-02 at 14:42 +0100, Alejandro Lucero wrote:
> Hi Alex,
>
> is VFIO expected to work in 32 bit systems?
>
> I know VFIO initial goal was a better control of device assignment to
> virtual machines and it is based on IOMMU hardware support. I do not know
> how likely is to have a 32 bit system with IOMMU but it seems to be
> possible such hardware configuration.
>
> The problem with VFIO and 32 bit systems is the VFIO kernel code uses the
> upper bits (VFIO_PCI_OFFSET_SHIFT) of a __u64 variable, offset field in
> struct vfio_region_info, for saving info about the PCI BAR index to work
> with. This is done inside the ioctl command VFIO_DEVICE_GET_REGION_INFO.
> That vfio_region_info is got by the process doing the system call and the
> offset is used as a parameter for mmap system call which expects such a
> parameter as unsigned long. If I am not wrong, unsigned long in 32 bit
> linux systems is a 32 bit type, so when the vfio_pci_mmap function is
> executed, the index BAR to work with is obtained from the offset, which
> turns to be always 0 as the value was "lost in translation". There is a
> chance current implementation can work if all the PCI BARs are equal in
> terms of size, but obviously this is not acceptable.
>
> So, if VFIO needs to work in 32 bits systems another way to map the device
> PCI BARs is needed.

Not necessarily, VFIO_PCI_OFFSET_SHIFT is an internal implementation
detail, userspace should always use  vfio_region_info.offset.  We're
therefore free to come up with other algorithms for handling this
limitation.  If we want to continue using a single device file
descriptor, we could simply choose a smaller shift on 32bit systems.  A
29 bit shift would give us 512MB regions support, which is sufficient
for the vast majority of devices.
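
The arithmetic behind the 29-bit figure can be checked with a small sketch. It assumes the whole offset has to fit in 32 bits; the helper names are made up for illustration and this says nothing about how the kernel would actually implement a smaller shift.

```c
#include <stdint.h>

/* Largest region addressable when the low 'shift' bits hold the
 * offset inside the region. */
static uint64_t region_size_for_shift(unsigned shift)
{
    return (uint64_t)1 << shift;
}

/* Number of distinct region indexes that fit in the remaining upper
 * bits of an offset that is 'offset_bits' wide. */
static uint64_t region_slots(unsigned offset_bits, unsigned shift)
{
    return (uint64_t)1 << (offset_bits - shift);
}
```

Under that assumption, region_size_for_shift(29) is 512 MB and region_slots(32, 29) is 8, so a smaller shift trades maximum region size for the number of regions that can be told apart.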

We could also replace the macros with functions such that we pack
regions as tightly as possible within the device file descriptor.  It's
reasonable to expect that we could support up to 2G BARs using such a
method.  A hybrid approach is also possible, for instance the config
space region could also contain the ROM and VGA areas (or we could
simply choose not to support VGA on 32bit hosts).

If we need to support 4G BARs, our only choice is really to extend the
vfio region support for a separate file descriptor per region.  The only
devices I'm aware of with 4G BARs are Nvidia Tesla.  This is possible,
but I would expect such devices would be extremely rare on 32bit hosts.

I suppose in the short term, mmap should not be advertised as available
on 32bit hosts.  Thanks,

Alex

On Wed, Jul 1, 2015 at 11:00 AM, Burakov, Anatoly  wrote:

> Hi all,
>
> > The last patch from Rahul does not solve the problem. For those cases
> > where the MSI-X table is in one of the BARs to map, the memreg array is
> > still in use.
>
> Rahul's initial patch was pretty much what you have submitted, it just
> didn't build on a 32-bit system.
>
> > My fix was using unsigned long instead of uint32_t for the memreg array
> > as this is used as a parameter for mmap system call which expects such a
> > type for the offset (and size).
>
> Maybe use off_t? That would at least be guaranteed to compile on any
> system...
>
> > In a 32-bit system mmap system call and VFIO mmap implementation will
> > get an unsigned long offset, as it does the struct vm_area_struct for
> > vm_pgoff.
> > VFIO will not be able to map the right BAR except for BAR 0.
> >
> > So, basically, VFIO kernel code does not work for 32 bit systems.
> >
> > I think we should define memreg as unsigned long and to report this
> > problem to the VFIO kernel maintainer.
>
> If that's the case, this should indeed be taken up with the kernel
> maintainers. I don't have a 32-bit system handy to test it, unfortunately.
>
> Thanks,
> Anatoly
>

[dpdk-dev] [PATCH v2] vfio: Fix overflow while assigning vfio BAR region offset and size

2015-07-01 Thread Alejandro Lucero

I submitted a patch for fixing this issue on the 25th of June. I did not
notice someone had reported this before. The last patch from Rahul does not
solve the problem. For those cases where the MSI-X table is in one of the
BARs to map, the memreg array is still in use.

My fix was using unsigned long instead of uint32_t for the memreg array, as
this is used as a parameter for the mmap system call, which expects such a
type for the offset (and size). This worked for me but I did not realize this
has to be compiled for 32 bit systems as well. In that case unsigned long
will work for the mmap but not for the VFIO kernel API, which expects
uint64_t for the offset and size inside the struct vfio_region_info.

The point is that the offset field of vfio_region_info encodes the index of
the BAR to map. For this, the VFIO kernel code uses VFIO_PCI_INDEX_TO_OFFSET:

 #define VFIO_PCI_OFFSET_SHIFT   40
 #define VFIO_PCI_INDEX_TO_OFFSET(index) ((u64)(index) << VFIO_PCI_OFFSET_SHIFT)

This index will be used by the VFIO mmap implementation when the DPDK code
tries to map the BARs. That code does the opposite for getting the index:

index = vma->vm_pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);

In this case PAGE_SHIFT needs to be used because the mmap system call has
already converted the byte offset into a page offset.

On a 32-bit system, the mmap system call and the VFIO mmap implementation
get an unsigned long offset, as does struct vm_area_struct for vm_pgoff.
The BAR index in the upper bits is lost, so VFIO will not be able to map the
right BAR except for BAR 0.
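
The effect can be reproduced in user space with a simplified model. This is a sketch, not kernel code: the shift value is the one quoted above, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and the function names are invented for the example.

```c
#include <stdint.h>

#define VFIO_PCI_OFFSET_SHIFT 40
#define PAGE_SHIFT            12  /* assumption: 4 KiB pages */

/* Offset as it would be reported for a given BAR index. */
static uint64_t index_to_offset(uint32_t index)
{
    return (uint64_t)index << VFIO_PCI_OFFSET_SHIFT;
}

/* Index recovered when the full 64-bit offset survives. */
static uint32_t index_from_offset64(uint64_t offset)
{
    uint64_t pgoff = offset >> PAGE_SHIFT;

    return (uint32_t)(pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT));
}

/* Same computation after the offset went through a 32-bit type. */
static uint32_t index_from_offset32(uint64_t offset)
{
    uint32_t narrow = (uint32_t)offset;  /* upper 32 bits are lost here */
    uint64_t pgoff = narrow >> PAGE_SHIFT;

    return (uint32_t)(pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT));
}
```

For BAR 2, index_from_offset64(index_to_offset(2)) gives back 2, while index_from_offset32(index_to_offset(2)) gives 0, which is the "always BAR 0" behaviour described here.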

So, basically, VFIO kernel code does not work for 32 bit systems.

I think we should define memreg as unsigned long and report this problem
to the VFIO kernel maintainer.

On Tue, Jun 30, 2015 at 10:12 PM, Thomas Monjalon  wrote:

> Hi Anatoly,
> Please could you review this fix to allow Chelsio using VFIO?
> Thanks
>
> 2015-06-23 20:30, Rahul Lakkireddy:
> > When using vfio, the probe fails over Chelsio T5 adapters after
> > commit-id 90a1633b2 (eal/linux: allow to map BARs with MSI-X tables).
> >
> > While debugging further, found that the BAR region offset and size read
> > from vfio are u64, but are assigned to uint32_t variables.  This results
> > in the u64 value getting truncated to 0 and passing wrong offset and size
> > to mmap for subsequent BAR regions (i.e. trying to overwrite previously
> > allocated BAR 0 region).
> >
> > The fix is to use these region offset and size directly rather than
> > assigning to uint32_t variables.
> >
> > Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
> > Signed-off-by: Rahul Lakkireddy 
> > Signed-off-by: Kumar Sanghvi 
>
>

[dpdk-dev] [PATCH] eal_pci_vfio.c: fix the type for handling BAR size and offset info

2015-06-24 Thread Alejandro Lucero

The kernel mmap syscall and the VFIO kernel driver expect an unsigned long
for the offset. Otherwise the BAR index used inside the VFIO kernel driver
will be wrong for every BAR except BAR 0.

The patch solves the issue.

[dpdk-dev] ret_pktmbuf_pool_init problem with opaque_arg

2015-01-09 Thread Alejandro Lucero

Hi Olivier,



On Fri, Jan 9, 2015 at 2:28 PM, Olivier MATZ  wrote:

> Hi Alejandro,
>
> On 01/09/2015 03:12 PM, Alejandro Lucero wrote:
> > Inside this function mbuf_data_room_size is set to a default value if
> > opaque_arg is null and it should be set to the value pointed by
> > opaque_arg if not null. Current implementation is using not the value but
> > with the pointer itself. I think this:
> >
> > roomsz = (uint16_t)(uintptr_t)opaque_arg;
> >
> > should be something like this:
> >
> > roomsz = *(uint16_t *)opaque_arg;
> >
>
> In this particular case, the integer value is stored in the pointer
> value: the pointer is not used as a pointer but as an integer. I agree
> it can be surprising, but I think the code is correct.
>
>
Likely there is a good reason for doing things this way but I can not see
the point.

And it will confuse the user.

Thanks and Regards


> Regards,
> Olivier
>
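
The convention Olivier describes, an integer carried in the pointer value itself, can be sketched as follows. The helper names and the default size are hypothetical; only the cast on the receiving side is taken from the thread.

```c
#include <stdint.h>

#define DEFAULT_ROOM_SIZE 2048u  /* hypothetical default, for illustration */

/* Caller side: pack the size into the pointer value, no storage needed. */
static void *room_size_to_arg(uint16_t roomsz)
{
    return (void *)(uintptr_t)roomsz;
}

/* Callee side: read the integer back out of the pointer value.
 * A null pointer decodes to 0, which selects the default. */
static uint16_t room_size_from_arg(void *opaque_arg)
{
    uint16_t roomsz = (uint16_t)(uintptr_t)opaque_arg;

    if (roomsz == 0)
        roomsz = DEFAULT_ROOM_SIZE;
    return roomsz;
}
```

The upside is that the caller does not have to keep a variable alive for the callee to dereference; the downside, as noted in the reply, is that passing the address of a uint16_t compiles without complaint and silently yields a wrong size.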

[dpdk-dev] About RTE_MAX_ETHPORT_QUEUE_STATS_MAPS

2014-08-21 Thread Alejandro Lucero

Hi,

Documentation and header files describe stat_idx parameter for

rte_eth_dev_set_tx_queue_stats_mapping

and

rte_eth_dev_set_rx_queue_stats_mapping

as

The value must be in the range
[0, RTE_MAX_ETHPORT_QUEUE_STATS_MAPS - 1]

I have not found a definition for RTE_MAX_ETHPORT_QUEUE_STATS_MAPS but the
per queue counters inside struct rte_eth_stats are arrays with length
RTE_ETHDEV_QUEUE_STAT_CNTRS which is defined at

config/defconfig_x86_64-default-linuxapp-gcc

CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16

I assume RTE_MAX_ETHPORT_QUEUE_STATS_MAPS is equal to
CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS.

Can anyone confirm this?
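
If that assumption holds (it is not confirmed anywhere in this message), the range check on stat_idx would amount to something like the sketch below. The helper name is hypothetical and the constant is redefined locally with the value from the config line above so that the example stands alone.

```c
#include <stdint.h>

/* Value taken from CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16 above. */
#define RTE_ETHDEV_QUEUE_STAT_CNTRS 16

/* Hypothetical helper: returns 1 if stat_idx can index the per-queue
 * counter arrays, 0 otherwise. */
static int stat_idx_is_valid(uint8_t stat_idx)
{
    return stat_idx < RTE_ETHDEV_QUEUE_STAT_CNTRS;
}
```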

Thanks
