[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-04-15 Thread Olivier Matz
Hi,

On 03/22/2016 03:27 PM, Wiles, Keith wrote:
> Hi Olivier,
>
>> Hi Keith,
>>
>> On 03/21/2016 06:38 PM, Wiles, Keith wrote:
 On Mar 21, 2016, at 11:10 AM, Olivier Matz  
 wrote:

 When using RSS, the number of rxqs has to be a power of two.
 This is a problem because there is no API is dpdk that makes
 the application aware of that.

 A good compromise is to allow the application to request a
 number of rxqs that is not a power of 2, but having inactive
 queues that will never receive packets. In this configuration,
 a warning will be issued to users to let them know that
 this is not an optimal configuration.
>>>
>>> Not sure I like this solution. I think an error should be returned with a 
>>> log message instead. What if the next driver needs power of three or must 
>>> be odd or even number.
>>>
>>> The bigger problem is the application is no longer portable for any given 
>>> nic configuration.
>>>
>>> We need a method for the application to query the system for these types of 
>>> information. But as we do not have that API we need to just error the 
>>> request off.
>>
>>
>> The initial problem is that the driver says "I support a maximum
>> of X queues" and if the application configures a lower number, it
>> gets an error.
>>
>> There is no API in DPDK to tell that only specific number of queues
>> are supported. Adding an API is a solution, but in this case it's
>> probably overkill. With this patch, the driver can present the proper
>> number of queues to the application, knowing that the spreading of
>> the packets won't be ideal (some queues won't receive packets), but
>> it will work.
>>
>> A step further in this direction would be to configure more queues
>> than asked in hardware to do a better spreading, almost similar to
>> what is done with RETA tables in mlx5. But this is more complicated
>> to do, especially if we want it for 16.04.
>
> Well I guess I must agree with the solution, but I am not real happy. Can we 
> mark this a temp fix until we figure out a cleaner solution as I would not 
> want this type of solution forever or be the standard way to handle these 
> problems.

Back on this issue, I agree that a cleaner solution may be needed,
probably in the ethdev API. I did a quick look in the drivers directory
and, from what I remember, vmxnet3 also need to have a number of rxq
and txq to be a power of 2.

>From what I see in the code, if an application tries to configure a
number of queue which is not a power of 2 on vmxnet3, the driver will
fail to start without any log.

Yong, do you feel a patch similar to what was done on mlx4 is
feasable/suitable in vmxnet3 driver? Shouldn't at least have some
error logs saying that the number of rxq/txq is invalid?

It cleanup or rework is planned in the ethdev API for 16.11, maybe
this is a problem that should be addressed.


Regards,
Olivier


[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-24 Thread Olivier Matz
Hi Bruce,

On 03/24/2016 01:20 PM, Bruce Richardson wrote:
>> @@ -734,12 +734,12 @@ dev_configure(struct rte_eth_dev *dev)
>>  }
>>  if (rxqs_n == priv->rxqs_n)
>>  return 0;
>> -if ((rxqs_n & (rxqs_n - 1)) != 0) {
>> -ERROR("%p: invalid number of RX queues (%u),"
>> -  " must be a power of 2",
>> +if (!rte_is_power_of_2(rxqs_n)) {
>> +WARN("%p: number of RX queues (%u), must be a"
>> +  " power of 2: remaining queues will be inactive",
> 
> I'm not sure how clear this warning message is. To the reader there are no
> extra "remaining" queues referred to, as it's not stated that the driver is
> allocating extra queues. How about e.g.:
> 
> WARN("%p: number of RX queues on device must by a power of 2. Allocating %u
>   queues, of which %u will be active. Remaining queues will be 
> inactive"...)
> 

You're right, I'll send a v2 with a clearer message.

Regards,
Olivier



[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-24 Thread Bruce Richardson
On Mon, Mar 21, 2016 at 05:08:04PM +0100, Olivier Matz wrote:
> When using RSS, the number of rxqs has to be a power of two.
> This is a problem because there is no API is dpdk that makes
> the application aware of that.
> 
> A good compromise is to allow the application to request a
> number of rxqs that is not a power of 2, but having inactive
> queues that will never receive packets. In this configuration,
> a warning will be issued to users to let them know that
> this is not an optimal configuration.
> 
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/mlx4/mlx4.c | 27 +--
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index cc4e9aa..eaf06db 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -698,7 +698,7 @@ txq_cleanup(struct txq *txq);
>  
>  static int
>  rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
> -   unsigned int socket, const struct rte_eth_rxconf *conf,
> +   unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
> struct rte_mempool *mp);
>  
>  static void
> @@ -734,12 +734,12 @@ dev_configure(struct rte_eth_dev *dev)
>   }
>   if (rxqs_n == priv->rxqs_n)
>   return 0;
> - if ((rxqs_n & (rxqs_n - 1)) != 0) {
> - ERROR("%p: invalid number of RX queues (%u),"
> -   " must be a power of 2",
> + if (!rte_is_power_of_2(rxqs_n)) {
> + WARN("%p: number of RX queues (%u), must be a"
> +   " power of 2: remaining queues will be inactive",

I'm not sure how clear this warning message is. To the reader there are no
extra "remaining" queues referred to, as it's not stated that the driver is
allocating extra queues. How about e.g.:

WARN("%p: number of RX queues on device must by a power of 2. Allocating %u
queues, of which %u will be active. Remaining queues will be 
inactive"...)

/Bruce


[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-22 Thread Wiles, Keith
Hi Olivier,

>Hi Keith,
>
>On 03/21/2016 06:38 PM, Wiles, Keith wrote:
>>> On Mar 21, 2016, at 11:10 AM, Olivier Matz  
>>> wrote:
>>>
>>> When using RSS, the number of rxqs has to be a power of two.
>>> This is a problem because there is no API is dpdk that makes
>>> the application aware of that.
>>>
>>> A good compromise is to allow the application to request a
>>> number of rxqs that is not a power of 2, but having inactive
>>> queues that will never receive packets. In this configuration,
>>> a warning will be issued to users to let them know that
>>> this is not an optimal configuration.
>> 
>> Not sure I like this solution. I think an error should be returned with a 
>> log message instead. What if the next driver needs power of three or must be 
>> odd or even number. 
>> 
>> The bigger problem is the application is no longer portable for any given 
>> nic configuration.
>> 
>> We need a method for the application to query the system for these types of 
>> information. But as we do not have that API we need to just error the 
>> request off.
>
>
>The initial problem is that the driver says "I support a maximum
>of X queues" and if the application configures a lower number, it
>gets an error.
>
>There is no API in DPDK to tell that only specific number of queues
>are supported. Adding an API is a solution, but in this case it's
>probably overkill. With this patch, the driver can present the proper
>number of queues to the application, knowing that the spreading of
>the packets won't be ideal (some queues won't receive packets), but
>it will work.
>
>A step further in this direction would be to configure more queues
>than asked in hardware to do a better spreading, almost similar to
>what is done with RETA tables in mlx5. But this is more complicated
>to do, especially if we want it for 16.04.

Well I guess I must agree with the solution, but I am not real happy. Can we 
mark this a temp fix until we figure out a cleaner solution as I would not want 
this type of solution forever or be the standard way to handle these problems.

>
>Hope this is clearer with the explanation.
>
>Regards,
>Olivier
>
>


Regards,
Keith






[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-22 Thread Olivier Matz
Hi Keith,

On 03/21/2016 06:38 PM, Wiles, Keith wrote:
>> On Mar 21, 2016, at 11:10 AM, Olivier Matz  wrote:
>>
>> When using RSS, the number of rxqs has to be a power of two.
>> This is a problem because there is no API is dpdk that makes
>> the application aware of that.
>>
>> A good compromise is to allow the application to request a
>> number of rxqs that is not a power of 2, but having inactive
>> queues that will never receive packets. In this configuration,
>> a warning will be issued to users to let them know that
>> this is not an optimal configuration.
> 
> Not sure I like this solution. I think an error should be returned with a log 
> message instead. What if the next driver needs power of three or must be odd 
> or even number. 
> 
> The bigger problem is the application is no longer portable for any given nic 
> configuration.
> 
> We need a method for the application to query the system for these types of 
> information. But as we do not have that API we need to just error the request 
> off.


The initial problem is that the driver says "I support a maximum
of X queues" and if the application configures a lower number, it
gets an error.

There is no API in DPDK to tell that only specific number of queues
are supported. Adding an API is a solution, but in this case it's
probably overkill. With this patch, the driver can present the proper
number of queues to the application, knowing that the spreading of
the packets won't be ideal (some queues won't receive packets), but
it will work.

A step further in this direction would be to configure more queues
than asked in hardware to do a better spreading, almost similar to
what is done with RETA tables in mlx5. But this is more complicated
to do, especially if we want it for 16.04.

Hope this is clearer with the explanation.

Regards,
Olivier



[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-21 Thread Adrien Mazarguil
On Mon, Mar 21, 2016 at 05:08:04PM +0100, Olivier Matz wrote:
> When using RSS, the number of rxqs has to be a power of two.
> This is a problem because there is no API is dpdk that makes
> the application aware of that.
> 
> A good compromise is to allow the application to request a
> number of rxqs that is not a power of 2, but having inactive
> queues that will never receive packets. In this configuration,
> a warning will be issued to users to let them know that
> this is not an optimal configuration.
> 
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/mlx4/mlx4.c | 27 +--
>  1 file changed, 17 insertions(+), 10 deletions(-)

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-21 Thread Wiles, Keith


Sent from my iPhone

> On Mar 21, 2016, at 11:10 AM, Olivier Matz  wrote:
> 
> When using RSS, the number of rxqs has to be a power of two.
> This is a problem because there is no API is dpdk that makes
> the application aware of that.
> 
> A good compromise is to allow the application to request a
> number of rxqs that is not a power of 2, but having inactive
> queues that will never receive packets. In this configuration,
> a warning will be issued to users to let them know that
> this is not an optimal configuration.

Not sure I like this solution. I think an error should be returned with a log 
message instead. What if the next driver needs power of three or must be odd or 
even number. 

The bigger problem is the application is no longer portable for any given nic 
configuration.

We need a method for the application to query the system for these types of 
information. But as we do not have that API we need to just error the request 
off.

> 
> Signed-off-by: Olivier Matz 
> ---
> drivers/net/mlx4/mlx4.c | 27 +--
> 1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index cc4e9aa..eaf06db 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -698,7 +698,7 @@ txq_cleanup(struct txq *txq);
> 
> static int
> rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
> -  unsigned int socket, const struct rte_eth_rxconf *conf,
> +  unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
>  struct rte_mempool *mp);
> 
> static void
> @@ -734,12 +734,12 @@ dev_configure(struct rte_eth_dev *dev)
>}
>if (rxqs_n == priv->rxqs_n)
>return 0;
> -if ((rxqs_n & (rxqs_n - 1)) != 0) {
> -ERROR("%p: invalid number of RX queues (%u),"
> -  " must be a power of 2",
> +if (!rte_is_power_of_2(rxqs_n)) {
> +WARN("%p: number of RX queues (%u), must be a"
> +  " power of 2: remaining queues will be inactive",
>  (void *)dev, rxqs_n);
> -return EINVAL;
>}
> +
>INFO("%p: RX queues number update: %u -> %u",
> (void *)dev, priv->rxqs_n, rxqs_n);
>/* If RSS is enabled, disable it first. */
> @@ -775,7 +775,7 @@ dev_configure(struct rte_eth_dev *dev)
>priv->rss = 1;
>tmp = priv->rxqs_n;
>priv->rxqs_n = rxqs_n;
> -ret = rxq_setup(dev, >rxq_parent, 0, 0, NULL, NULL);
> +ret = rxq_setup(dev, >rxq_parent, 0, 0, 0, NULL, NULL);
>if (!ret)
>return 0;
>/* Failure, rollback. */
> @@ -3466,7 +3466,8 @@ rxq_setup_qp_rss(struct priv *priv, struct ibv_cq *cq, 
> uint16_t desc,
>attr.qpg.qpg_type = IBV_EXP_QPG_PARENT;
>/* TSS isn't necessary. */
>attr.qpg.parent_attrib.tss_child_count = 0;
> -attr.qpg.parent_attrib.rss_child_count = priv->rxqs_n;
> +attr.qpg.parent_attrib.rss_child_count =
> +rte_align32pow2(priv->rxqs_n + 1) >> 1;
>DEBUG("initializing parent RSS queue");
>} else {
>attr.qpg.qpg_type = IBV_EXP_QPG_CHILD_RX;
> @@ -3689,6 +3690,9 @@ skip_rtr:
>  *   Number of descriptors to configure in queue.
>  * @param socket
>  *   NUMA socket on which memory must be allocated.
> + * @param inactive
> + *   If true, the queue is disabled because its index is higher or
> + *   equal to the real number of queues, which must be a power of 2.
>  * @param[in] conf
>  *   Thresholds parameters.
>  * @param mp
> @@ -3699,7 +3703,7 @@ skip_rtr:
>  */
> static int
> rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
> -  unsigned int socket, const struct rte_eth_rxconf *conf,
> +  unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
>  struct rte_mempool *mp)
> {
>struct priv *priv = dev->data->dev_private;
> @@ -3800,7 +3804,7 @@ skip_mr:
>DEBUG("priv->device_attr.max_sge is %d",
>  priv->device_attr.max_sge);
> #ifdef RSS_SUPPORT
> -if (priv->rss)
> +if (priv->rss && !inactive)
>tmpl.qp = rxq_setup_qp_rss(priv, tmpl.cq, desc, parent,
>   tmpl.rd);
>else
> @@ -3936,6 +3940,7 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
> idx, uint16_t desc,
> {
>struct priv *priv = dev->data->dev_private;
>struct rxq *rxq = (*priv->rxqs)[idx];
> +int inactive = 0;
>int ret;
> 
>if (mlx4_is_secondary())
> @@ -3967,7 +3972,9 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
> idx, uint16_t desc,
>return -ENOMEM;
>}
>}
> -ret = rxq_setup(dev, rxq, desc, socket, conf, mp);
> +if (idx >= rte_align32pow2(priv->rxqs_n + 1) >> 1)
> +inactive = 1;
> +ret = rxq_setup(dev, rxq, desc, socket, inactive, conf, mp);
>if (ret)
>rte_free(rxq);
>else {
> -- 
> 2.1.4
> 


[dpdk-dev] [PATCH] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-21 Thread Olivier Matz
When using RSS, the number of rxqs has to be a power of two.
This is a problem because there is no API is dpdk that makes
the application aware of that.

A good compromise is to allow the application to request a
number of rxqs that is not a power of 2, but having inactive
queues that will never receive packets. In this configuration,
a warning will be issued to users to let them know that
this is not an optimal configuration.

Signed-off-by: Olivier Matz 
---
 drivers/net/mlx4/mlx4.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index cc4e9aa..eaf06db 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -698,7 +698,7 @@ txq_cleanup(struct txq *txq);

 static int
 rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
- unsigned int socket, const struct rte_eth_rxconf *conf,
+ unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
  struct rte_mempool *mp);

 static void
@@ -734,12 +734,12 @@ dev_configure(struct rte_eth_dev *dev)
}
if (rxqs_n == priv->rxqs_n)
return 0;
-   if ((rxqs_n & (rxqs_n - 1)) != 0) {
-   ERROR("%p: invalid number of RX queues (%u),"
- " must be a power of 2",
+   if (!rte_is_power_of_2(rxqs_n)) {
+   WARN("%p: number of RX queues (%u), must be a"
+ " power of 2: remaining queues will be inactive",
  (void *)dev, rxqs_n);
-   return EINVAL;
}
+
INFO("%p: RX queues number update: %u -> %u",
 (void *)dev, priv->rxqs_n, rxqs_n);
/* If RSS is enabled, disable it first. */
@@ -775,7 +775,7 @@ dev_configure(struct rte_eth_dev *dev)
priv->rss = 1;
tmp = priv->rxqs_n;
priv->rxqs_n = rxqs_n;
-   ret = rxq_setup(dev, >rxq_parent, 0, 0, NULL, NULL);
+   ret = rxq_setup(dev, >rxq_parent, 0, 0, 0, NULL, NULL);
if (!ret)
return 0;
/* Failure, rollback. */
@@ -3466,7 +3466,8 @@ rxq_setup_qp_rss(struct priv *priv, struct ibv_cq *cq, 
uint16_t desc,
attr.qpg.qpg_type = IBV_EXP_QPG_PARENT;
/* TSS isn't necessary. */
attr.qpg.parent_attrib.tss_child_count = 0;
-   attr.qpg.parent_attrib.rss_child_count = priv->rxqs_n;
+   attr.qpg.parent_attrib.rss_child_count =
+   rte_align32pow2(priv->rxqs_n + 1) >> 1;
DEBUG("initializing parent RSS queue");
} else {
attr.qpg.qpg_type = IBV_EXP_QPG_CHILD_RX;
@@ -3689,6 +3690,9 @@ skip_rtr:
  *   Number of descriptors to configure in queue.
  * @param socket
  *   NUMA socket on which memory must be allocated.
+ * @param inactive
+ *   If true, the queue is disabled because its index is higher or
+ *   equal to the real number of queues, which must be a power of 2.
  * @param[in] conf
  *   Thresholds parameters.
  * @param mp
@@ -3699,7 +3703,7 @@ skip_rtr:
  */
 static int
 rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
- unsigned int socket, const struct rte_eth_rxconf *conf,
+ unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
  struct rte_mempool *mp)
 {
struct priv *priv = dev->data->dev_private;
@@ -3800,7 +3804,7 @@ skip_mr:
DEBUG("priv->device_attr.max_sge is %d",
  priv->device_attr.max_sge);
 #ifdef RSS_SUPPORT
-   if (priv->rss)
+   if (priv->rss && !inactive)
tmpl.qp = rxq_setup_qp_rss(priv, tmpl.cq, desc, parent,
   tmpl.rd);
else
@@ -3936,6 +3940,7 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
idx, uint16_t desc,
 {
struct priv *priv = dev->data->dev_private;
struct rxq *rxq = (*priv->rxqs)[idx];
+   int inactive = 0;
int ret;

if (mlx4_is_secondary())
@@ -3967,7 +3972,9 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
idx, uint16_t desc,
return -ENOMEM;
}
}
-   ret = rxq_setup(dev, rxq, desc, socket, conf, mp);
+   if (idx >= rte_align32pow2(priv->rxqs_n + 1) >> 1)
+   inactive = 1;
+   ret = rxq_setup(dev, rxq, desc, socket, inactive, conf, mp);
if (ret)
rte_free(rxq);
else {
-- 
2.1.4