Re: [PATCH] mlx4: Fix tx ring affinity_mask creation

2015-04-28 Thread Or Gerlitz
On Fri, Apr 10, 2015 at 7:27 PM, Benjamin Poirier  wrote:
> By default, the number of tx queues is limited by the number of online cpus in
> mlx4_en_get_profile(). However, this limit no longer holds after the ethtool
> .set_channels method has been called. In that situation, the driver may access
> invalid bits of certain cpumask variables when queue_index > nr_cpu_ids.
>

Hi Benjamin,

Can this fix be related to a specific commit? If yes, it would be good if
you could add a Fixes: line here, so it would be easier to track down
which stable kernels the fix should go to.
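
For reference, the tag format is as follows (the sha and subject below are
placeholders standing in for the offending commit, not the actual one):

Fixes: 123456789abc ("net/mlx4_en: subject of the offending commit")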

Or.

> Signed-off-by: Benjamin Poirier 


Re: [PATCH] mlx4: Fix tx ring affinity_mask creation

2015-04-28 Thread Ido Shamay

On 4/28/2015 6:26 AM, Benjamin Poirier wrote:
> On 2015/04/13 17:22, Benjamin Poirier wrote:
> > On 2015/04/12 10:03, Ido Shamay wrote:
> > > Hi Benjamin,
> > > 
> > > On 4/10/2015 7:27 PM, Benjamin Poirier wrote:
> > > > By default, the number of tx queues is limited by the number of online cpus in
> > > > mlx4_en_get_profile(). However, this limit no longer holds after the ethtool
> > > > .set_channels method has been called. In that situation, the driver may access
> > > > invalid bits of certain cpumask variables when queue_index > nr_cpu_ids.
> > > 
> > > I must say I don't see the above issue with the current code.
> > > Whatever the modified value of priv->num_tx_rings_p_up, XPS will only be
> > > set on queues which have been given a CPU affinity mask (no access to
> > > invalid bits).
> > 
> > The problem is not with the call to netif_set_xps_queue(); it is with the
> > calls to cpu_online() and cpumask_set_cpu().
> > 
> > For example, if the user calls `ethtool -L ethX tx 32`, queue_index in
> > mlx4_en_create_tx_ring() can be up to 255. Depending on CONFIG_NR_CPUS
> > and CONFIG_CPUMASK_OFFSTACK, this may result in calls to cpu_online() and
> > cpumask_set_cpu() with cpu >= nr_cpumask_bits, which is an invalid usage
> > of the cpumask api. The driver will potentially read or write beyond the
> > end of the bitmap. With CONFIG_CPUMASK_OFFSTACK=y and
> > CONFIG_DEBUG_PER_CPU_MAPS=y, the aforementioned ethtool call on a system
> > with <32 cpus triggers the warning in cpumask_check().
> 
> Mellanox, can you please
> ack the patch as submitted, or
> clarify what changes you'd like to see given my reply above, or
> submit a fix of your own for this problem
> 
> Thanks,
> -Benjamin

Hi Benjamin,

After further review and a better understanding of the issue, we are okay
with your patch as is.

Thanks for the good work.

Acked-by: Ido Shamay 



Re: [PATCH] mlx4: Fix tx ring affinity_mask creation

2015-04-27 Thread Benjamin Poirier
On 2015/04/13 17:22, Benjamin Poirier wrote:
> On 2015/04/12 10:03, Ido Shamay wrote:
> > Hi Benjamin,
> > 
> > On 4/10/2015 7:27 PM, Benjamin Poirier wrote:
> > >By default, the number of tx queues is limited by the number of online 
> > >cpus in
> > >mlx4_en_get_profile(). However, this limit no longer holds after the 
> > >ethtool
> > >.set_channels method has been called. In that situation, the driver may 
> > >access
> > >invalid bits of certain cpumask variables when queue_index > nr_cpu_ids.
> > 
> > I must say I don't see the above issue with the current code.
> > Whatever the modified value of priv->num_tx_rings_p_up, XPS will only be
> > set on queues which have been given a CPU affinity mask (no access to
> > invalid bits).
> 
> The problem is not with the call to netif_set_xps_queue(); it is with the
> calls to cpu_online() and cpumask_set_cpu().
> 
> For example, if the user calls `ethtool -L ethX tx 32`, queue_index in
> mlx4_en_create_tx_ring() can be up to 255. Depending on CONFIG_NR_CPUS
> and CONFIG_CPUMASK_OFFSTACK, this may result in calls to cpu_online() and
> cpumask_set_cpu() with cpu >= nr_cpumask_bits, which is an invalid usage
> of the cpumask api. The driver will potentially read or write beyond the
> end of the bitmap. With CONFIG_CPUMASK_OFFSTACK=y and
> CONFIG_DEBUG_PER_CPU_MAPS=y, the aforementioned ethtool call on a system
> with <32 cpus triggers the warning in cpumask_check().
> 

Mellanox, can you please
ack the patch as submitted, or
clarify what changes you'd like to see given my reply above, or
submit a fix of your own for this problem

Thanks,
-Benjamin

> > 
> > It's true that when priv->num_tx_rings_p_up > nr_cpus, not all queues will
> > be set with XPS.
> > This is because the code tries to preserve a 1:1 mapping of queues to
> > cores, to avoid a double mapping of queues to cores.
> > I guess it's ok to break the 1:1 mapping in this condition, but the commit
> > message should say that instead of invalid bits. Please correct me if I'm
> > wrong.
> > 
> > > Signed-off-by: Benjamin Poirier 
> > > ---
> > >  drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > > index 55f9f5c..8c234ec 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > > @@ -143,8 +143,10 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
> > >  	ring->hwtstamp_tx_type = priv->hwtstamp_config.tx_type;
> > >  	ring->queue_index = queue_index;
> > >  
> > > -	if (queue_index < priv->num_tx_rings_p_up && cpu_online(queue_index))
> > > -		cpumask_set_cpu(queue_index, &ring->affinity_mask);
> > > +	if (queue_index < priv->num_tx_rings_p_up)
> > > +		cpumask_set_cpu_local_first(queue_index,
> > > +					    priv->mdev->dev->numa_node,
> > > +					    &ring->affinity_mask);
> > Moving from cpumask_set_cpu to cpumask_set_cpu_local_first is great, but
> > it should come in a different commit, since the behavior of the XPS is
> > changed here (xps_cpus[tx_ring[queue_index]] != queue_index from now on).
> > The commit message should state this behavior change.
> > Thanks a lot Benjamin.
> > >  	*pring = ring;
> > >  	return 0;
> > > @@ -213,7 +215,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
> > >  
> > >  	err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, &ring->context,
> > >  			       &ring->qp, &ring->qp_state);
> > > -	if (!user_prio && cpu_online(ring->queue_index))
> > > +	if (!cpumask_empty(&ring->affinity_mask))
> > >  		netif_set_xps_queue(priv->dev, &ring->affinity_mask,
> > >  				    ring->queue_index);
> > 


Re: [PATCH] mlx4: Fix tx ring affinity_mask creation

2015-04-13 Thread Benjamin Poirier
On 2015/04/12 10:03, Ido Shamay wrote:
> Hi Benjamin,
> 
> On 4/10/2015 7:27 PM, Benjamin Poirier wrote:
> >By default, the number of tx queues is limited by the number of online cpus 
> >in
> >mlx4_en_get_profile(). However, this limit no longer holds after the ethtool
> >.set_channels method has been called. In that situation, the driver may 
> >access
> >invalid bits of certain cpumask variables when queue_index > nr_cpu_ids.
> 
> I must say I don't see the above issue with the current code.
> Whatever the modified value of priv->num_tx_rings_p_up, XPS will only be
> set on queues which have been given a CPU affinity mask (no access to
> invalid bits).

The problem is not with the call to netif_set_xps_queue(); it is with the
calls to cpu_online() and cpumask_set_cpu().

For example, if the user calls `ethtool -L ethX tx 32`, queue_index in
mlx4_en_create_tx_ring() can be up to 255. Depending on CONFIG_NR_CPUS
and CONFIG_CPUMASK_OFFSTACK, this may result in calls to cpu_online() and
cpumask_set_cpu() with cpu >= nr_cpumask_bits, which is an invalid usage
of the cpumask api. The driver will potentially read or write beyond the
end of the bitmap. With CONFIG_CPUMASK_OFFSTACK=y and
CONFIG_DEBUG_PER_CPU_MAPS=y, the aforementioned ethtool call on a system
with <32 cpus triggers the warning in cpumask_check().
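
To make the out-of-range access concrete, here is a minimal userspace
sketch of the pattern described above (toy values; the names mirror the
kernel's, but the bitmap layout is only a model, not the real cpumask
implementation):

#include <stdio.h>

#define NR_CPUMASK_BITS   4	/* e.g. a 4-cpu system with an offstack mask */
#define NUM_TX_RINGS_P_UP 32	/* after `ethtool -L ethX tx 32` */

/* 4 valid bits, rounded up to one long */
static unsigned long online_mask[(NR_CPUMASK_BITS + 63) / 64];

/* Models the unchecked cpu_online() fast path: no range check, so
 * cpu >= NR_CPUMASK_BITS reads a bit beyond the valid part of the mask,
 * which is what cpumask_check() warns about in the kernel. */
static int cpu_online(int cpu)
{
	return (online_mask[cpu / 64] >> (cpu % 64)) & 1;
}

int main(void)
{
	online_mask[0] = 0xf;	/* cpus 0-3 online */

	/* The pre-patch condition: queue_index 4..31 passes the first
	 * check and calls cpu_online() with an out-of-range cpu. */
	for (int queue_index = 0; queue_index < 256; queue_index++)
		if (queue_index < NUM_TX_RINGS_P_UP && cpu_online(queue_index))
			printf("queue %d: would cpumask_set_cpu(%d)\n",
			       queue_index, queue_index);
	return 0;
}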

> 
> It's true that when priv->num_tx_rings_p_up > nr_cpus, not all queues will
> be set with XPS.
> This is because the code tries to preserve a 1:1 mapping of queues to
> cores, to avoid a double mapping of queues to cores.
> I guess it's ok to break the 1:1 mapping in this condition, but the commit
> message should say that instead of invalid bits. Please correct me if I'm
> wrong.
> 
> > Signed-off-by: Benjamin Poirier 
> > ---
> >  drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > index 55f9f5c..8c234ec 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> > @@ -143,8 +143,10 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
> >  	ring->hwtstamp_tx_type = priv->hwtstamp_config.tx_type;
> >  	ring->queue_index = queue_index;
> >  
> > -	if (queue_index < priv->num_tx_rings_p_up && cpu_online(queue_index))
> > -		cpumask_set_cpu(queue_index, &ring->affinity_mask);
> > +	if (queue_index < priv->num_tx_rings_p_up)
> > +		cpumask_set_cpu_local_first(queue_index,
> > +					    priv->mdev->dev->numa_node,
> > +					    &ring->affinity_mask);
> Moving from cpumask_set_cpu to cpumask_set_cpu_local_first is great, but
> it should come in a different commit, since the behavior of the XPS is
> changed here (xps_cpus[tx_ring[queue_index]] != queue_index from now on).
> The commit message should state this behavior change.
> Thanks a lot Benjamin.
> >  	*pring = ring;
> >  	return 0;
> > @@ -213,7 +215,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
> >  
> >  	err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, &ring->context,
> >  			       &ring->qp, &ring->qp_state);
> > -	if (!user_prio && cpu_online(ring->queue_index))
> > +	if (!cpumask_empty(&ring->affinity_mask))
> >  		netif_set_xps_queue(priv->dev, &ring->affinity_mask,
> >  				    ring->queue_index);
> 


Re: [PATCH] mlx4: Fix tx ring affinity_mask creation

2015-04-12 Thread Ido Shamay

Hi Benjamin,

On 4/10/2015 7:27 PM, Benjamin Poirier wrote:
> By default, the number of tx queues is limited by the number of online cpus in
> mlx4_en_get_profile(). However, this limit no longer holds after the ethtool
> .set_channels method has been called. In that situation, the driver may access
> invalid bits of certain cpumask variables when queue_index > nr_cpu_ids.

I must say I don't see the above issue with the current code.
Whatever the modified value of priv->num_tx_rings_p_up, XPS will only be
set on queues which have been given a CPU affinity mask (no access to
invalid bits).

It's true that when priv->num_tx_rings_p_up > nr_cpus, not all queues will
be set with XPS.
This is because the code tries to preserve a 1:1 mapping of queues to
cores, to avoid a double mapping of queues to cores.
I guess it's ok to break the 1:1 mapping in this condition, but the commit
message should say that instead of invalid bits. Please correct me if I'm
wrong.

> Signed-off-by: Benjamin Poirier 
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index 55f9f5c..8c234ec 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -143,8 +143,10 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
>  	ring->hwtstamp_tx_type = priv->hwtstamp_config.tx_type;
>  	ring->queue_index = queue_index;
>  
> -	if (queue_index < priv->num_tx_rings_p_up && cpu_online(queue_index))
> -		cpumask_set_cpu(queue_index, &ring->affinity_mask);
> +	if (queue_index < priv->num_tx_rings_p_up)
> +		cpumask_set_cpu_local_first(queue_index,
> +					    priv->mdev->dev->numa_node,
> +					    &ring->affinity_mask);

Moving from cpumask_set_cpu to cpumask_set_cpu_local_first is great, but
it should come in a different commit, since the behavior of the XPS is
changed here (xps_cpus[tx_ring[queue_index]] != queue_index from now on);
see the sketch after the diff.
The commit message should state this behavior change.
Thanks a lot Benjamin.

>  	*pring = ring;
>  	return 0;
> @@ -213,7 +215,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
>  
>  	err = mlx4_qp_to_ready(mdev->dev, &ring->wqres.mtt, &ring->context,
>  			       &ring->qp, &ring->qp_state);
> -	if (!user_prio && cpu_online(ring->queue_index))
> +	if (!cpumask_empty(&ring->affinity_mask))
>  		netif_set_xps_queue(priv->dev, &ring->affinity_mask,
>  				    ring->queue_index);
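
A userspace model of the queue-to-cpu selection change discussed above (a
sketch only: the node-local-first semantics of cpumask_set_cpu_local_first()
and the two-node topology are assumptions, not taken from the driver):

#include <stdio.h>

#define NCPU 8

/* Toy topology: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static const int cpu_node[NCPU] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Models cpumask_set_cpu_local_first(): pick the i-th cpu, counting the
 * cpus on numa_node first, then the remote ones. */
static int local_first(int i, int numa_node)
{
	int order[NCPU], n = 0;

	i %= NCPU;			/* wrap, as the kernel helper does */
	for (int c = 0; c < NCPU; c++)	/* node-local cpus first */
		if (cpu_node[c] == numa_node)
			order[n++] = c;
	for (int c = 0; c < NCPU; c++)	/* then everything else */
		if (cpu_node[c] != numa_node)
			order[n++] = c;
	return order[i];
}

int main(void)
{
	/* With the NIC on node 1, queue 0 maps to cpu 4, not cpu 0, so
	 * xps_cpus[tx_ring[queue_index]] != queue_index: the behavior
	 * change noted above. */
	for (int q = 0; q < NCPU; q++)
		printf("queue %d -> cpu %d\n", q, local_first(q, 1));
	return 0;
}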
  



