Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-27 Thread Thomas Gleixner
On Tue, 27 May 2014, Amir Vadai wrote:
> On 5/26/2014 3:39 PM, Thomas Gleixner wrote:
> > Even if you'd solve that and have a callback in the driver, the
> > callback can never restart the napi session directly. All it can do is
> > set a flag which needs to be checked in the RX path, right?
> > 
> > So what's the point of adding notifier call chain complexity, ordering
> > problems etc., if you can simply note the fact that the affinity
> > changed in the rmap itself and check that in the RX path?
> 
> I will try to find a solution in the spirit of what you suggested -
> to let the rmap library notify napi about affinity changes - without
> adding this complexity to the code.

I'd rather avoid the term "notify", because it's not an active
notification which results in an immediate NAPI restart. NAPI has to
poll the information in the rmap, or wherever you end up storing it.

Thanks,

tglx
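
[For concreteness, a minimal sketch of the polling described above, close
in spirit to what mlx4_en later did: whenever the budget is exhausted, the
poll routine checks whether it is still running on a CPU in the IRQ's
current affinity mask. Names such as cq->irq_desc follow mlx4_en
conventions but are illustrative, not a quote of the actual patch.]

#include <linux/netdevice.h>	/* napi_struct, napi_complete() */
#include <linux/irq.h>		/* irq_desc_get_irq_data() */
#include <linux/cpumask.h>
/* plus the driver's own header for the cq/priv types */

static int mlx4_en_poll_rx_cq(struct napi_struct *napi, int budget)
{
	struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi);
	struct mlx4_en_priv *priv = netdev_priv(cq->dev);
	int done = mlx4_en_process_rx_cq(cq->dev, cq, budget);

	if (done == budget) {
		const struct cpumask *aff =
			irq_desc_get_irq_data(cq->irq_desc)->affinity;

		/* Still on a CPU in the current mask: keep polling here. */
		if (likely(cpumask_test_cpu(smp_processor_id(), aff)))
			return budget;
		done = 0;	/* affinity moved: give up this session */
	}

	/* End the NAPI session and re-arm the CQ; the next interrupt
	 * starts a new session on a CPU in the (new) affinity mask.
	 */
	napi_complete(napi);
	mlx4_en_arm_cq(priv, cq);
	return done;
}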


Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-27 Thread Amir Vadai

On 5/26/2014 3:39 PM, Thomas Gleixner wrote:

[...]

> The rmap _IS_ instantiated by the driver, and both the driver and the
> networking core know about it.
> 
> So it's not completely different consumers. Just because it's a
> library does not mean it's disjunct from the code which uses it.
> 
> Aside from the fact that maintaining a per-irq notifier chain is going
> to be ugly as hell due to lifetime and locking issues, it's just
> opening a can of worms. How do you make sure that the invocation order
> is correct? What are the dependency rules of the driver restarting the
> napi session versus updating the rmap?
> 
> Even if you'd solve that and have a callback in the driver, the
> callback can never restart the napi session directly. All it can do is
> set a flag which needs to be checked in the RX path, right?
> 
> So what's the point of adding notifier call chain complexity, ordering
> problems etc., if you can simply note the fact that the affinity
> changed in the rmap itself and check that in the RX path?

I will try to find a solution in the spirit of what you suggested - to
let the rmap library notify napi about affinity changes - without
adding this complexity to the code.

Thanks,
Amir


Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-26 Thread Thomas Gleixner
On Mon, 26 May 2014, Amir Vadai wrote:
> On 5/26/2014 2:34 PM, Thomas Gleixner wrote:
> > You are not describing what needs to be notified and why. Please
> > explain the details of that and how the RFS (whatever that is) and the
> > network driver are connected
> The goal of RFS is to increase data-cache hit rate by steering
> kernel processing of packets in multi-queue devices to the CPU where the
> application thread consuming the packet is running.
> 
> In order to select the right queue, the networking stack needs to have a
> reverse map of IRQ affinity. This is the rmap that was added by Ben Hutchings
> [1]. To keep the rmap updated, cpu_rmap registers for affinity notifications.
> 
> This is the first affinity callback - it lives in a general library and
> not under net/...
>
> The motivation for the second irq affinity callback is:
> When traffic starts, the first packet fires an interrupt, which starts napi
> polling on the cpu according to the irq affinity.
> If there are always packets to be consumed by the napi polling, no further
> interrupts will be fired, and napi will consume all the packets from the cpu
> where it was started.
> If the user changes the irq affinity, napi polling will continue to be done
> from the original cpu.
> Only when the traffic pauses will the napi session finish, and when
> traffic resumes, the new napi session will run on the new cpu.
> This is problematic behavior, because from the user's point of view, cpu
> affinity can't be changed in a non-stop traffic scenario.
> 
> To solve this, the network driver should be notified of irq affinity change
> events, and restart the napi session. This could be done by closing the napi
> session and arming the interrupts. The next packet to arrive will trigger an
> interrupt, and a napi session will start, this time on the new CPU.
> 
> > and why this notification cannot be
> > propagated inside the network stack itself.
> 
> To my understanding, those are two different consumers of the same event: one
> is a general library maintaining a reverse irq affinity map, and the other is
> networking specific, and maybe even specific to one networking driver.

The rmap _IS_ instantiated by the driver, and both the driver and the
networking core know about it.

So it's not completely different consumers. Just because it's a
library does not mean it's disjunct from the code which uses it.

Aside from the fact that maintaining a per-irq notifier chain is going
to be ugly as hell due to lifetime and locking issues, it's just
opening a can of worms. How do you make sure that the invocation order
is correct? What are the dependency rules of the driver restarting the
napi session versus updating the rmap?

Even if you'd solve that and have a callback in the driver, the
callback can never restart the napi session directly. All it can do is
set a flag which needs to be checked in the RX path, right?

So what's the point of adding notifier call chain complexity, ordering
problems etc., if you can simply note the fact that the affinity
changed in the rmap itself and check that in the RX path?

Thanks,

tglx
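
[One possible reading of "note it in the rmap itself", sketched below:
let the rmap's existing notify callback record the change and let the RX
path poll it. The wrapper struct and the generation counter are
hypothetical - struct cpu_rmap has no such field upstream.]

#include <linux/atomic.h>
#include <linux/cpu_rmap.h>

/* Hypothetical: the rmap's existing affinity notifier bumps a generation
 * counter; the RX path merely compares generations.
 */
struct cpu_rmap_gen {
	struct cpu_rmap *rmap;
	atomic_t gen;			/* bumped on every affinity update */
};

static void rmap_note_affinity_change(struct cpu_rmap_gen *rg)
{
	atomic_inc(&rg->gen);		/* called from the rmap notifier */
}

/* RX path: a cheap check, no locking, no notifier chain. */
static bool rmap_affinity_changed(struct cpu_rmap_gen *rg, int *seen)
{
	int gen = atomic_read(&rg->gen);

	if (gen == *seen)
		return false;
	*seen = gen;
	return true;
}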






Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-26 Thread Amir Vadai

On 5/26/2014 2:34 PM, Thomas Gleixner wrote:
> On Mon, 26 May 2014, Amir Vadai wrote:
> > On 5/26/2014 2:15 PM, Thomas Gleixner wrote:
> > > On Sun, 25 May 2014, Amir Vadai wrote:
> > > > In order to do that, I need to add a new irq affinity notification
> > > > callback (In addition to the existing cpu_rmap notification). For
> > > > that I would like to extend irq_set_affinity_notifier() to have a
> > > > notifier call-chain instead of a single notifier callback.
> > > 
> > > Why? "I would like" is a non argument.
> > 
> > Current implementation enables only one callback to be registered for
> > irq affinity change notifications.
> 
> I'm well aware of that.
> 
> > cpu_rmap is registered to be notified - for RFS purposes.  mlx4_en (and
> > probably other network drivers) needs to be notified too, in order
> > to stop the napi polling on the old cpu and move to the new one.  To
> > enable more than one notification callback, I suggest using a
> > notifier call chain.
> 
> You are not describing what needs to be notified and why. Please
> explain the details of that and how the RFS (whatever that is) and the
> network driver are connected

The goal of RFS is to increase data-cache hit rate by steering
kernel processing of packets in multi-queue devices to the CPU where the
application thread consuming the packet is running.

In order to select the right queue, the networking stack needs to have a
reverse map of IRQ affinity. This is the rmap that was added by Ben
Hutchings [1]. To keep the rmap updated, cpu_rmap registers for
affinity notifications.

This is the first affinity callback - it lives in a general library
and not under net/...

The motivation for the second irq affinity callback is:
When traffic starts, the first packet fires an interrupt, which starts
napi polling on the cpu according to the irq affinity.
If there are always packets to be consumed by the napi polling, no
further interrupts will be fired, and napi will consume all the packets
from the cpu where it was started.
If the user changes the irq affinity, napi polling will continue to be
done from the original cpu.
Only when the traffic pauses will the napi session finish, and when
traffic resumes, the new napi session will run on the new cpu.
This is problematic behavior, because from the user's point of view, cpu
affinity can't be changed in a non-stop traffic scenario.

To solve this, the network driver should be notified of irq affinity
change events, and restart the napi session. This could be done by
closing the napi session and arming the interrupts. The next packet to
arrive will trigger an interrupt, and a napi session will start, this
time on the new CPU.

> and why this notification cannot be
> propagated inside the network stack itself.

To my understanding, those are two different consumers of the same
event: one is a general library maintaining a reverse irq affinity map,
and the other is networking specific, and maybe even specific to one
networking driver.

[1] - c39649c lib: cpu_rmap: CPU affinity reverse-mapping

Thanks,
Amir
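
[Sketched below is the restart described above, folded through the point
made earlier in the thread that a callback cannot restart NAPI directly:
an assumed per-ring notify callback only sets a bit, and the poll routine
closes the session and re-arms. All type and function names here
(rx_ring, ring_process_rx, ring_arm_irq, RING_AFFINITY_CHANGED) are
illustrative, not actual mlx4_en code.]

/* The notify callback may run in any context, so it only marks the
 * ring; the NAPI poll routine performs the actual restart.
 * ring->state is assumed to be an unsigned long used as a bitmap.
 */
static void ring_affinity_notify(struct irq_affinity_notify *notify,
				 const cpumask_t *mask)
{
	struct rx_ring *ring = container_of(notify, struct rx_ring,
					    affinity_notify);

	set_bit(RING_AFFINITY_CHANGED, &ring->state);
}

static int ring_napi_poll(struct napi_struct *napi, int budget)
{
	struct rx_ring *ring = container_of(napi, struct rx_ring, napi);
	int done = ring_process_rx(ring, budget);
	bool moved = test_and_clear_bit(RING_AFFINITY_CHANGED, &ring->state);

	if (done < budget || moved) {
		/* Close the session and re-arm the interrupt; the next
		 * packet starts a new session on the new CPU.
		 */
		napi_complete(napi);
		ring_arm_irq(ring);
		return moved ? 0 : done;
	}
	return budget;		/* keep polling on this CPU */
}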









Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-26 Thread Thomas Gleixner
On Mon, 26 May 2014, Amir Vadai wrote:

> On 5/26/2014 2:15 PM, Thomas Gleixner wrote:
> > On Sun, 25 May 2014, Amir Vadai wrote:
> > > In order to do that, I need to add a new irq affinity notification
> > > callback (In addition to the existing cpu_rmap notification). For
> > > that I would like to extend irq_set_affinity_notifier() to have a
> > > notifier call-chain instead of a single notifier callback.
> > 
> > Why? "I would like" is a non argument.
> 
> Current implementation enables only one callback to be registered for irq
> affinity change notifications.

I'm well aware of that.
 
> cpu_rmap is registered to be notified - for RFS purposes.  mlx4_en (and
> probably other network drivers) needs to be notified too, in order
> to stop the napi polling on the old cpu and move to the new one.  To
> enable more than one notification callback, I suggest using a
> notifier call chain.

You are not describing what needs to be notified and why. Please
explain the details of that and how the RFS (whatever that is) and the
network driver are connected and why this notification cannot be
propagated inside the network stack itself.

notifier chains are almost always a clear sign of a design disaster
and I'm not going to even think about it before I have a
concise explanation of the problem at hand and why a notifier chain is
a good solution.

Thanks,

tglx




Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-26 Thread Amir Vadai

On 5/26/2014 2:15 PM, Thomas Gleixner wrote:
> On Sun, 25 May 2014, Amir Vadai wrote:
> > In order to do that, I need to add a new irq affinity notification
> > callback (In addition to the existing cpu_rmap notification). For
> > that I would like to extend irq_set_affinity_notifier() to have a
> > notifier call-chain instead of a single notifier callback.
> 
> Why? "I would like" is a non argument.

Current implementation enables only one callback to be registered for
irq affinity change notifications.

cpu_rmap is registered to be notified - for RFS purposes.
mlx4_en (and probably other network drivers) needs to be notified too,
in order to stop the napi polling on the old cpu and move to the new one.
To enable more than one notification callback, I suggest using a
notifier call chain.

Amir
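
[For context, this is roughly how a driver wires up the rmap for
accelerated RFS. irq_cpu_rmap_add() registers an irq_affinity_notify for
the IRQ internally, occupying the single per-IRQ notifier slot - which is
why a second registration collides. rx_queue_irq() is a made-up helper
standing in for however the driver maps an RX queue to its IRQ number.]

#include <linux/cpu_rmap.h>
#include <linux/netdevice.h>

/* rx_cpu_rmap is the CONFIG_RFS_ACCEL field in struct net_device. */
static int setup_rx_cpu_rmap(struct net_device *dev, unsigned int n_rx)
{
	unsigned int i;
	int err;

	dev->rx_cpu_rmap = alloc_irq_cpu_rmap(n_rx);
	if (!dev->rx_cpu_rmap)
		return -ENOMEM;

	for (i = 0; i < n_rx; i++) {
		err = irq_cpu_rmap_add(dev->rx_cpu_rmap,
				       rx_queue_irq(dev, i));
		if (err) {
			/* Unregisters the notifiers added so far. */
			free_irq_cpu_rmap(dev->rx_cpu_rmap);
			dev->rx_cpu_rmap = NULL;
			return err;
		}
	}
	return 0;
}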








Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-26 Thread Thomas Gleixner
On Sun, 25 May 2014, Amir Vadai wrote:
> In order to do that, I need to add a new irq affinity notification
> callback (In addition to the existing cpu_rmap notification). For
> that I would like to extend irq_set_affinity_notifier() to have a
> notifier call-chain instead of a single notifier callback.

Why? "I would like" is a non argument.

Thanks,

tglx


Re: Extend irq_set_affinity_notifier() to use a call chain

2014-05-25 Thread Amir Vadai

On 5/25/2014 3:15 PM, Amir Vadai wrote:
> Hi,
> 
> I'm working for Mellanox on the mlx4_en NIC driver.
> 
> We need to be notified of irq affinity changes.
> This is because, during non-stop full-bandwidth traffic, napi will poll
> constantly and no interrupt will be fired. Because of that, even if the
> user changes the irq affinity, polling will continue to be done on the
> original CPU that was chosen on the first packet.
> We would like to be notified when the affinity is changed. When such an
> event happens, the driver will arm the interrupts and end the napi
> session. An interrupt will start a new napi session on the right CPU.
> 
> In order to do that, I need to add a new irq affinity notification
> callback (In addition to the existing cpu_rmap notification). For that I
> would like to extend irq_set_affinity_notifier() to have a notifier
> call-chain instead of a single notifier callback.
> 
> I wanted to hear your opinion on this, and unless there is a better
> solution, I will send an RFC later on.
> 
> References:
> - http://patchwork.ozlabs.org/patch/65244/ - Review done by Thomas
> Gleixner of Ben Hutchings' first version of the irq affinity notifiers.
> - http://patchwork.ozlabs.org/patch/79593/ - Final version of
> irq_set_affinity_notifier() that was applied
> 
> Thanks,
> Amir

I didn't mention that a patch setting an irq affinity notifier was
already added to mlx4_en [1], and now aRFS is broken because the
cpu_rmap callback is dropped when the mlx4_en callback is set.

[1] - http://patchwork.ozlabs.org/patch/348669/

Amir
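
[For reference, the single-slot interface under discussion, as it stood
in 3.15-era <linux/interrupt.h>: exactly one notifier per IRQ, so a
later irq_set_affinity_notifier() call for the same IRQ replaces (and
releases) the previous notifier - which is how the mlx4_en notifier
knocked out cpu_rmap's.]

struct irq_affinity_notify {
	unsigned int irq;
	struct kref kref;
	struct work_struct work;	/* notify() runs from a workqueue */
	void (*notify)(struct irq_affinity_notify *notify,
		       const cpumask_t *mask);
	void (*release)(struct kref *ref);
};

int irq_set_affinity_notifier(unsigned int irq,
			      struct irq_affinity_notify *notify);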


