Re: Fighting out-of-order reception with RPS?

2015-07-14 Thread Oliver Hartkopp

On 13.07.2015 06:57, Eric Dumazet wrote:

On Sun, 2015-07-12 at 21:15 +0200, Oliver Hartkopp wrote:


E.g. with

skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

and

echo f  /sys/class/net/can0/queues/rx-0/rps_cpus

I get properly ordered CAN frames - even with netif_rx() processed skbs. I
just want to have this stuff to be enabled by default for CAN interfaces to
kill the OOO frame issue.


I doubt your skb_set_hash() makes any difference.

RPS prefers a L4 hash anyway (skb_get_hash()), so flow dissection
happens.



Please take a look into netif_rx_internal() in net/core/dev.c

http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/net/core/dev.c?id=v4.2-rc1#n3486

with CONFIG_RPS netif_rx() takes care about the rps cpu and puts the skb into 
the correct hash specific queue.


As we usually have several PF_CAN sockets which get CAN frames from a specific 
CAN interface it makes no sense to enqueue packets into queues sorted by 
receiving sockets or L4 hash (we don't have L4 addressing on CAN).


The skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2) makes sure that the skbs 
from a specific CAN netdev are always processed in the same queue.


When this is not wanted in 'fastpath netif_rx()' why is the CONFIG_RPS section 
in there?


What is the advantage of implementing NAPI and a 'private sk_buf queue' 
suggested by Tom in http://marc.info/?l=linux-canm=143681458003381w=2 to set 
the hash as shown and enable rps_cpus?


The latter just looks just like a complexity boost to have a functionality 
that already exists in netif_rx(). I just want to understand it.


Regards,
Oliver


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-14 Thread David Miller
From: Tom Herbert t...@herbertland.com
Date: Tue, 14 Jul 2015 11:02:16 -0700

 then implement NAPI for CAN drivers as has been suggested now by
 three very experienced developers. This solves the your OOO problem
 and moves drivers to NAPI which is the greatly preferred interface.

+1

+1

+1
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-14 Thread Oliver Hartkopp



On 14.07.2015 21:03, David Miller wrote:

From: Tom Herbert t...@herbertland.com
Date: Tue, 14 Jul 2015 11:02:16 -0700


then implement NAPI for CAN drivers as has been suggested now by
three very experienced developers. This solves the your OOO problem
and moves drivers to NAPI which is the greatly preferred interface.


+1

+1

+1


ok got it :-)


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-14 Thread Tom Herbert
On Tue, Jul 14, 2015 at 10:09 AM, Oliver Hartkopp
socket...@hartkopp.net wrote:
 On 13.07.2015 06:57, Eric Dumazet wrote:

 On Sun, 2015-07-12 at 21:15 +0200, Oliver Hartkopp wrote:

 E.g. with

 skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

 and

 echo f  /sys/class/net/can0/queues/rx-0/rps_cpus

 I get properly ordered CAN frames - even with netif_rx() processed skbs.
 I
 just want to have this stuff to be enabled by default for CAN interfaces
 to
 kill the OOO frame issue.


 I doubt your skb_set_hash() makes any difference.

 RPS prefers a L4 hash anyway (skb_get_hash()), so flow dissection
 happens.


 Please take a look into netif_rx_internal() in net/core/dev.c

 http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/net/core/dev.c?id=v4.2-rc1#n3486

 with CONFIG_RPS netif_rx() takes care about the rps cpu and puts the skb
 into the correct hash specific queue.

 As we usually have several PF_CAN sockets which get CAN frames from a
 specific CAN interface it makes no sense to enqueue packets into queues
 sorted by receiving sockets or L4 hash (we don't have L4 addressing on CAN).

 The skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2) makes sure that the
 skbs from a specific CAN netdev are always processed in the same queue.

 When this is not wanted in 'fastpath netif_rx()' why is the CONFIG_RPS
 section in there?

 What is the advantage of implementing NAPI and a 'private sk_buf queue'
 suggested by Tom in http://marc.info/?l=linux-canm=143681458003381w=2 to
 set the hash as shown and enable rps_cpus?

 The latter just looks just like a complexity boost to have a functionality
 that already exists in netif_rx(). I just want to understand it.


netif_rx allows OOO packets and the way you're fixing that with RPS
and setting a flow hash are nothing more than hacks. Sorry, but
there is no way we are going to change RPS just to satisfy this your
case. If you want to do this right and get something in the kernel,
then implement NAPI for CAN drivers as has been suggested now by three
very experienced developers. This solves the your OOO problem and
moves drivers to NAPI which is the greatly preferred interface. I
really don't see why there needs to be any more discussion on this.

Tom



 Regards,
 Oliver


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-13 Thread Tom Herbert
On Sun, Jul 12, 2015 at 12:15 PM, Oliver Hartkopp
socket...@hartkopp.net wrote:
 Hello Eric,

 On 07/11/2015 06:35 AM, Eric Dumazet wrote:
 On Fri, 2015-07-10 at 22:36 +0200, Oliver Hartkopp wrote:

 Hm. Doesn't sound like a good solution when there's a difference between 
 NAPI
 and non-NAPI drivers in matters of OOO, right?

 Isn't OOO a problem for you ? Then you either have to :

 1) Use a single CPU to handle IRQ from the device
 2) Use NAPI


 See below ...

 What about checking in netif_rx() if the non-NAPI driver has set a hash (aka
 the driver is OOO sensitive)?
 And if so we could automatically set rps_cpus for this interface in a way 
 that
 all CPUs are enabled to take skbs following the hash.

 Wow, netif_rx() is packet processing fast path, certainly not the place
 to add controlling path decisions.

 My only requirement is to be able to pick CAN frames (contained in skbs) from
 the socket in the same order they have been received.

 Please convert your driver to NAPI. You might then even benefit from
 GRO.

 Just some remarks about CAN and CAN frames as you suggest GRO which is
 completely pointless for CAN.

 CAN frames have a 11 or 29 bit CAN Identifier (no MAC but content addressing)
 and 0 to 64 bytes of payload. Therefore the MTU for CAN interfaces is 16 or 72
 byte (see struct can(fd)_frame). Each skbuff contains a single CAN frame.

 There are CAN controllers which have a FIFO for up to 32 CAN frames, e.g.
 flexcan.c which also implements NAPI. Others (e.g. sja1000.c) don't have any
 FIFO and the reading of the CAN frame from the memory mapped registers needs
 to be processed in the irq context instantly. So 'fast path' netif_rx() is
 reasonable, right?

 So why is it not possible to pass netif_rx() skbs from a specific CAN network
 interface to whatever queue where they are processed in order?

 E.g. with

 skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

 and

 echo f  /sys/class/net/can0/queues/rx-0/rps_cpus

 I get properly ordered CAN frames - even with netif_rx() processed skbs. I
 just want to have this stuff to be enabled by default for CAN interfaces to
 kill the OOO frame issue.

If you really must process the CAN FIFO in the hard interrupt then
create a private sk_buf queue. In the interrupt, dequeue from FIFO and
enqueue on the sk_buf queue. Then schedule NAPI, and when that runs
process the sk_buf queue calling call netif_receive_skb for each
enqueued skb. Pretty simple actually :-)

 Regards,
 Oliver

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-12 Thread David Miller
From: Oliver Hartkopp socket...@hartkopp.net
Date: Sun, 12 Jul 2015 21:15:36 +0200

 Just some remarks about CAN and CAN frames as you suggest GRO which is
 completely pointless for CAN.

GRO may be pointless for CAN, but NAPI _definitely_ is useful for every
single network device, period.

So you should do NAPI for reasons outside of packet receive ordering,
and in return you'll have your packet ordering problem solved as well.

I really am stumped as to why you are avoiding NAPI so vehemently.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-12 Thread Oliver Hartkopp
Hello Eric,

On 07/11/2015 06:35 AM, Eric Dumazet wrote:
 On Fri, 2015-07-10 at 22:36 +0200, Oliver Hartkopp wrote:

 Hm. Doesn't sound like a good solution when there's a difference between NAPI
 and non-NAPI drivers in matters of OOO, right?
 
 Isn't OOO a problem for you ? Then you either have to :
 
 1) Use a single CPU to handle IRQ from the device
 2) Use NAPI
 

See below ...

 What about checking in netif_rx() if the non-NAPI driver has set a hash (aka
 the driver is OOO sensitive)?
 And if so we could automatically set rps_cpus for this interface in a way 
 that
 all CPUs are enabled to take skbs following the hash.
 
 Wow, netif_rx() is packet processing fast path, certainly not the place
 to add controlling path decisions.

My only requirement is to be able to pick CAN frames (contained in skbs) from
the socket in the same order they have been received.

 Please convert your driver to NAPI. You might then even benefit from
 GRO.

Just some remarks about CAN and CAN frames as you suggest GRO which is
completely pointless for CAN.

CAN frames have a 11 or 29 bit CAN Identifier (no MAC but content addressing)
and 0 to 64 bytes of payload. Therefore the MTU for CAN interfaces is 16 or 72
byte (see struct can(fd)_frame). Each skbuff contains a single CAN frame.

There are CAN controllers which have a FIFO for up to 32 CAN frames, e.g.
flexcan.c which also implements NAPI. Others (e.g. sja1000.c) don't have any
FIFO and the reading of the CAN frame from the memory mapped registers needs
to be processed in the irq context instantly. So 'fast path' netif_rx() is
reasonable, right?

So why is it not possible to pass netif_rx() skbs from a specific CAN network
interface to whatever queue where they are processed in order?

E.g. with

skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

and

echo f  /sys/class/net/can0/queues/rx-0/rps_cpus

I get properly ordered CAN frames - even with netif_rx() processed skbs. I
just want to have this stuff to be enabled by default for CAN interfaces to
kill the OOO frame issue.

Regards,
Oliver

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-12 Thread Eric Dumazet
On Sun, 2015-07-12 at 21:15 +0200, Oliver Hartkopp wrote:

 E.g. with
 
   skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);
 
 and
 
   echo f  /sys/class/net/can0/queues/rx-0/rps_cpus
 
 I get properly ordered CAN frames - even with netif_rx() processed skbs. I
 just want to have this stuff to be enabled by default for CAN interfaces to
 kill the OOO frame issue.

I doubt your skb_set_hash() makes any difference.

RPS prefers a L4 hash anyway (skb_get_hash()), so flow dissection
happens.



--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-11 Thread Eric Dumazet
On Fri, 2015-07-10 at 22:36 +0200, Oliver Hartkopp wrote:
 On 07/10/2015 04:48 AM, Tom Herbert wrote:
  On Wed, Jul 8, 2015 at 10:55 PM, Oliver Hartkopp socket...@hartkopp.net 
  wrote:
  Both drivers do not use NAPI. The just follow the way
 
  interrupt - alloc_skb() - fill skb - netif_rx(skb)
 
  I'm usually testing with the USB adapters as the PCIe setup is not very
  handy.
 
  Okay, I see what is happening. In netif_rx when RPS is not enabled
  that packet is queued to the backlog queue for the local CPU. Since
  you're doing round robin on the interrupts then OOO packets can be a
  result. Unfortunately, this is the expected behavior. The correct
  kernel fix would be to move to these drivers to use NAPI.
 
 Hm. Doesn't sound like a good solution when there's a difference between NAPI
 and non-NAPI drivers in matters of OOO, right?

Isn't OOO a problem for you ? Then you either have to :

1) Use a single CPU to handle IRQ from the device
2) Use NAPI

 
  RPS
  eliminates the OOO, but if there is no ability to derive a flow hash
  from packets everything will wind up one queue without load balancing.
 
 Correct.
 
 That's why I added
 
   skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);
 
 in my driver, because the only relevant flow identifiction is the number of
 the incoming CAN interface.
 
  Besides that, automatically setting RPS in drivers is a difficult
  proposition since there is no definitively correct way to do that in
  an arbitrary configuration.
 
 What about checking in netif_rx() if the non-NAPI driver has set a hash (aka
 the driver is OOO sensitive)?
 And if so we could automatically set rps_cpus for this interface in a way that
 all CPUs are enabled to take skbs following the hash.

Wow, netif_rx() is packet processing fast path, certainly not the place
to add controlling path decisions.

Please convert your driver to NAPI. You might then even benefit from
GRO.


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-10 Thread Oliver Hartkopp
On 07/10/2015 04:48 AM, Tom Herbert wrote:
 On Wed, Jul 8, 2015 at 10:55 PM, Oliver Hartkopp socket...@hartkopp.net 
 wrote:
 Both drivers do not use NAPI. The just follow the way

 interrupt - alloc_skb() - fill skb - netif_rx(skb)

 I'm usually testing with the USB adapters as the PCIe setup is not very
 handy.

 Okay, I see what is happening. In netif_rx when RPS is not enabled
 that packet is queued to the backlog queue for the local CPU. Since
 you're doing round robin on the interrupts then OOO packets can be a
 result. Unfortunately, this is the expected behavior. The correct
 kernel fix would be to move to these drivers to use NAPI.

Hm. Doesn't sound like a good solution when there's a difference between NAPI
and non-NAPI drivers in matters of OOO, right?

 RPS
 eliminates the OOO, but if there is no ability to derive a flow hash
 from packets everything will wind up one queue without load balancing.

Correct.

That's why I added

skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

in my driver, because the only relevant flow identifiction is the number of
the incoming CAN interface.

 Besides that, automatically setting RPS in drivers is a difficult
 proposition since there is no definitively correct way to do that in
 an arbitrary configuration.

What about checking in netif_rx() if the non-NAPI driver has set a hash (aka
the driver is OOO sensitive)?
And if so we could automatically set rps_cpus for this interface in a way that
all CPUs are enabled to take skbs following the hash.

Best regards,
Oliver


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-09 Thread Tom Herbert
On Wed, Jul 8, 2015 at 10:55 PM, Oliver Hartkopp socket...@hartkopp.net wrote:

 On 08.07.2015 23:17, Tom Herbert wrote:

 On Wed, Jul 8, 2015 at 10:49 AM, Oliver Hartkopp socket...@hartkopp.net
 wrote:

 (..)

 When receiving CAN frames from a specific CAN network interface (e.g.
 can0)
 the frames are sporadically out-of-order on SMP systems like my Core i7
 laptop
 with 4 CPUs. This out-of-order reception kills reliable communication
 e.g. for
 CAN transport protocols.

 First approach was to set the smp_affinity for the USB adapter on irq 28
 with:

 (..)

 Next idea was to use RPS after reading
 Documentation/networking/scaling.txt

 (..)


 My two questions:

 1. Is there any better solution to meet the described requirements?


 I would suggest that you look into how there are OOO packets in the
 first place. Even if the interrupts is allowed to happen on different
 CPUs by sm_affinity, NAPI execution should be serialized for the
 device so that OOO shouldn't happen. The result of your RPS setting
 should be all packets go to the same queue, this shouldn't normally
 affect the ordering. Looking at drivers/net/can there are apparently
 several variants of the driver. Do you know which one you're running?


 I have two CAN hardware interfaces I can test together with a SMP system:

 1. PCAN-USB using the driver at drivers/net/can/usb/peak_usb/
 2. PCAN Compact PCIe using drivers/net/can/sja1000/(peak_pci.c / sja1000.c)

 Both drivers do not use NAPI. The just follow the way

 interrupt - alloc_skb() - fill skb - netif_rx(skb)

 I'm usually testing with the USB adapters as the PCIe setup is not very
 handy.

Okay, I see what is happening. In netif_rx when RPS is not enabled
that packet is queued to the backlog queue for the local CPU. Since
you're doing round robin on the interrupts then OOO packets can be a
result. Unfortunately, this is the expected behavior. The correct
kernel fix would be to move to these drivers to use NAPI. RPS
eliminates the OOO, but if there is no ability to derive a flow hash
from packets everything will wind up one queue without load balancing.
Besides that, automatically setting RPS in drivers is a difficult
proposition since there is no definitively correct way to do that in
an arbitrary configuration.

Tom

 Best regards,
 Oliver



 2. If not: How can enable this RPS solution by default for CAN
 interfaces?


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-09 Thread Holger Schurig
 and has to be done by hand depending on where the CAN interfaces
 are attached to the system.

No ...  udev rules rock!   :-)
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-09 Thread Oliver Hartkopp

On 09.07.2015 08:34, Holger Schurig wrote:

and has to be done by hand depending on where the CAN interfaces
are attached to the system.


No ...  udev rules rock!   :-)



Yeah. But it can not be the approach to fix a known problem in the kernel by 
urging people to make workarounds in userspace. For my testing I did it by 
hand. But in real life the kernel has to do 'the right thing' by having wise 
defaults without interaction.


Regards,
Oliver
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-08 Thread Tom Herbert
On Wed, Jul 8, 2015 at 10:49 AM, Oliver Hartkopp socket...@hartkopp.net wrote:
 I'm picking up the request 'Setting RPS affinities from network driver' from
 Sunil Kovvuri http://marc.info/?t=14242402351r=1w=2 as I assume to have
 the same issue here.

 When receiving CAN frames from a specific CAN network interface (e.g. can0)
 the frames are sporadically out-of-order on SMP systems like my Core i7 laptop
 with 4 CPUs. This out-of-order reception kills reliable communication e.g. for
 CAN transport protocols.

 First approach was to set the smp_affinity for the USB adapter on irq 28 with:

 echo 1  /proc/irq/28/smp_affinity

 This worked in my case but it looks wrong to pin the USB host adapter to a
 single CPU and has to be done by hand depending on where the CAN interfaces
 are attached to the system.

 Next idea was to use RPS after reading Documentation/networking/scaling.txt

 As the only relevant flow identifiction is the number of the incoming CAN
 interface I added

 skb_set_hash(skb, dev-ifindex, PKT_HASH_TYPE_L2);

 when creating CAN skbs with alloc_can_skb() in drivers/net/can/dev.c

 After enabling RPS for my four CPUs with

 echo f  /sys/class/net/can0/queues/rx-0/rps_cpus

 I had no more out-of-order frames in my system :-)

 My two questions:

 1. Is there any better solution to meet the described requirements?

I would suggest that you look into how there are OOO packets in the
first place. Even if the interrupts is allowed to happen on different
CPUs by sm_affinity, NAPI execution should be serialized for the
device so that OOO shouldn't happen. The result of your RPS setting
should be all packets go to the same queue, this shouldn't normally
affect the ordering. Looking at drivers/net/can there are apparently
several variants of the driver. Do you know which one you're running?

Tom

 2. If not: How can enable this RPS solution by default for CAN interfaces?

 Best regards,
 Oliver

 --
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fighting out-of-order reception with RPS?

2015-07-08 Thread Oliver Hartkopp


On 08.07.2015 23:17, Tom Herbert wrote:

On Wed, Jul 8, 2015 at 10:49 AM, Oliver Hartkopp socket...@hartkopp.net wrote:

(..)

When receiving CAN frames from a specific CAN network interface (e.g. can0)
the frames are sporadically out-of-order on SMP systems like my Core i7 laptop
with 4 CPUs. This out-of-order reception kills reliable communication e.g. for
CAN transport protocols.

First approach was to set the smp_affinity for the USB adapter on irq 28 with:

(..)

Next idea was to use RPS after reading Documentation/networking/scaling.txt

(..)


My two questions:

1. Is there any better solution to meet the described requirements?


I would suggest that you look into how there are OOO packets in the
first place. Even if the interrupts is allowed to happen on different
CPUs by sm_affinity, NAPI execution should be serialized for the
device so that OOO shouldn't happen. The result of your RPS setting
should be all packets go to the same queue, this shouldn't normally
affect the ordering. Looking at drivers/net/can there are apparently
several variants of the driver. Do you know which one you're running?


I have two CAN hardware interfaces I can test together with a SMP system:

1. PCAN-USB using the driver at drivers/net/can/usb/peak_usb/
2. PCAN Compact PCIe using drivers/net/can/sja1000/(peak_pci.c / sja1000.c)

Both drivers do not use NAPI. The just follow the way

interrupt - alloc_skb() - fill skb - netif_rx(skb)

I'm usually testing with the USB adapters as the PCIe setup is not very handy.

Best regards,
Oliver




2. If not: How can enable this RPS solution by default for CAN interfaces?


--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html