Locking in network code

2018-05-06 Thread Jacob S. Moroni
Hello,

I have a stupid question about which variant of spin_lock to use in the
network stack, specifically inside RX handlers.

It's my understanding that skbuffs are normally passed into the stack
from soft IRQ context if the device is using NAPI, and from hard IRQ
context if it's not using NAPI (and I guess from process context too if
the driver does its own workqueue thing).
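
To illustrate the two delivery paths I mean, here's a rough sketch (the
foo_* names are made up; napi_gro_receive() and netif_rx() are the usual
entry points):

#include <linux/interrupt.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Stand-in for real driver code that pulls a frame off the RX ring. */
static struct sk_buff *foo_fetch_rx_frame(void *priv)
{
        return NULL;
}

/* NAPI driver: skbs enter the stack from the poll callback, which
 * runs in softirq (NET_RX_SOFTIRQ) context. */
static int foo_napi_poll(struct napi_struct *napi, int budget)
{
        struct sk_buff *skb;
        int work = 0;

        while (work < budget && (skb = foo_fetch_rx_frame(napi))) {
                napi_gro_receive(napi, skb);
                work++;
        }
        if (work < budget)
                napi_complete_done(napi, work);
        return work;
}

/* Non-NAPI driver: the skb is handed to the stack straight from the
 * hard IRQ handler via netif_rx(). */
static irqreturn_t foo_isr(int irq, void *dev_id)
{
        struct sk_buff *skb = foo_fetch_rx_frame(dev_id);

        if (skb)
                netif_rx(skb);
        return IRQ_HANDLED;
}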

So that means handlers registered with netdev_rx_handler_register
may end up being called from any of those contexts.
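
For context, a handler of the kind I mean looks like this (my_handle_frame
and the priv pointer are placeholders, not real macvlan code):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Called from __netif_receive_skb_core() for every skb received on the
 * device the handler is registered against. */
static rx_handler_result_t my_handle_frame(struct sk_buff **pskb)
{
        /* *pskb is the received frame; a handler may inspect it, steal
         * it (RX_HANDLER_CONSUMED), or hand it back (RX_HANDLER_PASS). */
        return RX_HANDLER_PASS;
}

static int attach_handler(struct net_device *lower_dev, void *priv)
{
        /* Needs RTNL held; macvlan does this when its port is created
         * on the lower device. */
        return netdev_rx_handler_register(lower_dev, my_handle_frame, priv);
}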

However, the RX handler in the macvlan code calls ip_check_defrag,
which could eventually lead to a call to ip_defrag, which ends
up taking a regular spin_lock around the call to ip_frag_queue.

Is this a risk of deadlock, and if not, why not?

What if you're running a system with one CPU and a packet fragment
arrives on a NAPI interface, and then, while the spin_lock is held,
another fragment somehow arrives on a different interface that does
its RX processing in hard IRQ context?
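
In other words, the classic single-CPU pattern I'm worried about (a
generic illustration, not actual stack code; the lock stands in for the
frag queue lock):

#include <linux/interrupt.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(frag_lock);

/* Runs in softirq context on the NAPI path. */
static void frag_work_from_softirq(void)
{
        spin_lock(&frag_lock);          /* plain variant: IRQs stay enabled */
        /* <-- a hard IRQ can fire right here, on the same CPU */
        spin_unlock(&frag_lock);
}

/* If a non-NAPI driver's hard IRQ handler reached the same lock... */
static irqreturn_t other_nic_isr(int irq, void *dev_id)
{
        spin_lock(&frag_lock);          /* on one CPU this spins forever:
                                         * the interrupted softirq still
                                         * holds the lock */
        spin_unlock(&frag_lock);
        return IRQ_HANDLED;
}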

-- 
  Jacob S. Moroni
  m...@jakemoroni.com


Re: DPAA TX Issues

2018-04-08 Thread Jacob S. Moroni
On Sun, Apr 8, 2018, at 7:46 PM, Jacob S. Moroni wrote:
> TLDR: Attempting to transmit faster than a few frames per second causes
> the TX FQ CGR to enter the congested state and remain there forever,
> even after transmission stops.
> [snip - full report below]

It turns out that irqbalance was causing all of the issues. After
disabling it and rebooting, the interfaces worked perfectly.

Perhaps there's an issue with how the qman/bman portals are defined
as per-cpu variables.

During the portal probe, the CPUs are assigned one-by-one and the
corresponding per-CPU portal is passed to request_irq as its last
argument. However, if the IRQ affinity later changes, the ISR could end
up running on one CPU while being passed a reference to the per-CPU
variable belonging to another CPU.
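
Roughly the pattern I have in mind (a sketch with made-up names, not the
actual qman/bman portal code):

#include <linux/bug.h>
#include <linux/interrupt.h>
#include <linux/percpu.h>
#include <linux/smp.h>

struct portal {
        int cpu;                /* CPU this portal was bound to at probe time */
        /* ... per-CPU portal state ... */
};

static DEFINE_PER_CPU(struct portal, portals);

static irqreturn_t portal_isr(int irq, void *ptr)
{
        struct portal *p = ptr;

        /* If irqbalance later moves this IRQ, the ISR runs on a CPU
         * other than the one whose portal it was given at probe time. */
        WARN_ON_ONCE(p->cpu != smp_processor_id());
        return IRQ_HANDLED;
}

static int portal_probe_one(int cpu, int irq)
{
        struct portal *p = per_cpu_ptr(&portals, cpu);

        p->cpu = cpu;
        /* The per-CPU portal is handed to request_irq as the cookie, so
         * the ISR always receives this CPU's portal no matter where the
         * interrupt actually fires later. */
        return request_irq(irq, portal_isr, 0, "portal", p);
}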

At least I know where to look now.

- Jake


DPAA TX Issues

2018-04-08 Thread Jacob S. Moroni
Hello Madalin,

I've been experiencing some issues with the DPAA Ethernet driver,
specifically related to frame transmission. Hopefully you can point
me in the right direction.

TLDR: Attempting to transmit faster than a few frames per second causes
the TX FQ CGR to enter the congested state and remain there forever,
even after transmission stops.

The hardware is a T2080RDB, running from the tip of net-next, using
the standard t2080rdb device tree and corenet64_smp_defconfig kernel
config. No changes were made to any of the files. The issue occurs
with 4.16.1 stable as well. In fact, the only time I've been able
to achieve reliable frame transmission was with the SDK 4.1 kernel.

For my tests, I'm running iperf3 both with and without the -R
option (send/receive). When using a USB Ethernet adapter, there
are no issues.

The issue is that the TX frame queues seem to get "stuck" when
transmitting at rates greater than a few frames per second. Ping works
fine, but anything that could potentially cause multiple TX frames to be
enqueued seems to trigger the problem.

If I run iperf3 in reverse mode (with the T2080RDB receiving), then
I can achieve ~940 Mbps, but this is also somewhat unreliable.

If I run it with the T2080RDB transmitting, the test will never
complete. Sometimes it starts transmitting for a few seconds then stops,
and other times it never even starts. This also seems to force the
interface into a bad state.

The ethtool stats show that the interface has entered
congestion a few times, and that it's currently congested. The fact
that it's currently congested even after stopping transmission
indicates that the FQ somehow stopped being drained. I've also
noticed that whenever this issue occurs, the TX confirmation
counters are always less than the TX packet counters.

When it gets into this state, I can see that the memory usage keeps
climbing until it reaches roughly the CGR threshold (about 100 MB).

Any idea what could prevent the TX FQ from being drained? My first
guess was flow control, but it's completely disabled.

I tried messing with the egress congestion threshold, workqueue
assignments, etc., but nothing seemed to have any effect.

If you need any more information or want me to run any tests,
please let me know.

Thanks,
-- 
  Jacob S. Moroni
  m...@jakemoroni.com