Re: [lwip-users] Transmission stall after ARP broadcast

2018-02-19 Thread goldsi...@gmx.de

On 20.02.2018 08:33, Stephan Hilchenbach wrote:

> Hello,
>
> this problem was not caused by the lwIP stack but by the Ethernet driver. It
> was solved with the help of TI support:
> https://e2e.ti.com/support/arm/sitara_arm/f/791/t/663155
> The address lookup engine (ALE) processes every received packet to determine
> which port(s), if any, the packet should be forwarded to. With the device
> configured as a switch, the port state of all ports was set to forwarding,
> even for a port that was not connected. Setting the unconnected ports to
> blocking by default and switching them to forwarding only after the PHY
> reports a link solved the problem.


OK, good to know :-)
Thanks for sharing the answer.

Simon

___
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users


Re: [lwip-users] Transmission stall after ARP broadcast

2018-02-19 Thread Stephan Hilchenbach
Hello,

this problem was not caused by the lwIP stack but by the Ethernet driver. It
was solved with the help of TI support:
https://e2e.ti.com/support/arm/sitara_arm/f/791/t/663155
The address lookup engine (ALE) processes every received packet to determine
which port(s), if any, the packet should be forwarded to. With the device
configured as a switch, the port state of all ports was set to forwarding,
even for a port that was not connected. Setting the unconnected ports to
blocking by default and switching them to forwarding only after the PHY
reports a link solved the problem.
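
For readers who want to see the shape of the fix, here is a minimal sketch of
"blocked by default, forwarding after link". It is not the TI driver code: the
register array is a host-side stand-in for the ALE port-control registers, and
the 2-bit state encoding, the port numbering (host port 0, external ports 1
and 2) and all function names are assumptions that should be checked against
the device reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding of the 2-bit port-state field of an ALE port-control
 * register; verify the values against the reference manual. */
enum ale_port_state {
    ALE_PORT_DISABLED = 0,
    ALE_PORT_BLOCKED  = 1,
    ALE_PORT_LEARN    = 2,
    ALE_PORT_FORWARD  = 3
};

#define ALE_PORT_STATE_MASK 0x3u
#define ALE_NUM_PORTS       3u  /* assumed: host port 0 plus two external ports */

/* Stand-in for the memory-mapped port-control registers so the sketch runs
 * on a host; a real driver would access the hardware registers instead. */
static volatile uint32_t ale_portctl[ALE_NUM_PORTS];

/* Read-modify-write of the state field only; other bits are preserved. */
void ale_set_port_state(unsigned port, enum ale_port_state state)
{
    assert(port < ALE_NUM_PORTS);
    uint32_t v = ale_portctl[port];
    v &= ~ALE_PORT_STATE_MASK;
    v |= (uint32_t)state & ALE_PORT_STATE_MASK;
    ale_portctl[port] = v;
}

enum ale_port_state ale_get_port_state(unsigned port)
{
    assert(port < ALE_NUM_PORTS);
    return (enum ale_port_state)(ale_portctl[port] & ALE_PORT_STATE_MASK);
}

/* Start-up: the host port forwards, every external port stays blocked
 * until its PHY reports a link. */
void ale_init_port_states(void)
{
    ale_set_port_state(0, ALE_PORT_FORWARD);
    for (unsigned p = 1; p < ALE_NUM_PORTS; p++)
        ale_set_port_state(p, ALE_PORT_BLOCKED);
}

/* Called by the link-monitoring task whenever a port's link state changes. */
void ale_on_link_change(unsigned port, bool link_up)
{
    ale_set_port_state(port, link_up ? ALE_PORT_FORWARD : ALE_PORT_BLOCKED);
}
```

The cyclic task that checks the PHYs would call ale_on_link_change() on every
link transition, so an unplugged port falls back to blocking again.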

Best Regards,
Stephan




Re: [lwip-users] Transmission stall after ARP broadcast

2018-02-16 Thread Stephan Hilchenbach
>> So is there *any* communication from the device after this? Maybe on the 2nd 
>> port?
Receiving continues after the incident: the DMA still handles incoming
packets but no longer accepts any outgoing packets. The second port is not
connected, and its PHY is not active.

>> I don't know how the software on that processor is adapted to lwIP but 
>> strictly speaking, this is not a switch but 2 MACs connected via the PRUs. 
>> They might have bugs, too :-)
The PRUs are not used; it is the CPSW_3G switch. What I found is that I can
avoid this problem (so far) by reducing the MDIO communication with the second
port's PHY to a minimum. Initially the driver tried to auto-negotiate with the
PHY even when there was no link. I have changed this and now wait for a link
first. I checked all the code lines, and lwIP is not involved in these
procedures. At this point I am confused: somehow lwIP seems to be influenced
by the PHY communication and the state of the switch, because there is no
delay between the last data packet sent and the unexpected ARP broadcast,
which is then the last transmission of the DMA. Probably memory is being
overwritten somewhere. The problem also disappears as soon as the second port
is connected too.
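
A minimal sketch of "wait for a link first", assuming a PHY with the standard
IEEE 802.3 Clause 22 status register (BMSR). One poll costs a single MDIO
read, and nothing else is touched while the link is down. The mdio_ops
structure, phy_poll() and the fake PHY used to run the sketch on a host are
hypothetical names, not part of any TI or lwIP API. The BMSR link bit latches
low, so the first poll after a link-up may still report "no link"; the next
poll corrects this.

```c
#include <assert.h>
#include <stdint.h>

/* Clause 22 basic status register and the two bits used here. */
#define PHY_REG_BMSR        1u
#define BMSR_LINK_STATUS    0x0004u
#define BMSR_ANEG_COMPLETE  0x0020u

/* Hypothetical MDIO read accessor supplied by the Ethernet driver. */
struct mdio_ops {
    uint16_t (*read)(void *ctx, unsigned phy_addr, unsigned reg);
    void *ctx;
};

enum phy_poll_result { PHY_NO_LINK, PHY_NEGOTIATING, PHY_LINK_READY };

/* One cheap poll: a single BMSR read decides whether any further PHY
 * handling is worthwhile. */
enum phy_poll_result phy_poll(const struct mdio_ops *mdio, unsigned phy_addr)
{
    assert(mdio && mdio->read);
    uint16_t bmsr = mdio->read(mdio->ctx, phy_addr, PHY_REG_BMSR);

    if ((bmsr & BMSR_LINK_STATUS) == 0)
        return PHY_NO_LINK;       /* no link: stop here, no more MDIO traffic */
    if ((bmsr & BMSR_ANEG_COMPLETE) == 0)
        return PHY_NEGOTIATING;   /* link seen, negotiation not finished yet */
    return PHY_LINK_READY;        /* safe to read speed/duplex and open the port */
}

/* Fake PHY for host-side experiments: returns a preset BMSR value and
 * counts how many MDIO reads were issued. */
struct fake_phy {
    uint16_t bmsr;
    unsigned reads;
};

uint16_t fake_mdio_read(void *ctx, unsigned phy_addr, unsigned reg)
{
    struct fake_phy *phy = ctx;
    (void)phy_addr;
    phy->reads++;
    return (reg == PHY_REG_BMSR) ? phy->bmsr : 0;
}
```

The cyclic PHY task would call phy_poll() for the unconnected port and only
proceed to speed/duplex handling once PHY_LINK_READY is returned.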

>> Have you tried to debug what's going on in the processor after it stops?
Yes, I did. The integrated DMA (CPDMA) stops sending without reporting any
error. The DMA state is idle, but it does not respond to new descriptors. I
created a thread in the TI forum:
https://e2e.ti.com/support/arm/sitara_arm/f/791/p/663155/2442288
The fact that this behavior is unknown there suggests that my driver is doing
something stupid.
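
Because the DMA reports no error when this happens, one way to catch the
condition early is a small software watchdog that compares how many Tx
descriptors were handed to the DMA with how many completions came back. The
sketch below is only an illustration of that idea. The two counters are
assumed to be free-running values maintained by the driver, and the structure
and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct tx_watchdog {
    uint32_t last_completed;  /* completion count seen at the previous check */
    uint32_t idle_checks;     /* consecutive checks without progress */
    uint32_t limit;           /* checks without progress that count as a stall */
};

void tx_watchdog_init(struct tx_watchdog *wd, uint32_t limit)
{
    assert(wd && limit > 0);
    wd->last_completed = 0;
    wd->idle_checks = 0;
    wd->limit = limit;
}

/* Call periodically. Returns true once descriptors have been pending for
 * 'limit' consecutive checks while the completion counter did not move. */
bool tx_watchdog_check(struct tx_watchdog *wd,
                       uint32_t submitted, uint32_t completed)
{
    bool pending = (submitted != completed);
    bool progressed = (completed != wd->last_completed);

    wd->last_completed = completed;
    if (!pending || progressed) {
        wd->idle_checks = 0;      /* nothing queued, or the DMA is working */
        return false;
    }
    wd->idle_checks++;
    return wd->idle_checks >= wd->limit;
}
```

When the check fires, the driver can dump the DMA registers and the descriptor
chain at that moment instead of minutes later.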

Best Regards,
Stephan

-Original Message-
From: lwip-users [mailto:lwip-users-bounces+hilchenbach=ish-gmbh@nongnu.org]
On behalf of Simon Goldschmidt
Sent: Friday, 16 February 2018 08:55
To: lwip-users@nongnu.org
Subject: Re: [lwip-users] Transmission stall after ARP broadcast

Stephan Hilchenbach wrote:
>>> 1.4.1 is rather old. There have been numerous bugs fixed since then.
> Yes I know. I would prefer to update, but I can't make this decision on my 
> own.

Then talk to whoever is in a position to decide. From the pcap, I can't tell 
what's wrong.
It does not seem like an lwIP issue, but maybe it is.

If I were you, I'd first check with a newer version of lwIP before digging into 
the hardware drivers. Especially with that processor ;-)

>>> No. In general, lwIP has *nothing* to do with your hardware. The netif 
>>> driver is responsible for that.
> I expected this answer, but was not sure about this. It was very curious that 
> the port always stops with transmission of an ARP broadcast.
> The device is configured as a switch with 2 ports. One port is connected, the 
> other one is not. A task cyclically checks the second port for phy activity.

So is there *any* communication from the device after this? Maybe on the 2nd 
port?
I don't know how the software on that processor is adapted to lwIP but strictly 
speaking, this is not a switch but 2 MACs connected via the PRUs. They might 
have bugs, too :-)

Have you tried to debug what's going on in the processor after it stops?

Simon



Re: [lwip-users] Transmission stall after ARP broadcast

2018-02-15 Thread Stephan Hilchenbach
Hello Simon,

thank you for the fast response.

>> 1.4.1 is rather old. There have been numerous bugs fixed since then.
Yes I know. I would prefer to update, but I can't make this decision on my own.

>> A screenshot? WTF? If this is a screenshot of a wireshark log, please attach 
>> a pcap file instead.
Of course I have the log files. I was not sure whether anyone would look at
them. In the TI forum I get less attention. I have attached two files; I cut
about 100,000 frames to keep them small.
The device running lwIP has the IP 192.168.1.211, and my connected notebook
has the IP 192.168.1.31.

>> No. In general, lwIP has *nothing* to do with your hardware. The netif 
>> driver is responsible for that.
I expected this answer, but was not sure about this. It was very curious that 
the port always stops with transmission of an ARP broadcast.
The device is configured as a switch with 2 ports. One port is connected, the 
other one is not. A task cyclically checks the second port for phy activity.


Best Regards,
Stephan



From: lwip-users [mailto:lwip-users-bounces+hilchenbach=ish-gmbh@nongnu.org]
On behalf of goldsi...@gmx.de
Sent: Thursday, 15 February 2018 17:11
To: Mailing list for lwIP users
Subject: Re: [lwip-users] Transmission stall after ARP broadcast

On 15.02.2018 16:05, Stephan Hilchenbach wrote:
> Hello Experts,
>
> I have a problem with my Ethernet driver connecting a TI AM335x CPSW switch
> to the lwIP stack v1.4.1.

1.4.1 is rather old. There have been numerous bugs fixed since then.

> The port stops transmitting after some minutes or hours. The DMA hardware
> register shows no errors, but transmission has stalled. The DMA does not
> process any further packets, although the content of the next packets looks
> OK.
> When I run Wireshark I always observe the same sequence. Every time, the
> last packet sent before the port stops transmitting is an ARP broadcast for
> the connected host: "TexasIns_e4:2a:20 Broadcast ARP Who has 192.168.1.31?
> Tell 192.168.1.211". Curiously, this is the only time lwIP generates this
> request after the connection has been established. There are no other ARP
> broadcasts until the Tx stall. Attached is a screenshot.

A screenshot? WTF? If this is a screenshot of a wireshark log, please attach a
pcap file instead.

> I have two questions about this:
> 1. Why does lwIP generate this ARP broadcast during transmission?

Don't know. Attach a pcap including an explanation of which IPs we see, which
device has which address and what they do. I'm too lazy to try to find that out
myself. And I could be wrong.

> 2. Can lwIP cause a hardware Tx port to stall (because of the packet
> content)?

No. In general, lwIP has *nothing* to do with your hardware. The netif driver
is responsible for that.

Simon


2018-02-14_tx_fehler_4_cut.pcapng
Description: 2018-02-14_tx_fehler_4_cut.pcapng


2018-02-14_tx_fehler_5_cut.pcapng
Description: 2018-02-14_tx_fehler_5_cut.pcapng

Re: [lwip-users] Transmission stall after ARP broadcast

2018-02-15 Thread goldsi...@gmx.de

On 15.02.2018 16:05, Stephan Hilchenbach wrote:
> Hello Experts,
>
> I have a problem with my Ethernet driver connecting a TI AM335x CPSW switch
> to the lwIP stack v1.4.1.

1.4.1 is rather old. There have been numerous bugs fixed since then.

> The port stops transmitting after some minutes or hours. The DMA hardware
> register shows no errors, but transmission has stalled. The DMA does not
> process any further packets, although the content of the next packets looks
> OK.
>
> When I run Wireshark I always observe the same sequence. Every time, the
> last packet sent before the port stops transmitting is an ARP broadcast for
> the connected host: "TexasIns_e4:2a:20 Broadcast ARP Who has 192.168.1.31?
> Tell 192.168.1.211". Curiously, this is the only time lwIP generates this
> request after the connection has been established. There are no other ARP
> broadcasts until the Tx stall. Attached is a screenshot.

A screenshot? WTF? If this is a screenshot of a wireshark log, please
attach a pcap file instead.

> I have two questions about this:
>
> 1. Why does lwIP generate this ARP broadcast during transmission?

Don't know. Attach a pcap including an explanation of which IPs we see,
which device has which address and what they do. I'm too lazy to try to
find that out myself. And I could be wrong.

> 2. Can lwIP cause a hardware Tx port to stall (because of the packet
> content)?

No. In general, lwIP has *nothing* to do with your hardware. The netif
driver is responsible for that.

Simon