Re: [lwip-users] Poor RX performance, misconfigured lwipopts?

2019-03-08 Thread josephjah
Sergio,

Thank you for taking a look and thank you for the suggestions. After
considering your idea about testing with just UDP I decided to try with ICMP
packet loss measurements and found out that indeed it was my driver that was
dropping frames on occasion. Ugh.

Fixing my driver-side code has resolved all of the issues previously
mentioned.

 - Joseph




___
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users


Re: [lwip-users] TCP state machine problem? LWIP 1.4.1

2019-03-08 Thread Simon Goldschmidt

On 08.03.2019 15:47, Sergio R. Caprile wrote:

> mmm... the ACK number..., I think I've seen this one or two years ago,
> search the list and/or the patches for "one less" or something like that.
> I'm not fresh on this, but I think that is the problem, the ACK to the
> RST has the wrong number and causes a retransmission. I can't remember
> if this is also related to the half-closed connection; you might check
> on that too.


I know this might not be an option, but 1.4.1 is *really* old and this 
one as well as numerous other things might already be fixed in one of 
the newer versions.


Regards,
Simon



Re: [lwip-users] TCP state machine problem? LWIP 1.4.1

2019-03-08 Thread Sergio R. Caprile
mmm... the ACK number..., I think I've seen this one or two years ago,
search the list and/or the patches for "one less" or something like that.
I'm not fresh on this, but I think that is the problem, the ACK to the
RST has the wrong number and causes a retransmission. I can't remember
if this is also related to the half-closed connection; you might check
on that too.



[lwip-users] TCP state machine problem? LWIP 1.4.1

2019-03-08 Thread Fabian Koch
Hey all,

we had some weird behavior with a TCP connection on LWIP 1.4.1 when the peer 
(non-LWIP) has a cable disconnect:


  *   LWIP has an established TCP connection #1 running fine
  *   Peer has a cable disconnect
  *   Our application on top of LWIP runs into a receive timeout (500ms) and
closes the socket
  *   Peer reconnects cable
  *   Our application opens a new connection #2 which again is established and 
running fine
  *   The FIN/ACK + PSH/ACK retransmissions of connection #1 also reach the
peer, which answers with RST/ACK
  *   This keeps on looping until we restart the whole machine running LWIP

Also, I have a sort of "netstat" implemented on top of the LWIP socket API 
which runs over all possible sockets we have and, if it finds a valid conn 
pointer there, prints info (local addr, remote addr, port, TCP state and such). 
And connection #1 does not show up anymore in this view!

In my mind, the TCP state machine should be in FIN_WAIT_1 while the peer cable 
is disconnected?
And it should just jump to either CLOSED or TIME_WAIT when receiving the RSTs 
upon cable reconnect?

I attached a clipped pcap with only connection #1 shown and the problem 
starting at packet #19. Imagine the final exchange going on forever to
understand the problem ;o)

Any comments or debugging ideas appreciated.

Kind regards
Fabian



Attachment: LWIP_1_4_1_TCP_state.pcapng

Re: [lwip-users] Poor RX performance, misconfigured lwipopts?

2019-03-08 Thread Sergio R. Caprile
Your msg is too long for me, I'm too lazy to read it and too dumb to
keep focus at the same time.
Your capture file is long too, but fortunately retransmissions happen
right at the beginning.

I see you are ACKing 100ms later, several frames later.
I see (at least once) that you ACK a frame and milliseconds later you ACK
again, and even several times in a row (frame #177 and starting at #182).
That looks (to me) like a time base problem, check your sys_now() and
your port. I'm more of the bare metal type so I can't tell you much more
on how to set up an OS port. I've seen the unix port long ago and used it
as bare metal, don't know how it provides timing info to lwIP (nor how it
handles sockets, btw).
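
For reference, a minimal sketch of what a bare-metal time base for lwIP could
look like. lwIP calls sys_now() and expects a millisecond count that keeps
advancing. The ISR name timer_1ms_isr and the 1 ms tick are assumptions for
illustration, not taken from the poster's port, and uint32_t stands in for
lwIP's u32_t so the sketch compiles without lwIP headers.

```c
#include <stdint.h>

/* Millisecond counter. volatile because it is written in interrupt
   context and read from the main loop. */
static volatile uint32_t ms_ticks;

/* Hypothetical ISR: hook this to whatever periodic 1 ms timer the
   platform provides. */
void timer_1ms_isr(void)
{
    ms_ticks++;
}

/* lwIP calls sys_now() to get the current time in milliseconds.
   lwIP declares the return type as u32_t; uint32_t is used here
   only to keep the sketch free of lwIP headers. */
uint32_t sys_now(void)
{
    return ms_ticks;
}
```

If the tick is not really 1 ms, or the counter stalls while interrupts are
masked, lwIP's TCP timers run at the wrong rate, which would be one possible
explanation for delayed or repeated ACKs.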

And... 2814 bytes per frame ? Jumbo frames ? Can you try with more
common MTUs over the Internet ? Just in case.

Try to run some perf test over UDP, this will move the timers out of the
scenario and you can check for possible frame loss. UDP datagrams should
be numbered, though.
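
A rough sketch of how numbered datagrams can be checked on the receiving
side. It assumes the sender writes a 32-bit big-endian counter into the first
four bytes of every datagram; that layout, the struct and the function name
are made up for illustration and are not part of lwIP.

```c
#include <stddef.h>
#include <stdint.h>

/* Running totals for one UDP test stream. Zero-initialize before use. */
struct loss_stats {
    uint32_t next_seq;  /* sequence number expected next */
    uint32_t received;  /* datagrams seen */
    uint32_t lost;      /* datagrams skipped over */
    int started;        /* 0 until the first datagram arrives */
};

/* Feed the payload of each received datagram.
   Returns 0 on success, -1 if the payload is too short to hold a counter. */
int loss_stats_update(struct loss_stats *s, const uint8_t *payload, size_t len)
{
    uint32_t seq, gap;

    if (len < 4) {
        return -1;
    }
    seq = ((uint32_t)payload[0] << 24) | ((uint32_t)payload[1] << 16) |
          ((uint32_t)payload[2] << 8)  |  (uint32_t)payload[3];

    if (!s->started) {
        s->started = 1;             /* first datagram sets the baseline */
    } else if (seq != s->next_seq) {
        gap = seq - s->next_seq;    /* unsigned math survives wraparound */
        if (gap >= 0x80000000u) {
            /* Late or duplicated datagram: count it, but do not move
               the expected sequence number backwards. A late datagram
               was already counted as lost; this sketch does not undo that. */
            s->received++;
            return 0;
        }
        s->lost += gap;             /* forward jump: 'gap' datagrams missing */
    }
    s->received++;
    s->next_seq = seq + 1;
    return 0;
}
```

Called from the UDP receive path for every datagram, a non-zero lost count at
the end of a run points at frames being dropped somewhere below the stack.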

In any case, violating threading rules causes lots of strange artifacts,
make sure you don't.



Re: [lwip-users] lwIP Connection Initiation Issue

2019-03-08 Thread Sergio R. Caprile
You forgot to include your device address. If it is 192.168.1.x it is
fine (unless it coincides with any other device in the network...)

The error callback is called when the connection request is rejected or
lwIP gets tired of waiting for an answer. Do you see the SYN in
wireshark or equivalent ? (I believe you say you don't)
The timeout mechanism requires a timing source, pings are answered on
the fly. In fact, most TCP behavior relies on timers. Do you provide
your time source ?
However, the SYN should be able to get out... If SYN is not coming out,
then you are preventing the driver from running, perhaps one of the
locks is not correct.
You seem to have a broken port.
Though you are calling RAW API functions, you have an OS underneath. Are
you running in NO_SYS=1 or NO_SYS=0 ?
All calls to RAW API functions must be on the same thread that calls the
lwIP core.

Make sure you read and understand this:
https://www.nongnu.org/lwip/2_1_x/group__lwip.html
https://www.nongnu.org/lwip/2_1_x/pitfalls.html

