I agree with you, Kieran, but the problem is that I don't know where to look.
I used the lwIP 1.3.2 port for AVR32 and changed almost nothing.
In one of my long (and boring) previous posts I described all the tasks that
use lwIP through the netconn API. I can repost it if you wish.
As for the rest: I have looked for timers and have seen that the lwIP core
contains a lot of them, which I suppose are correct. I say "I suppose" because
I don't really know how to investigate.
Can you please suggest where to look?
Test 1:
I have connected the device to my computer with a crossover Ethernet cable, so
that there is no wireless link, switch, etc. in between.
The situation is pretty much the same, except that the lock is harder to
trigger. Here everything locks only after a lot of F5 reloads, while in the
normal setup I need only 5-10 fast reloads.
This could suggest that the heavy traffic handled by lwIP itself interferes
with normal operation. I don't know if it is really a timer; it is probably
something related to the MAC itself but, as you said, at interrupt level. But I
don't know where.
What I have seen in this test is that the key is really TCP_SEG: as long as
at least one block is free, communication is possible even if the lfree RAM
pointer is not at the top of the area; otherwise there is the lock.
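
To make the TCP_SEG observation concrete, here is a toy model of a fixed-size
block pool. It is not lwIP's memp code: the names seg_pool, seg_alloc,
seg_free and simulate_requests and the pool size of 4 are hypothetical, chosen
only for illustration. It shows why a block that is never returned for each
lost connection would eventually stop all traffic: requests are served as long
as one block is free, and fail once the leaked blocks have used up the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEG_POOL_SIZE 4  /* hypothetical; stands in for the TCP_SEG pool size */

struct seg_pool {
    bool in_use[SEG_POOL_SIZE];
    unsigned used;  /* blocks currently allocated */
    unsigned err;   /* allocation attempts that found the pool empty */
};

/* Returns the index of a free block, or -1 when the pool is exhausted. */
static int seg_alloc(struct seg_pool *p)
{
    for (int i = 0; i < SEG_POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            p->used++;
            return i;
        }
    }
    p->err++;
    return -1;
}

/* Gives a block back to the pool; freeing a free block is ignored. */
static void seg_free(struct seg_pool *p, int idx)
{
    assert(idx >= 0 && idx < SEG_POOL_SIZE);
    if (p->in_use[idx]) {
        p->in_use[idx] = false;
        p->used--;
    }
}

/* Simulates n requests that each need one block. With leak == true the
 * block is never returned. Returns how many requests could be served. */
static unsigned simulate_requests(struct seg_pool *p, unsigned n, bool leak)
{
    unsigned served = 0;
    for (unsigned i = 0; i < n; i++) {
        int idx = seg_alloc(p);
        if (idx < 0) {
            continue;  /* pool empty: this request gets no answer */
        }
        served++;
        if (!leak) {
            seg_free(p, idx);
        }
    }
    return served;
}
```

In this model a pool that leaks never recovers on its own, which would match a
lock that only a restart clears.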
About SYS_TIMEOUT: every time I request a page (or at least open a connection)
a timeout is created. I have configured 6 SYS_TIMEOUT entries. If I reload the
page 5 times and wait, no error occurs. With 6 or more reloads, the error
counter increases. This seems to have no relationship with TCP_SEG. Anyway,
even after a lot of errors, lwIP continues to function. So, let's forget it
for the moment.
Test 2:
I have put a relay toggle in the web server task:
WebServer task
...
for (;;)
{
    iRestartBinding = 0;
    pxHTTPListener = netconn_new( NETCONN_TCP );
    netconn_bind( pxHTTPListener, NULL, webHTTP_PORT );
    netconn_listen( pxHTTPListener );
    int iTimeout = 1000;
    // for( ; (iRestartBinding < 10) && (gucRestartWebServer == FALSE); iRestartBinding++ )
    for( ; ; ) // <<-- for this test only; the real code uses the line above
    {
        REL_TGL; // <<-- for this test only
        xLastFocusTime = xTaskGetTickCount();
        vTaskDelayUntil( &xLastFocusTime, xDelayLength );
        if (iGlobalWtdBomb == FALSE) // TRUE: I am waiting for a WDT suicide
        {
            // Wait for a first connection.
#if LWIP_SO_RCVTIMEO
            pxHTTPListener->recv_timeout = iTimeout;
#endif
            pxNewConnection = netconn_accept( pxHTTPListener );
            if (xTaskCreate( WebServerAnswerTask,
                             ( signed portCHAR * ) "WebServerAnswer",
                             WEB_SERVER_STACK_SIZE,
                             pxNewConnection,
                             ethWEBSERVER_PRIORITY,
                             ( xTaskHandle * ) NULL ) != pdPASS)
            {
                // Task not correctly created!!!
                netconn_write( pxNewConnection,
                               (char *) webHTTP_HTM_INTERNAL_ERROR,
                               (u16_t) strlen( webHTTP_HTM_INTERNAL_ERROR ),
                               NETCONN_COPY ); // error HTTP 500
                netconn_close( pxNewConnection );
                netconn_delete( pxNewConnection );
            }
            iRestartBinding = 0;
            iTimeout = 5000;
        }
    } // end acquisition loop
    gucRestartWebServer = FALSE;
    netconn_close( pxHTTPListener );
    while (netconn_delete( pxHTTPListener ) != 0)
    {
        vTaskDelay( 20 );
    }
    pxHTTPListener = NULL;
}
...
Result...
When I reload the page slowly, everything is OK almost forever.
When I reload the page faster, I see that both Firefox and Explorer open the
TCP connection, send the GET request and immediately afterwards send [RST, ACK]
to close the connection, except for the last one, which waits for the device's
answer. I suppose that, because the browser hasn't received any answer and the
user has requested a reload, it wants only the last request to be processed.
At every netconn_accept (timeout or not) I can hear the relay toggle. If I
press F5 5 times I hear 5 toggles. That's what I expect.
Sometimes one toggle is missing (5 presses of F5, 4 toggles!). Exactly in this
case, I lose a TCP_SEG block and a portion of the mem area.
One lost toggle also means that netconn_accept never reports the connection,
so from the web server task's point of view I cannot see the problem.
Again, this happens if there are lots of requests of the form (connection,
GET, [RST,ACK] from browser, close connection) before one of the form
(connection, GET, answer, [RST,ACK], close connection).
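
Listening for relay toggles mixes real connections with accept timeouts, so a
pair of counters may be easier to compare against a capture. The sketch below
is a hypothetical helper, not part of lwIP or of my web server: accept_stats,
record_accept and missed_connections are invented names. It assumes that, with
LWIP_SO_RCVTIMEO enabled, netconn_accept returns NULL when the timeout expires,
which is also a reason to check the returned pointer before using it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters, updated once per netconn_accept() return. */
struct accept_stats {
    unsigned long accepted;   /* accept returned a connection */
    unsigned long timed_out;  /* accept returned NULL (assumed: timeout) */
};

/* Call with the pointer returned by netconn_accept(). */
static void record_accept(struct accept_stats *s, const void *conn)
{
    assert(s != NULL);
    if (conn != NULL) {
        s->accepted++;
    } else {
        s->timed_out++;
    }
}

/* syn_seen: completed handshakes counted by hand in the capture.
 * A positive result means connections the server task never saw. */
static long missed_connections(const struct accept_stats *s,
                               unsigned long syn_seen)
{
    assert(s != NULL);
    return (long)syn_seen - (long)s->accepted;
}
```

With 5 presses of F5 and 4 accepted connections, missed_connections would
report 1, which is the case where a TCP_SEG block goes missing.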
Sometimes I have seen this exchange in the middle of a reload:
(Firefox) Connection [SYN]
(device) Connection [SYN, ACK]
(Firefox) Connection [ACK]
(Firefox) GET request
(Firefox) [TCP Retransmission] of the GET request
(device) [ACK] of the HTTP GET
(Firefox) [RST, ACK] without any answer from the device
It seems that this is one case where a TCP_SEG is lost. It is not easy to say,
because I don't know exactly when the loss happens or how to relate it to the
Wireshark capture.
It also seems that the loss often (but not always) happens when a [TCP
Retransmission] is present.
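
To relate the capture to the losses, each connection could be reduced to a
short list of events and checked for the pattern above. This is only a sketch
under my own assumptions: the event labels and the functions summarize and
is_suspect are hypothetical, and the events would have to be typed in by hand
from the Wireshark output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event labels for one TCP connection, read off a capture. */
enum pkt {
    CLI_SYN, DEV_SYN_ACK, CLI_ACK, CLI_GET, CLI_GET_RETX,
    DEV_ACK, DEV_DATA, CLI_RST
};

struct conn_summary {
    bool retransmitted;  /* the browser repeated its GET */
    bool answered;       /* the device sent payload before the reset */
    bool reset;          /* the browser ended with [RST, ACK] */
};

/* Walks the events of one connection up to the first reset. */
static struct conn_summary summarize(const enum pkt *trace, size_t n)
{
    struct conn_summary s = { false, false, false };
    assert(trace != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (trace[i] == CLI_RST) {
            s.reset = true;
            break;
        }
        if (trace[i] == CLI_GET_RETX) {
            s.retransmitted = true;
        } else if (trace[i] == DEV_DATA) {
            s.answered = true;
        }
    }
    return s;
}

/* A connection is suspect when it was reset before the device answered.
 * Retransmission is reported separately, since it is not always present. */
static bool is_suspect(const struct conn_summary *s)
{
    return s->reset && !s->answered;
}
```

Counting the suspect connections, and how many of them also show a
retransmission, would give numbers to compare with the lost TCP_SEG blocks.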
Anyway, it seems there is something in the internal handling of [RST, ACK], of
the retransmission or of something similar, and it is probably not related to
the code I have written.
How can I handle this? Where should I look? I have no idea at the moment.
My starting assumption was that the lwIP port is correct, but at this point I
am not so sure. I still hope that the mistake is in my own code but, as I have
said, I have no idea where to look.
I hope my new analysis can help.
Best regards
Davide
_______________________________________________
lwip-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/lwip-users