Sounds like a priority inversion problem that could be solved by a suitable set of mutexes…?
Sent from my iPhone

> On 26 March 2017 at 10:40, Noam Weissman <n...@silrd.com> wrote:
>
> Hi Simon, Jan and everyone else who was following this,
>
> I think I have found the problem I am facing. This is not an LwIP or
> mbedTLS error.
>
> It is a system design issue on my part. The problem is simple and I am sure
> many other developers have faced it or will face it.
>
> The problem is a classic task starvation issue. Let me explain…
>
> I have a task that has priority 6 and the LwIP task is running at priority
> 10 (higher). FreeRTOS is in preemptive mode.
>
> When my module (priority 6) calls the mbedTLS mbedtls_ssl_read function, like
> this:
>     Ret = mbedtls_ssl_read(&ssl, pData, ReadData);
>
> it may return a value that is less than or equal to ReadData.
>
> The problem is that, in contrast to a simple read call, the SSL library will
> first read the entire SSL record, which can be as much as 16K. Only after
> reading the entire record will it decipher it and return to the calling code
> with the ReadData length that was requested from it.
>
> Collecting the data and deciphering is a long process that is done in a loop.
> No OS functions are used and as a result the TCP stack and other tasks are
> not running, they are starved!!!
>
> Instead of configuring the code to use the mbedtls_net_recv function, I
> changed it to use the mbedtls_net_recv_timeout function.
>
> I have modified the mbedTLS net_socket.c port and added vTaskDelay calls
> inside the code for the function mbedtls_net_recv_timeout.
>
> Original code:
> int mbedtls_net_recv_timeout( void *ctx, unsigned char *buf, size_t len,
>                               uint32_t timeout )
> {
>     return mbedtls_net_recv( ctx, buf, len );
> }
>
>
> Modified code:
> int mbedtls_net_recv_timeout( void *ctx, unsigned char *buf, size_t len,
>                               uint32_t timeout )
> {
>     int RetVal, TimeOutInc, AdvancedTimeOut;
>
>     TimeOutInc = 10;
>     AdvancedTimeOut = 0;
>
>     // add a small delay before trying to read data
>     vTaskDelay(5 / portTICK_RATE_MS);
>
>     do
>     {
>         // try to read data from connection
>         RetVal = mbedtls_net_recv( ctx, buf, len );
>
>         // if function returns with an error, put a small delay
>         // and try again...
>         if(RetVal < 0)
>         {
>             AdvancedTimeOut += TimeOutInc;
>             vTaskDelay(TimeOutInc / portTICK_RATE_MS);
>         }
>         else
>         {
>             break;
>         }
>
>     } while(AdvancedTimeOut < timeout);
>
>     return RetVal;
> }
> -------------------------------------------------------------------------------------
>
> Preliminary testing shows that the above code works. By calling vTaskDelay,
> the code triggers the OS scheduler and actually gives time to LwIP's own
> task.
>
> Thanks to everyone for answering and giving ideas :-)
>
> Great work LwIP team.
>
> BR,
> Noam.
>
> From: lwip-users [mailto:lwip-users-bounces+noam=silrd....@nongnu.org] On
> Behalf Of Noam Weissman
> Sent: Thursday, March 16, 2017 5:58 PM
> To: Mailing list for lwIP users
> Subject: Re: [lwip-users] PolarSSL and mbedTLS
>
> Simon,
>
> I am not saying that LwIP has bugs because I am not sure… It feels like that …
>
> I asked you why, when I set debug on and all the printouts cause lots of
> delays, the file is transferred?
>
> I have set LwIP:
>
> #define TCP_MSS             536
> #define MEMP_NUM_PBUF       100
> #define MEM_SIZE            (25 * 1024)
> #define MEMP_NUM_TCP_PCB    20
> #define TCP_SND_BUF         (2 * TCP_MSS)
> #define TCP_WND             (4 * TCP_MSS)
> #define MEMP_NUM_TCP_SEG    TCP_SND_QUEUELEN
>
> As far as I see LwIP has sufficient RAM to read the data.
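
One thing about the modified mbedtls_net_recv_timeout quoted above: it retries on every negative return value, so a hard error (connection reset, for instance) is retried until the timeout runs out. A minimal sketch of a variant that only retries on "would block" could look like the following. It assumes the port maps EWOULDBLOCK to MBEDTLS_ERR_SSL_WANT_READ, as in the net.c excerpt Jan posted. All sketch_* names are hypothetical and only for illustration: the two error macros are stand-ins for the mbedTLS ones, recv_fn stands in for mbedtls_net_recv and delay_fn for a vTaskDelay wrapper, so the snippet builds on a PC without FreeRTOS or mbedTLS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for MBEDTLS_ERR_SSL_WANT_READ and MBEDTLS_ERR_SSL_TIMEOUT.
 * A real port would include mbedtls/ssl.h and use those macros instead. */
#define SKETCH_ERR_WANT_READ (-0x6900)
#define SKETCH_ERR_TIMEOUT   (-0x6800)

/* Hypothetical hooks: recv_fn plays the role of mbedtls_net_recv,
 * delay_fn the role of vTaskDelay(ms / portTICK_RATE_MS). */
typedef int  (*sketch_recv_fn)(void *ctx, unsigned char *buf, size_t len);
typedef void (*sketch_delay_fn)(uint32_t ms);

/* Retry only while the socket reports "would block". Data, EOF (0) and
 * hard errors are returned to the caller immediately. */
int sketch_recv_with_timeout(void *ctx, unsigned char *buf, size_t len,
                             uint32_t timeout_ms, uint32_t step_ms,
                             sketch_recv_fn recv_fn, sketch_delay_fn delay_fn)
{
    uint32_t waited_ms = 0;

    assert(recv_fn != NULL && delay_fn != NULL && step_ms > 0);

    for (;;) {
        int ret = recv_fn(ctx, buf, len);
        if (ret != SKETCH_ERR_WANT_READ)
            return ret;                      /* data, EOF or hard error */

        if (waited_ms >= timeout_ms)
            return SKETCH_ERR_TIMEOUT;       /* waited long enough */

        /* Sleep for one step, or for whatever is left of the timeout.
         * Blocking here lets the scheduler run the other tasks. */
        uint32_t remaining = timeout_ms - waited_ms;
        uint32_t wait = remaining < step_ms ? remaining : step_ms;
        delay_fn(wait);
        waited_ms += wait;
    }
}

/* Test doubles so the sketch can be exercised off-target. */
struct sketch_fake_source {
    int want_read_count;  /* how many times to report "would block" */
    int final_ret;        /* value returned after that */
};

uint32_t sketch_total_delay_ms;

int sketch_fake_recv(void *ctx, unsigned char *buf, size_t len)
{
    struct sketch_fake_source *src = ctx;
    (void)buf;
    (void)len;
    if (src->want_read_count > 0) {
        src->want_read_count--;
        return SKETCH_ERR_WANT_READ;
    }
    return src->final_ret;
}

void sketch_fake_delay(uint32_t ms)
{
    sketch_total_delay_ms += ms;
}
```

On the target, delay_fn would simply wrap vTaskDelay, and the function body would move into mbedtls_net_recv_timeout with the timeout argument passed through. Whether a 10 ms step is a good choice depends on the tick rate, so treat that value as a guess.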
>
> I added a 100 ms delay after every 4K read
>
> I changed the original code to read with a timeout of 1000 ms
>
> But with all the changes I made I am still able to get just 8 x 1K blocks
> from calling the mbedtls_ssl_read function
>
> For some reason I see TCP ZeroWindow in WireShark, attached here
>
>
> BR,
> Noam
>
> From: lwip-users [mailto:lwip-users-bounces+noam=silrd....@nongnu.org] On
> Behalf Of goldsimon
> Sent: Thursday, March 16, 2017 5:37 PM
> To: Mailing list for lwIP users
> Subject: Re: [lwip-users] PolarSSL and mbedTLS
>
> From all information given so far, I fail to see how this would be an lwip
> problem.
>
> Did you test your SSL application on a different platform and it worked or
> what makes you think of an lwip problem instead of an application problem?
>
> Don't get me wrong, lwip can have bugs. I just don't see that here and by
> now, an application problem seems much more likely to me ;-)
>
> Simon
>
> On 16 March 2017 at 13:54:15 CET, Noam Weissman <n...@silrd.com> wrote:
> Hi Jan,
>
> No, the error I am seeing is MBEDTLS_ERR_NET_RECV_FAILED
>
> Actually I found something interesting in my code.
>
> Normally when you call read (fd, buf, len) the underlying TCP will fetch the
> amount you need.
>
> With mbedtls_ssl_read it is a bit more complicated. It internally collects
> a record into its own buffer before it returns to the caller with the
> requested block of data. If you read less than the internal SSL buffer size
> you may have more data to read from the internal buffer but NOT from the
> socket!!
>
> Because in my code I called select after every mbedtls_ssl_read, it would
> have failed on the last fragment even though the SSL internal buffer still
> had some data. I added code to check that ssl.in_msglen == 0 before I call
> select again. This solved one problem but NOT the overall reading problem.
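
The select issue described just above can be modelled without any TLS code at all. The sketch below is only a toy model under stated assumptions: struct sketch_tls and the sketch_* functions are hypothetical and stand in for the SSL context holding one already-decrypted record. In real mbedTLS code the public call mbedtls_ssl_get_bytes_avail(&ssl) answers the same question as peeking at ssl.in_msglen, and is probably the safer thing to rely on. The rule it illustrates: go back to select only when nothing is buffered inside the SSL layer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an SSL layer that has already pulled one complete
 * record off the socket and decrypted it into its own buffer. */
struct sketch_tls {
    unsigned char record[4096];
    size_t len;   /* bytes in the decrypted record */
    size_t pos;   /* bytes already handed to the application */
};

/* Plays the role of mbedtls_ssl_get_bytes_avail(). */
size_t sketch_bytes_avail(const struct sketch_tls *t)
{
    return t->len - t->pos;
}

/* Plays the role of mbedtls_ssl_read() when data is already buffered:
 * returns at most 'want' bytes and never touches the socket. */
size_t sketch_tls_read(struct sketch_tls *t, unsigned char *out, size_t want)
{
    size_t n = sketch_bytes_avail(t);
    if (n > want)
        n = want;
    memcpy(out, t->record + t->pos, n);
    t->pos += n;
    return n;
}

/* select() on the socket is only meaningful when the SSL layer is empty.
 * Otherwise the socket may be idle while data is still waiting here. */
int sketch_must_wait_on_socket(const struct sketch_tls *t)
{
    return sketch_bytes_avail(t) == 0;
}

/* Drain everything that is buffered, chunk by chunk, without waiting on
 * the socket in between. Returns the total number of bytes consumed and
 * stores the number of read calls in *reads. */
size_t sketch_drain(struct sketch_tls *t, unsigned char *chunk,
                    size_t chunk_len, size_t *reads)
{
    size_t total = 0;

    assert(chunk_len > 0 && reads != NULL);
    *reads = 0;

    while (!sketch_must_wait_on_socket(t)) {
        total += sketch_tls_read(t, chunk, chunk_len);
        (*reads)++;
    }
    return total;
}
```

With a 4K record and 1K application reads, as in the log described in this thread, the model does four reads per record and only then reports that the socket should be waited on again.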
>
> I also added large delays in the code, so now I am able to read 8 x 1K
> chunks before I again get MBEDTLS_ERR_NET_RECV_FAILED
>
> This is a combined problem... misunderstanding how SSL works and probably
> something related to the LwIP layer.
>
> If I print LwIP debug messages I have no problems reading the file. ...
> delays ???
>
> I also changed the call to mbedtls_ssl_set_bio to use
> mbedtls_net_recv_timeout instead of the mbedtls_net_recv function. With this
> change I am able to read the first SSL record without problems
>
> Thanks for all the help so far :-)
>
>
> BR,
> Noam.
>
> -----Original Message-----
> From: lwip-users [mailto:lwip-users-bounces+noam=silrd....@nongnu.org] On
> Behalf Of Jan Menzel
> Sent: Wednesday, March 15, 2017 10:54 PM
> To: lwip-users@nongnu.org
> Subject: Re: [lwip-users] PolarSSL and mbedTLS
>
> Hi Noam!
> Did you follow the error code through mbedtls's net.c? In my code it's
> translated into "MBEDTLS_ERR_SSL_WANT_READ" as follows:
>
> int mbedtls_net_recv( void *ctx, unsigned char *buf, size_t len )
> [...]
>     ret = (int) read( fd, buf, len );
>
>     if( ret < 0 )
>     {
>         if( net_would_block( ctx ) != 0 )
>             return( MBEDTLS_ERR_SSL_WANT_READ );
> [...]
>
> with
>
> static int net_would_block( const mbedtls_net_context *ctx )
> [...]
>     switch( errno )
>     {
> #if defined EAGAIN
>         case EAGAIN:
> #endif
> #if defined EWOULDBLOCK && EWOULDBLOCK != EAGAIN
>         case EWOULDBLOCK:
> #endif
>             return( 1 );
>     }
>     return( 0 );
> }
>
> Jan
>
> On 15.03.2017 20:30, Noam Weissman wrote:
> Hi Simon,
>
> I have tried debugging my code and added:
> #define LWIP_DEBUG LWIP_DBG_ON
> #define SOCKETS_DEBUG LWIP_DBG_ON
>
> Strangely, with these switches on I am able to get a file of about 38K but
> it always fails at the last part.
>
> Without the debug prints it never even starts, it fails on the first read.
>
> I have attached my debug printout if that helps.
>
> The text is mixed with my own debug prints, sorry:
>
> File transfer starts at line 438 with: From WssHandleReadData:
> PayloadLen = 38032, DataLen = 1020
>
> The server sends chunks of 4K, my code reads 1K at a time from the ssl layer
> hence the 1024 chunks.
> You can see that PayloadLen reduces by the DataLen chunk ...
>
> The last part received is PayloadLen 1172 DataLen 1024 ... on line 1512
>
> It should read one 1024 block and then 148 bytes and finish... This
> never happens and it fails on the last read. This is consistent on every
> test I did.
>
> If I turn off the two debug switches the file transfer never starts. It
> actually fails on the first read: lwip_recvfrom returns -1 and calls
> set_errno(EWOULDBLOCK); on line 773 in sockets.c (lwip ver 2.02)
>
>
> Any ideas?
>
>
> Many thanks,
> Noam.
>
>
>
> -----Original Message-----
> From: lwip-users [mailto:lwip-users-bounces+noam=silrd....@nongnu.org]
> On Behalf Of Simon Goldschmidt
> Sent: Friday, March 10, 2017 10:36 AM
> To: lwip-users@nongnu.org
> Subject: Re: [lwip-users] PolarSSL and mbedTLS
>
> Noam Weissman wrote:
> > I get a read error inside lwip_recvfrom function.
> > [..]
> > If anyone has any ideas on what more to check or test please respond.
>
> 1: Get an idea of the error (if recvfrom returns -1, what's the
>    current errno?)
> 2: Get a debugger and try to find out why recvfrom returns an error. Without
>    that information, there's no way of knowing where the error is.
>
> Simon
>
> _______________________________________________
> lwip-users mailing list
> lwip-users@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/lwip-users
_______________________________________________
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users