Hello Trampas,

thanks for the hints. I initialized the system tick counter to 2^32 ticks minus 120 seconds' worth of ticks, and the MQTT client started receiving pbuf=NULL after roughly 120 seconds plus the 120-second keep-alive interval.
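The start value can be worked out as in the sketch below. It assumes a 32-bit tick counter and the 100 µs tick of this port (10000 ticks per second). TICKS_PER_SECOND and preload_ticks() are names made up for illustration, not ChibiOS or lwIP API, and how the counter is actually preloaded depends on the port and is not shown.

```c
#include <stdint.h>

#define TICKS_PER_SECOND 10000u  /* assumption: 100 us system tick */

/* Hypothetical helper: tick value that leaves `seconds_before_wrap`
 * seconds until a 32-bit tick counter wraps to 0.
 * Unsigned arithmetic is modulo 2^32, so 0 - n yields 2^32 - n. */
static uint32_t preload_ticks(uint32_t seconds_before_wrap)
{
    return (uint32_t)(0u - seconds_before_wrap * TICKS_PER_SECOND);
}
```

With these assumptions, preload_ticks(120) gives 4293767296 (0xFFEDB080), which is 1200000 ticks before the wrap.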
The ChibiOS sys_arch.c port implements sys_now() (current time in milliseconds) in the following simplified way, since the system ticks every 100 µs:

    return ((u32_t)chVTGetSystemTimeX() - 1) / 10 + 1;

I guess this might cause the problems: when the tick counter overflows, sys_now() wraps back to 0 and leaves the lwIP timers waiting for a value higher than (2^32)/10, which it never reaches. To support my guess, I turned on another debug option, and the last lwIP timer message I see is:

    sys_timeout: 2000C5DC abs_time=429497730 handler=ip_reass_tmr arg=805B28C

Adam

On Fri, May 28, 2021 at 13:45, Trampas Stern <[email protected]> wrote:

> Increase the counter to a uint64_t.
>
> You can also start the counter at something other than zero to prove root
> cause faster.
>
> Trampas
>
> On Fri, May 28, 2021 at 7:08 AM Adam Baron <[email protected]> wrote:
>
>> Hi Tomek :),
>>
>> I'll try to add it. Thanks.
>>
>> However, I feel like it is rather related to the problem of overflowing a
>> uint32 counter of some kind, since the TCP_PCBs are not freed after 2^32
>> ticks.
>>
>> Adam
>>
>> On Fri, May 28, 2021 at 9:44, Tomasz W <[email protected]> wrote:
>>
>>> Hi (Cześć)
>>> Look at this
>>> https://lists.nongnu.org/archive/html/lwip-devel/2020-12/msg00014.html
>>> In my case it solved the problem of the web server dying after a few days
>>>
>>>
>>> On Fri, May 28, 2021 at 08:58, Adam Baron <[email protected]> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > I have a small STM32F4 application running on the devel branch of
>>> > lwIP. It includes httpd, SNTP, an SMTP client, and an MQTT client.
>>> > Everything runs well until the fifth day, when the MQTT client starts
>>> > to receive pbuf=NULL and disconnects. My reconnect routine reconnects
>>> > it after a short time, but it receives pbuf=NULL again shortly after.
>>> >
>>> > Later on I also noticed this in the log: memp_malloc: out of memory
>>> > in pool TCP_PCB.
>>> > I have MEMP_NUM_TCP_PCB defined as 30, which seems enough for normal
>>> > operation. I also raised it to 50, but ended up with the same problem.
>>> > In the statistics, NUM_TCP_PCB increases and decreases as it should,
>>> > but once uptime passes 5 days it stays high with an error flag
>>> > triggered.
>>> >
>>> > Quite interestingly, it happens exactly after 2^32 milliseconds of
>>> > uptime. I tried to keep OpenOCD connected so I could peek in, but so
>>> > far I have not managed to keep OpenOCD running that long without the
>>> > connection dropping.
>>> >
>>> > Does anyone have any ideas, please?
>>> >
>>> > Thanks in advance,
>>> > --
>>> > 731435556
>>> > Adam Baron
>>>
>>> --
>>> Regards
>>> Tomek
>>
>> --
>> 731435556
>> Adam Baron

--
731435556
Adam Baron
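
For reference, a minimal sketch of one way the uint64_t suggestion quoted above could be applied to the sys_now() conversion. It assumes a 32-bit tick source running at 100 µs. TICKS_PER_MS, legacy_ms(), ms_clock_t, ms_clock_init() and ms_clock_update() are hypothetical names, not part of lwIP or ChibiOS. In a real port the tick argument would come from chVTGetSystemTimeX(), and the state would need locking if sys_now() can be called from more than one thread.

```c
#include <stdint.h>

#define TICKS_PER_MS 10u  /* assumption: 100 us system tick */

/* The conversion quoted at the top of this message. Its result wraps
 * together with the 32-bit tick counter, i.e. near 2^32 / 10 ms
 * instead of at 2^32 ms. */
static uint32_t legacy_ms(uint32_t ticks)
{
    return (uint32_t)(ticks - 1u) / TICKS_PER_MS + 1u;
}

/* Hypothetical wrap-safe alternative: accumulate tick deltas in 64 bits. */
typedef struct {
    uint32_t last_ticks;   /* tick value seen at the previous update */
    uint64_t total_ticks;  /* ticks accumulated since initialisation */
} ms_clock_t;

static void ms_clock_init(ms_clock_t *c, uint32_t now_ticks)
{
    c->last_ticks = now_ticks;
    c->total_ticks = 0u;
}

/* Must be called at least once per wrap period of the tick counter
 * (about 4.97 days at 100 us per tick). */
static uint32_t ms_clock_update(ms_clock_t *c, uint32_t now_ticks)
{
    /* Modulo-2^32 subtraction gives the correct delta across one wrap. */
    c->total_ticks += (uint32_t)(now_ticks - c->last_ticks);
    c->last_ticks = now_ticks;

    /* Truncating to 32 bits makes the millisecond value wrap at 2^32. */
    return (uint32_t)(c->total_ticks / TICKS_PER_MS);
}
```

The point of the second variant is that the millisecond value only wraps at the full 32-bit range, so the difference between two readings stays correct modulo 2^32. As far as I understand, that is what the lwIP timeout code relies on, but I have not verified this against the devel branch.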
_______________________________________________
lwip-users mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/lwip-users
