Hello Trampas,

yes, I can only agree with you. Still, I consider ChibiOs to be well designed and supported, which means I put a bit of respect and trust into the code and libraries I start to use. But of course I try to understand them first.
And a nice, well-thought-out article, thank you.

Adam

On Fri, 28 May 2021 at 22:24, Trampas Stern <tram...@gmail.com> wrote:

> As far as the ChibiOs time issues go, I have a simple rule:
>
> *On my embedded systems every line of code I put into the project
> becomes my problem!*
>
> That is, if I use LWIP and it has a bug, customers do not care whether
> it is in LWIP or not; it is my problem to fix. Hence every line of code
> becomes my problem. As such I try not to use code I do not understand.
> Often (LWIP as an example) you have to use libraries, but do so knowing
> that their problems become yours. Yes, LWIP has bitten me more than once
> where it did not work the way I thought it should/would. That was my
> fault and my problem to fix.
>
> I often go to extremes and will not use processor vendors' drivers
> until I have done a code review and understand them. I have been bitten
> more than once where a vendor's drivers are just "example code." One
> vendor told me that their code should never be used in production; one
> vendor had drivers full of bugs and corner cases where they would fail,
> but insisted their code was production ready. I have seen vendor
> drivers violate the datasheet. So detailed code reviews of *all* code
> are required.
>
> Hence if you use ChibiOs and it has a bug, well, it is now your bug to
> fix. That is, every line of code in ChibiOs is now your problem.
>
> Trampas
>
> On Fri, May 28, 2021 at 4:05 PM Trampas Stern <tram...@gmail.com> wrote:
>
>> So a trick I use in my code and libraries is to use typedefs for
>> variables.
>>
>> typedef uint32_t milliseconds_t;
>> milliseconds_t getMillis();
>>
>> Then I use milliseconds_t to define all variables. This allows me to
>> change it to uint64_t in one location depending on the project.
>>
>> I have started using more typedefs like this as a form of
>> documentation. That is, code is easier to read and follow when
>> variables are defined based on their use/type.
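
A minimal sketch of the typedef idea quoted above, assuming a stubbed millisecond source. The names fake_clock_ms and seconds_to_ms() are hypothetical and only there for illustration; getMillis() would be supplied by the real project.

```c
#include <assert.h>
#include <stdint.h>

/* Single place to change the width: switch to uint64_t for projects
 * where the counter must never wrap in practice. */
typedef uint32_t milliseconds_t;

/* Compile-time check that the time type is unsigned, since the modular
 * arithmetic discussed in this thread relies on it. */
static_assert((milliseconds_t)-1 > 0, "milliseconds_t must be unsigned");

/* Hypothetical stand-in for the real millisecond source. */
static milliseconds_t fake_clock_ms;

milliseconds_t getMillis(void)
{
    return fake_clock_ms;
}

/* The return type documents the unit, so callers do not have to guess. */
milliseconds_t seconds_to_ms(uint32_t seconds)
{
    return (milliseconds_t)seconds * 1000u;
}
```

Because every variable and signature uses milliseconds_t, changing the typedef line is the only edit needed to move the whole project to a wider counter.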
>>
>> A neat fixed-point unsigned math trick applies when doing comparisons:
>>
>> milliseconds_t start = getMillis();
>>
>> // This is bad
>> while( getMillis() < (start + 10) ){ // wait for 10 ms
>>   ....
>> }
>>
>> To understand why, assume milliseconds_t is uint8_t. Now we get start
>> and say it is 255; this means (start + 10) = 9, and getMillis() on the
>> first loop is still 255. So the comparison becomes while (255 < 9), and
>> you exit the while loop early.
>>
>> A better way to do this is:
>>
>> milliseconds_t start = getMillis();
>>
>> // This is good
>> while( (getMillis() - start) < 10 ){ // wait for 10 ms
>>   ....
>> }
>>
>> Here, if start and getMillis() are both 255, the first loop is
>> while (0 < 10). On the next millisecond we have (getMillis() - start) =
>> (0 - 255) = 1. To understand this, look at the math in binary:
>>
>>     0000 0000
>>   - 1111 1111
>>   = 1 0000 0001
>>
>> where the leading 1 is the negative (borrow) bit, but since we are
>> 8-bit unsigned the value is 1. This means that unsigned subtraction
>> gives you the elapsed difference modulo 2^8 (2^N for an N-bit type).
>>
>> Now with that said, the code works, but other developers might not
>> understand it, and you risk them adding or modifying code in a way that
>> breaks things. Therefore I often just use uint64_t to make sure other
>> developers do not break the code. If speed becomes an issue I can
>> optimize the code to use the fixed-point math tricks, but only as a
>> last resort.
>>
>> Note that I know many developers who refuse to use unsigned variables
>> because of math issues like the above, so they try to use signed
>> integers for almost everything. You still have overflow issues but you
>> do not have math issues.
>>
>> Here is a blog article I wrote on embedded systems and time:
>> https://bitvolatile.com/?p=303
>>
>> Trampas
>>
>> On Fri, May 28, 2021 at 3:25 PM Adam Baron <vysoca...@gmail.com> wrote:
>>
>>> Hello Trampas,
>>>
>>> thanks for the hints.
>>> I initialized the sys ticks with 2^32 - 120 seconds, and I got mqtt
>>> pbuf=NULL in around 120 seconds + 120 keep-alive seconds.
>>>
>>> The ChibiOs sys_arch.c port implements sys_now() (current time in
>>> milliseconds) as follows (simplified):
>>>
>>> return ((u32_t)chVTGetSystemTimeX() - 1) / 10 + 1;
>>>
>>> since it ticks every 100 us.
>>>
>>> I guess this might cause the problems, as it overflows back to 0,
>>> leaving the lwip timers waiting for a value higher than (2^32)/10.
>>>
>>> To support my guess, I turned on another debug option, and the last
>>> lwip timer message I see is:
>>>
>>> sys_timeout: 2000C5DC abs_time=429497730 handler=ip_reass_tmr arg=805B28C
>>>
>>> Adam
>>>
>>> On Fri, 28 May 2021 at 13:45, Trampas Stern <tram...@gmail.com> wrote:
>>>
>>>> Increase the counter to a uint64_t.
>>>>
>>>> You can also start the counter at something other than zero to prove
>>>> the root cause faster.
>>>>
>>>> Trampas
>>>>
>>>> On Fri, May 28, 2021 at 7:08 AM Adam Baron <vysoca...@gmail.com> wrote:
>>>>
>>>>> Czesc (Hi) Tomek :),
>>>>>
>>>>> I'll try to add it. Thanks.
>>>>>
>>>>> However, I feel it is rather related to a uint32 counter of some
>>>>> kind overflowing, since the TCP_PCBs are not freed after 2^32 ticks.
>>>>>
>>>>> Adam
>>>>>
>>>>> On Fri, 28 May 2021 at 9:44, Tomasz W <wil...@gmail.com> wrote:
>>>>>
>>>>>> Hi (Cześć)
>>>>>> Look at this:
>>>>>> https://lists.nongnu.org/archive/html/lwip-devel/2020-12/msg00014.html
>>>>>> In my case it solved the problem of the web server dying after a
>>>>>> few days.
>>>>>>
>>>>>> On Fri, 28 May 2021 at 08:58, Adam Baron <vysoca...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hello all,
>>>>>> >
>>>>>> > I have a small STM32F4 application running on the devel branch of
>>>>>> > lwip. It includes httpd, sntp, an smtp client, and an mqtt client.
>>>>>> > All runs well until the fifth day, when the mqtt client starts to
>>>>>> > receive pbuf=NULL and disconnects.
>>>>>> > My reconnect routine reconnects it within a short time, but it
>>>>>> > receives pbuf=NULL again shortly after.
>>>>>> >
>>>>>> > Later on I also noticed this in the log: memp_malloc: out of
>>>>>> > memory in pool TCP_PCB.
>>>>>> > I have MEMP_NUM_TCP_PCB defined as 30, which seems enough for
>>>>>> > normal operation. I also raised it to 50, but ended up with the
>>>>>> > same problem.
>>>>>> > In the statistics, NUM_TCP_PCB increases and decreases as it
>>>>>> > should, but once uptime passes 5 days it stays high with an error
>>>>>> > flag triggered.
>>>>>> >
>>>>>> > Quite interestingly, it happens exactly after 2^32 milliseconds
>>>>>> > uptime. I tried to keep OpenOCD connected so I could peek in, but
>>>>>> > so far I have not managed to keep OpenOCD running that long
>>>>>> > without it dropping the connection.
>>>>>> >
>>>>>> > Does anyone have any ideas, please?
>>>>>> >
>>>>>> > Thanks in advance,
>>>>>> > --
>>>>>> > 731435556
>>>>>> > Adam Baron
>>>>>> > _______________________________________________
>>>>>> > lwip-users mailing list
>>>>>> > lwip-users@nongnu.org
>>>>>> > https://lists.nongnu.org/mailman/listinfo/lwip-users
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Tomek
--
731435556
Adam Baron
_______________________________________________
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users
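
The two wraparound points raised in this thread (the subtraction-based timeout check, and a millisecond value derived from a 100 us tick that wraps at (2^32)/10 instead of 2^32) can be sketched in C as below. This is only an illustration under stated assumptions, not the code of the ChibiOs port or of lwip: fake_ticks, read_ticks(), has_elapsed() and now_ms() are hypothetical names, and read_ticks() merely stands in for a free-running 32-bit counter advancing every 100 us. Note the cast in has_elapsed(): C promotes operands narrower than int, so without it the uint8_t thought experiment from the thread would not wrap as described. For scale, 2^32 ticks of 100 us is about 4.97 days.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t milliseconds_t;

#define TICKS_PER_MS 10u /* assumption: one tick every 100 us */
static_assert(TICKS_PER_MS > 0u, "tick rate must be non-zero");

/* Hypothetical tick source: a free-running 32-bit counter. */
static uint32_t fake_ticks;

static uint32_t read_ticks(void)
{
    return fake_ticks;
}

/* Wrap-safe timeout test. The subtraction is modular, and the cast
 * keeps it modular even if milliseconds_t is narrower than int. */
static bool has_elapsed(milliseconds_t now, milliseconds_t start,
                        milliseconds_t timeout)
{
    return (milliseconds_t)(now - start) >= timeout;
}

/* Millisecond counter that wraps at 2^32 rather than at (2^32)/10.
 * It accumulates the ticks elapsed since the previous call instead of
 * dividing the raw tick counter. Not thread-safe as written. */
static uint32_t now_ms(void)
{
    static uint32_t last_ticks;      /* tick value seen on the last call */
    static uint32_t remainder_ticks; /* ticks not yet worth a full ms */
    static uint32_t ms;              /* running total, wraps at 2^32 */

    uint32_t ticks = read_ticks();
    uint32_t delta = ticks - last_ticks; /* correct across a tick wrap */
    last_ticks = ticks;

    ms += delta / TICKS_PER_MS;
    remainder_ticks += delta % TICKS_PER_MS;
    if (remainder_ticks >= TICKS_PER_MS) {
        remainder_ticks -= TICKS_PER_MS;
        ms++;
    }
    return ms;
}
```

In this sketch now_ms() has to be called at least once per tick-counter period (roughly 119 hours at 100 us per tick), otherwise a full wrap goes unnoticed. Widening the counter to uint64_t, as suggested earlier in the thread, avoids that constraint.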