Hello Trampas,
yes, I can only agree with you. Still, I consider ChibiOS to be well
designed and supported, so I do put a bit of respect and trust in the
code and libraries I start to use. But of course I try to understand
them first.

And a nice, well-thought-out article, thank you.

Adam

On Fri, May 28, 2021 at 22:24, Trampas Stern <tram...@gmail.com> wrote:

> As far as the ChibiOS time issues go, I have a simple rule:
>
> *On my embedded systems, every line of code I put into the project becomes
> my problem!*
>
> That is, if I use lwIP and it has a bug, customers do not care whether the
> bug is in lwIP or not; it is my problem to fix. Hence every line of code
> becomes my problem. As such, I try not to use code I do not understand.
> Often (lwIP is an example) you have to use libraries, but do so knowing
> that their problems become yours. Yes, lwIP has bitten me more than once
> where it did not work the way I thought it should or would. That was my
> fault and my problem to fix.
>
> I often go to extremes: I will not use processor vendors' drivers until I
> have done a code review and understand them. I have been bitten more than
> once where a vendor's drivers were just "example code." One vendor told me
> that their code should never be used in production; another had drivers
> full of bugs and corner cases where they would fail, but insisted their
> code was production ready. I have seen vendor drivers violate the
> datasheet. So a detailed code review of *all* code is required.
>
> Hence, if you use ChibiOS and it has a bug, it is now your bug to fix.
> That is, every line of code in ChibiOS is now your problem.
>
> Trampas
>
> On Fri, May 28, 2021 at 4:05 PM Trampas Stern <tram...@gmail.com> wrote:
>
>> So a trick I use in my code and libraries is to use typedefs for
>> variable types.
>>
>> typedef uint32_t milliseconds_t;
>> milliseconds_t getMillis(void);
>>
>> Then I use milliseconds_t to define all time variables. This allows me
>> to change it to uint64_t in one location, depending on the project.
>>
>> I have started using more typedefs like this as a form of
>> documentation. That is, code is easier to read and follow when variables
>> are declared with a type that reflects their use.
>>
>> A neat fixed point unsigned math trick applies when doing comparisons...
>>
>> milliseconds_t start=   getMillis();
>>
>> // This is bad
>> while( getMillis()<(start +10) ){  //wait for 10ms
>> ....
>> }
>>
>> To understand why, assume milliseconds_t is uint8_t and start is 255.
>> Then (start+10), truncated to 8 bits, is 9, while getMillis() on the
>> first pass is still 255. So the comparison becomes while (255<9), and
>> you exit the while loop early.
>>
>> A better way to do this is
>> milliseconds_t start=   getMillis();
>>
>> // This is good
>> while( (getMillis()-start)<10 ){  //wait for 10ms
>> ....
>> }
>>
>> Here, if start and getMillis() are both 255, the first loop test is
>> while(0<10). On the next millisecond we have (getMillis()-start) =
>> (0-255) = 1. To understand this, look at the math in binary:
>>  0000 0000
>> -1111 1111
>> = 1 0000 0001 where the leading 1 is the borrow (sign) bit; since we are
>> 8 bit unsigned it is discarded and the value is 1. This means unsigned
>> subtraction gives you the elapsed time modulo 2^8, which stays correct
>> across a counter wrap.
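The two loop conditions above can be compared side by side in a small sketch. It assumes the 8-bit milliseconds_t used in the explanation; the function names (waiting_bad, waiting_good, wrap_self_check) are made up for illustration and are not from any library. The explicit casts matter for the 8-bit case: for types narrower than int, C promotes the operands to int, so without the cast the arithmetic would not wrap at 8 bits. With a uint32_t type on a 32-bit target the wrap happens without them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 8 bits only so the wrap is easy to follow; a real project would use uint32_t. */
typedef uint8_t milliseconds_t;

/* Bad: the deadline start + timeout wraps, so the test can fail immediately. */
bool waiting_bad(milliseconds_t now, milliseconds_t start, milliseconds_t timeout)
{
    milliseconds_t deadline = (milliseconds_t)(start + timeout);
    return now < deadline;
}

/* Good: elapsed time is computed modulo 2^8, so a wrapped counter still works. */
bool waiting_good(milliseconds_t now, milliseconds_t start, milliseconds_t timeout)
{
    milliseconds_t elapsed = (milliseconds_t)(now - start);
    return elapsed < timeout;
}

/* Walks through the cases from the explanation above. */
void wrap_self_check(void)
{
    assert(!waiting_bad(255, 255, 10));  /* 255 < 9 is false: loop exits early */
    assert(waiting_good(255, 255, 10));  /* elapsed 0: keep waiting            */
    assert(waiting_good(0, 255, 10));    /* counter wrapped, elapsed 1         */
    assert(waiting_good(8, 255, 10));    /* elapsed 9: still waiting           */
    assert(!waiting_good(9, 255, 10));   /* elapsed 10: timeout reached        */
}
```

The subtraction form only stays correct while the real elapsed time is shorter than one full counter period, which is one reason a wider type is the simpler choice when speed allows.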
>>
>> Now with that said, the code works, but other developers might not
>> understand it, and you risk them adding or modifying code in a way that
>> breaks things. Therefore I often just use uint64_t to make sure other
>> developers do not break the code. If speed becomes an issue I can optimize
>> the code to use the fixed point math tricks, but only as a last resort.
>>
>> Note that I know many developers who refuse to use unsigned variables
>> due to math issues like the above, so they try to use signed integers for
>> almost everything. You still have overflow issues, but you do not have
>> these math issues.
>>
>> Here is a blog article I wrote on embedded systems and time:
>> https://bitvolatile.com/?p=303
>>
>> Trampas
>>
>>
>>
>>
>>
>> On Fri, May 28, 2021 at 3:25 PM Adam Baron <vysoca...@gmail.com> wrote:
>>
>>> Hello Trampas,
>>>
>>> thanks for the hints. I initialized the system tick counter to 120
>>> seconds before the 2^32 wrap, and I got mqtt pbuf=NULL after around 120
>>> seconds plus the 120-second keep-alive.
>>>
>>> The ChibiOS sys_arch.c port implements sys_now() (current time in
>>> milliseconds) as follows, simplified:
>>>   return ((u32_t)chVTGetSystemTimeX() - 1) / 10 + 1;
>>> since the system ticks every 100 µs.
>>>
>>> I guess this might cause the problems: it overflows back to 0, leaving
>>> the lwIP timers waiting for a value higher than (2^32)/10.
>>>
>>> To support my guess, I turned on another debug option, and the last lwIP
>>> timer message I see is:
>>> sys_timeout: 2000C5DC abs_time=429497730 handler=ip_reass_tmr arg=805B28C
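The largest value the quoted formula can return is (2^32 - 2) / 10 + 1 = 429496730, and the abs_time in the log line is 1000 ms above that, so a timer waiting for it would never fire. One possible workaround, in line with the uint64_t suggestion elsewhere in this thread, is to extend the 32-bit tick count to 64 bits before dividing, so the millisecond value runs through the full 32-bit range. The sketch below is only an illustration under assumptions: it assumes the lwIP timers expect sys_now() to wrap at 2^32, it uses a plain divide by 10 instead of the exact formula from the port, and tick_extender_t and both function names are hypothetical. It also needs to be called at least once per tick-counter period and is not protected against concurrent callers.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_MS 10u  /* 100 us system tick, as stated in the message above */

/* Scaling the raw 32-bit tick count: the result wraps at (2^32)/10, not at 2^32. */
uint32_t ms_from_ticks_naive(uint32_t ticks)
{
    return ticks / TICKS_PER_MS;
}

/* Hypothetical helper state: accumulates 32-bit tick deltas into 64 bits. */
typedef struct {
    uint32_t last_ticks;   /* raw tick value seen on the previous call */
    uint64_t total_ticks;  /* ticks accumulated since start            */
} tick_extender_t;

/* Returns milliseconds that wrap naturally at 2^32 instead of at (2^32)/10. */
uint32_t ms_from_ticks_extended(tick_extender_t *ext, uint32_t ticks)
{
    /* The modulo-2^32 difference is correct even if the raw counter wrapped. */
    ext->total_ticks += (uint32_t)(ticks - ext->last_ticks);
    ext->last_ticks = ticks;
    return (uint32_t)(ext->total_ticks / TICKS_PER_MS);
}

/* Compares both conversions around the wrap of the raw tick counter. */
void sys_now_self_check(void)
{
    tick_extender_t ext = { 0u, 0u };
    /* Just before the 32-bit tick counter wraps. */
    assert(ms_from_ticks_naive(4294967290u) == 429496729u);
    assert(ms_from_ticks_extended(&ext, 4294967290u) == 429496729u);
    /* Ten ticks (1 ms) later the raw counter has wrapped around to 4. */
    assert(ms_from_ticks_naive(4u) == 0u);                   /* jumps back to 0 */
    assert(ms_from_ticks_extended(&ext, 4u) == 429496730u);  /* keeps counting  */
}
```

In a port, the raw tick value would come from chVTGetSystemTimeX() as in the quoted line; how to guard the shared state depends on where sys_now() is called from.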
>>>
>>>
>>> Adam
>>>
>>> On Fri, May 28, 2021 at 13:45, Trampas Stern <tram...@gmail.com> wrote:
>>>
>>>> Increase the counter to a uint64_t.
>>>>
>>>> You can also start the counter at something other than zero to prove
>>>> root cause faster.
>>>>
>>>> Trampas
>>>>
>>>> On Fri, May 28, 2021 at 7:08 AM Adam Baron <vysoca...@gmail.com> wrote:
>>>>
>>>>> Czesc Tomek :),
>>>>>
>>>>> I'll try to add it. Thanks.
>>>>>
>>>>> However, I feel it is more likely related to a uint32 counter of some
>>>>> kind overflowing, since the TCP_PCBs are no longer freed after 2^32
>>>>> ticks.
>>>>>
>>>>> Adam
>>>>>
>>>>> On Fri, May 28, 2021 at 9:44, Tomasz W <wil...@gmail.com> wrote:
>>>>>
>>>>>> Hi (Cześć)
>>>>>> Look at this:
>>>>>> https://lists.nongnu.org/archive/html/lwip-devel/2020-12/msg00014.html
>>>>>> In my case it solved the problem of the web server dying after a few
>>>>>> days
>>>>>>
>>>>>>
>>>>>> On Fri, May 28, 2021 at 08:58, Adam Baron <vysoca...@gmail.com> wrote:
>>>>>> >
>>>>>> > Hello all,
>>>>>> >
>>>>>> > I have a small STM32F4 application running on the devel branch of
>>>>>> > lwIP. It includes httpd, sntp, an smtp client, and an mqtt client.
>>>>>> > Everything runs well until the fifth day, when the mqtt client starts
>>>>>> > to receive pbuf=NULL and disconnects. My reconnect routine reconnects
>>>>>> > it shortly afterwards, but it receives pbuf=NULL again soon after.
>>>>>> >
>>>>>> > Later on I also noticed in the log: memp_malloc: out of memory in pool
>>>>>> > TCP_PCB.
>>>>>> > I have MEMP_NUM_TCP_PCB defined as 30, which seems enough for normal
>>>>>> > operation. I also raised it to 50, but ended up with the same problem.
>>>>>> > In the statistics, NUM_TCP_PCB increases and decreases as it should,
>>>>>> > but once uptime passes 5 days it stays high with an error flag set.
>>>>>> >
>>>>>> > Interestingly, it happens exactly after 2^32 milliseconds of uptime.
>>>>>> > I tried to keep OpenOCD connected so I could peek in, but so far I
>>>>>> > have not managed to keep OpenOCD running that long without the
>>>>>> > connection dropping.
>>>>>> >
>>>>>> > Does anyone have any ideas please?
>>>>>> >
>>>>>> > Thanks in advance,
>>>>>> > --
>>>>>> > 731435556
>>>>>> > Adam Baron
>>>>>> > _______________________________________________
>>>>>> > lwip-users mailing list
>>>>>> > lwip-users@nongnu.org
>>>>>> > https://lists.nongnu.org/mailman/listinfo/lwip-users
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards
>>>>>> Tomek
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> 731435556
>>>>> Adam Baron
>>>>
>>>
>>>
>>>
>>> --
>>> 731435556
>>> Adam Baron
>>



-- 
731435556
Adam Baron