+1 for efficiency.

Regards,
Vipul Rahane

> On Jun 23, 2016, at 2:35 PM, chris collins <[email protected]> wrote:
> 
> I would also favor efficiency over genericness in this case since
> cputime is fundamental to time-critical tasks.  It will mean more
> configuration for the application developer, but I don't see a way
> around that.
> 
> On Thu, Jun 23, 2016 at 1:33 PM, will sanfilippo <[email protected]> wrote:
>> Hello:
>> 
>> I wanted to post a question to the dev list to see if folks had opinions 
>> regarding the following topic. As others have stated “this will be a long 
>> and dry email” so be forewarned…
>> 
>> HAL cputime was developed to provide application developers access to a 
>> generic, high resolution timer. The API provided by the hal allows 
>> developers to create “timers” that can be added to a timer queue. The API 
>> also provides a set of routines to convert “normal” time units to hw timer 
>> “ticks”. The timer queue is used to provide applications with a callback 
>> that will occur at a given ‘cputime’. The term ‘cputime’ refers to the 
>> underlying timebase that is kept by the hal. Cputime always counts in tick 
>> increments, with the time per tick dependent on the underlying HW timer 
>> resolution/configuration.
>> 
>> The main impetus behind creating this HAL was for use in networking stacks. 
>> BLE (Bluetooth Low Energy) is a good example of such a stack. The 
>> specification requires actions to occur at particular times and many of 
>> these actions are relative to the transmission or reception time of a 
>> packet. The cputime HAL provides a consistent timebase for the BLE 
>> controller stack to interface to the underlying HW and should provide a 
>> handy abstraction when porting to various BLE transceivers/SoCs.
>> 
>> Using the current nimBLE stack (mynewt’s BLE stack) as an example, the stack 
>> instantiates cputime using a 1 MHz clock. This means that each cputime tick 
>> is 1 usec. This timebase was chosen as it provides enough (more than 
>> enough!) resolution for the BLE stack and is in a time unit that is a common 
>> factor of any time interval used in the specification. For example, 
>> advertising events are in units of 625 usecs and connection intervals are in 
>> units of 1250 usecs.
>> 
>> While using a 1 usec timebase has its advantages, there are disadvantages as 
>> well. The main drawback is that on some HW this timebase would require use 
>> of a higher power timer. For example, the nrf52 has a low power timer (they 
>> call it the RTC) but this timer has a minimum resolution of 30.517 usecs as 
>> it is based on a 32.768kHz crystal. In its current incarnation, hal cputime 
>> cannot support this timer as the minimum clock frequency accepted by this 
>> hal is 1 MHz.
>> 
>> So, this (finally!) leads to the question I want to ask the community: how 
>> does the community feel about sacrificing “genericness” for “efficiency”? If 
>> it were up to me, I would sacrifice genericness for efficiency in a 
>> microsecond (forgive the bad pun!) in this case. Let me go into a bit more 
>> detail here. It should be obvious to the reader that there are neat tricks 
>> you can play when dividing by a power of 2 (it is a simple shift right). In 
>> the case of a 32.768 kHz crystal, each tick is 1/32768 seconds in length 
>> (this is where we get the ~30.517 usec tick interval). What I would like to 
>> do is have a compile time definition specifying use of a 32.768 kHz crystal 
>> for cputime. How this gets defined is outside the scope of this email. It 
>> may be a target variable, something in a pkg.yml file or a newt feature. 
>> With this definition the API that converts ticks to usecs (and vice versa) 
>> does a shift instead of a divide or multiply. On the nrf51 this can lead to 
>> quite a large savings in time. Using the C library 64-bit divide routine 
>> that mynewt uses, it takes about 60 usecs to perform this divide. When we 
>> shift a 64-bit number to perform the divide this time gets down to 4 or 5 
>> usecs (slightly more than an order of magnitude savings!). Of course, on 
>> faster processors or processors that support faster divides this might be a 
>> moot point, but for those using the nrf51 it is not.
>> 
>> Now you may say “you could have done the same thing in your current HAL 
>> cputime with a 1 MHz clock”. In this case, the routine to “convert” ticks to 
>> usecs (and vice versa) would simply return the number passed in. I would 
>> like to make this change as well personally. Seems quite a big win (and 
>> would save some code space too!).
>> 
>> Comments?
>> 
>> Will
>> 
>> 
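For reference, here is a minimal sketch of the conversions Will describes, not the actual HAL code. It assumes a hypothetical compile-time define, CPUTIME_FREQ_HZ, and the function names are made up for illustration. For 32.768 kHz, ticks to usecs reduces to a multiply by 15625 and a shift right by 9, since 1000000/32768 = 15625/512. The reverse direction uses a fixed-point reciprocal (one multiply and one shift, an assumption on my part about how to avoid the divide), so it can come out one tick higher than the exact quotient. For 1 MHz both conversions just return the input.

```c
#include <stdint.h>

/* Hypothetical build-time switch; the thread leaves open how it gets set. */
#ifndef CPUTIME_FREQ_HZ
#define CPUTIME_FREQ_HZ 32768
#endif

/*
 * 32.768 kHz: usecs = ticks * 1000000 / 32768 = (ticks * 15625) >> 9.
 * The result is truncated, and wraps if it does not fit in 32 bits.
 */
static inline uint32_t
cputime32k_ticks_to_usecs(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 15625u) >> 9);
}

/*
 * 32.768 kHz: ticks = usecs * 32768 / 1000000, approximated as
 * (usecs * K) >> 32 with K = ceil(2^47 / 10^6) = 140737489.
 * No divide, but the result may be one tick above the exact quotient.
 */
static inline uint32_t
cputime32k_usecs_to_ticks(uint32_t usecs)
{
    return (uint32_t)(((uint64_t)usecs * 140737489ull) >> 32);
}

/* 1 MHz: one tick is one usec, so both conversions are the identity. */
static inline uint32_t
cputime1m_ticks_to_usecs(uint32_t ticks)
{
    return ticks;
}

static inline uint32_t
cputime1m_usecs_to_ticks(uint32_t usecs)
{
    return usecs;
}

/* Select the implementation at compile time. */
#if CPUTIME_FREQ_HZ == 32768
#define cputime_ticks_to_usecs(t) cputime32k_ticks_to_usecs(t)
#define cputime_usecs_to_ticks(u) cputime32k_usecs_to_ticks(u)
#elif CPUTIME_FREQ_HZ == 1000000
#define cputime_ticks_to_usecs(t) cputime1m_ticks_to_usecs(t)
#define cputime_usecs_to_ticks(u) cputime1m_usecs_to_ticks(u)
#else
#error "This sketch only covers 32768 Hz and 1 MHz"
#endif
```

With the 32.768 kHz variant, a 625 usec advertising unit comes out as 20 ticks (20.48 truncated), which shows the rounding the stack would have to live with at this resolution.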
