+1 for efficiency over genericness for me, every time

Lower power and a shorter delay to sleep mode will always win for me with anything wireless as long as the timing constraints can be respected.

Kevin


On 23/06/16 22:33, will sanfilippo wrote:
Hello:

I wanted to post a question to the dev list to see if folks had opinions 
regarding the following topic. As others have stated “this will be a long and 
dry email” so be forewarned…

HAL cputime was developed to provide application developers access to a 
generic, high resolution timer. The API provided by the hal allows developers 
to create “timers” that can be added to a timer queue. The API also provides a 
set of routines to convert “normal” time units to hw timer “ticks”. The timer 
queue is used to provide applications with a callback that will occur at a 
given ‘cputime’. The term ‘cputime’ refers to the underlying timebase that is 
kept by the hal. Cputime always counts in tick increments, with the time per 
tick dependent on the underlying HW timer resolution/configuration.

The main impetus behind creating this HAL was for use in networking stacks. BLE 
(Bluetooth Low Energy) is a good example of such a stack. The specification 
requires actions to occur at particular times and many of these actions are 
relative to the transmission or reception time of a packet. The cputime HAL 
provides a consistent timebase for the BLE controller stack to interface to the 
underlying HW and should provide a handy abstraction when porting to various 
BLE transceivers/socs.

Using the current nimBLE stack (mynewt’s BLE stack) as an example, the stack 
instantiates cputime using a 1 MHz clock. This means that each cputime tick is 
1 usec. This timebase was chosen as it provides enough (more than enough!) 
resolution for the BLE stack and is in a time unit that is a common factor of 
any time interval used in the specification. For example, advertising events 
are in units of 625 usecs and connection intervals are in units of 1250 usecs.
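
To make the arithmetic concrete, here is a minimal sketch of how those 
specification units map onto a 1 MHz cputime. The function and macro names are 
made up for illustration and are not taken from the nimBLE source; the only 
facts used are the 625 usec and 1250 usec units mentioned above.

```c
#include <stdint.h>

/* Unit sizes quoted above; the names are hypothetical. */
#define ADV_ITVL_UNIT_USECS   625u
#define CONN_ITVL_UNIT_USECS  1250u

/*
 * With a 1 MHz timebase one tick is one usec, so an interval given in
 * specification units becomes a whole number of ticks with one multiply.
 */
static uint32_t
adv_itvl_to_ticks_1mhz(uint16_t adv_itvl)
{
    return (uint32_t)adv_itvl * ADV_ITVL_UNIT_USECS;
}

static uint32_t
conn_itvl_to_ticks_1mhz(uint16_t conn_itvl)
{
    return (uint32_t)conn_itvl * CONN_ITVL_UNIT_USECS;
}
```

For instance, an advertising interval of 160 units comes out as 100000 ticks 
(100 msecs) and a connection interval of 24 units as 30000 ticks (30 msecs).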

While using a 1 usec timebase has its advantages, there are disadvantages as 
well. The main drawback is that on some HW this timebase would require use of a 
higher power timer. For example, the nrf52 has a low power timer (they call it 
the RTC) but this timer has a minimum resolution of 30.517 usecs as it is based 
on a 32.768kHz crystal. In its current incarnation, hal cputime cannot support 
this timer as the minimum clock frequency accepted by this hal is 1 MHz.

So, this (finally!) leads to the question I want to ask the community: how does 
the community feel about sacrificing “genericness” for “efficiency”? If it were 
up to me, I would sacrifice genericness for efficiency in a microsecond 
(forgive the bad pun!) in this case. Let me go into a bit more detail here. It 
should be obvious to the reader that there are neat tricks you can play when 
dividing by a power of 2 (it is a simple shift right). In the case of a 32.768 
kHz crystal, each tick is 1/32768 seconds in length (this is where we get the 
~30.517 usec tick interval). What I would like to do is have a compile time 
definition specifying use of a 32.768 kHz crystal for cputime. How this gets 
defined is outside the scope of this email. It may be a target variable, 
something in a pkg.yml file or a newt feature. With this definition the API 
that converts ticks to usecs (and vice versa) does a shift instead of a divide 
or multiply. On the nrf51 this can lead to quite a large savings in time. Using 
the C library 64-bit divide routine that mynewt uses, it takes about 60 usecs 
to perform this divide. When we shift a 64-bit number to perform the divide 
this time gets down to 4 or 5 usecs (slightly more than an order of magnitude 
savings!). Of course, on faster processors or processors that support faster 
divides this might be a moot point, but for those using the nrf51 it is not.
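
As a rough sketch of the idea (not the actual HAL code; the function names 
below are hypothetical), here is the ticks-to-usecs conversion written both 
ways. It assumes 32-bit tick counts and a 64-bit intermediate value. Note that 
in this sketch only the ticks-to-usecs direction reduces to a shift; going from 
usecs to ticks still needs a divide by a constant that is not a power of 2.

```c
#include <stdint.h>

/* Generic version: works for any frequency but needs a 64-bit divide. */
static uint32_t
ticks_to_usecs_generic(uint32_t ticks, uint32_t freq_hz)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) / freq_hz);
}

/*
 * 32.768 kHz version: 32768 is 2^15, so the divide becomes a 64-bit
 * shift right by 15. The result is truncated, same as the divide.
 */
static uint32_t
ticks_to_usecs_32768(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) >> 15);
}
```

Both versions give the same answer for a 32.768 kHz clock; one tick converts 
to 30 usecs (30.517 truncated) and 32768 ticks convert to 1000000 usecs.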

Now you may say “you could have done the same thing in your current HAL cputime 
with a 1 MHz clock”. In this case, the routine to “convert” ticks to usecs (and 
vice versa) would simply return the number passed in. I would like to make this 
change as well personally. It seems like quite a big win (and would save some 
code space too!).
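
Something along these lines is what I have in mind. This is only a sketch: 
CPUTIME_FREQ_HZ is a placeholder for whatever compile time definition ends up 
being used, and the function names are illustrative rather than the existing 
HAL API.

```c
#include <stdint.h>

/* Placeholder setting; how it really gets defined is an open question. */
#ifndef CPUTIME_FREQ_HZ
#define CPUTIME_FREQ_HZ 1000000
#endif

#if CPUTIME_FREQ_HZ == 1000000
/* 1 MHz: one tick is one usec, so the conversions are the identity. */
static inline uint32_t
cputime_usecs_to_ticks(uint32_t usecs)
{
    return usecs;
}

static inline uint32_t
cputime_ticks_to_usecs(uint32_t ticks)
{
    return ticks;
}
#else
/* Any other frequency: fall back to the generic 64-bit multiply/divide. */
static inline uint32_t
cputime_usecs_to_ticks(uint32_t usecs)
{
    return (uint32_t)(((uint64_t)usecs * CPUTIME_FREQ_HZ) / 1000000u);
}

static inline uint32_t
cputime_ticks_to_usecs(uint32_t ticks)
{
    return (uint32_t)(((uint64_t)ticks * 1000000u) / CPUTIME_FREQ_HZ);
}
#endif
```

With the 1 MHz setting the compiler can drop the conversion entirely, which is 
where the code space savings would come from.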

Comments?

Will


