> > +static inline odp_time_t time_hw_cur(void)
> > {
> > - odp_time_t cur;
> > + odp_time_t time;
> >
> > - do {
> > - cur = time_local();
> > - } while (time_cmp(time, cur) > 0);
> > + time.hw.count = cpu_global_time() - global.hw_start;
>
> Computing the offset is unnecessarily expensive. The simplest and lowest
> overhead solution is to just store the value read from HW and convert
> at a later point in time. But, this no longer represents what odp_time_t
> represents. That is why I introduced odp_tick_t in the timer RFC and
> design doc posted to the list *several* times.
The purpose of this patch set is not to change the API, but to optimize the
implementation. The optimal solution would be to zero the HW counter in ODP
time init and then simply return the register value here.
If this function is called often, global.hw_start stays in L1 cache (it's a
constant) and the overhead of the subtraction is a CPU cycle or two. An API
change to save a CPU cycle or two is not worth it. In practice it does not
matter much whether the subtraction is done here or during conversion to nsec.
The other changes in this set matter more: TSC counter vs. system call, and
64-bit vs. 128-bit storage (for memory footprint). Further optimizations are
always possible, but this level of change is needed for the current API to use
the HW counter and pack timespec into 64 bits. For example, a next step could
be to inline these functions.
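
To make the comparison concrete, here is a minimal sketch of the two
placements of the subtraction. It is not the code from the patch: hw_read(),
time_sketch_t, count_to_ns() and the other names are made up for illustration,
the counter is simulated with a plain variable instead of a real TSC read, and
a constant counter frequency is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the HW counter read (e.g. a TSC read). */
static uint64_t fake_counter = 1000;
static uint64_t hw_read(void) { return fake_counter; }

static uint64_t hw_start;   /* counter value captured at time init */
static uint64_t hw_freq_hz; /* counter frequency, assumed constant */

/* 64-bit time storage, as opposed to a 128-bit struct timespec. */
typedef struct { uint64_t count; } time_sketch_t;

static void time_init(uint64_t freq_hz)
{
	assert(freq_hz != 0);
	hw_freq_hz = freq_hz;
	hw_start = hw_read();
}

/* Convert counter ticks to nanoseconds. Splitting into whole seconds and
 * remainder avoids overflow of count * 1e9 for large counts (assumes the
 * frequency is low enough that rem * 1e9 fits in 64 bits). */
static uint64_t count_to_ns(uint64_t count)
{
	uint64_t sec = count / hw_freq_hz;
	uint64_t rem = count % hw_freq_hz;

	return sec * 1000000000ULL + (rem * 1000000000ULL) / hw_freq_hz;
}

/* Option A: subtract the start value when reading the time. */
static inline time_sketch_t time_cur_a(void)
{
	time_sketch_t t;

	t.count = hw_read() - hw_start;
	return t;
}

static inline uint64_t to_ns_a(time_sketch_t t)
{
	return count_to_ns(t.count);
}

/* Option B: store the raw counter value, subtract during conversion. */
static inline time_sketch_t time_cur_b(void)
{
	time_sketch_t t;

	t.count = hw_read();
	return t;
}

static inline uint64_t to_ns_b(time_sketch_t t)
{
	return count_to_ns(t.count - hw_start);
}
```

Both options give the same nanosecond value and both keep the stored time in
64 bits. The only difference is where the single subtraction happens. With
option B the stored value is a raw tick count rather than time since init.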
-Petri