> > +static inline odp_time_t time_hw_cur(void)
> >  {
> > -   odp_time_t cur;
> > +   odp_time_t time;
> >
> > -   do {
> > -           cur = time_local();
> > -   } while (time_cmp(time, cur) > 0);
> > +   time.hw.count = cpu_global_time() - global.hw_start;
> 
> Computing the offset is unnecessarily expensive. The simplest and lowest
> overhead solution is to just store the value read from HW and convert
> at a later point in time. But, this no longer represents what odp_time_t
> represents. That is why I introduced odp_tick_t in the timer RFC and
> design doc posted to the list *several* times.

The purpose of this patch set is not to change the API, but to optimize the
implementation.

The optimal solution would be to zero the HW counter in ODP time init and
then simply return the register value here.

If this function is called often, global.hw_start stays in L1 cache (it's a
constant) and the overhead of the subtract is a CPU cycle or two. An API
change to save a CPU cycle or two is not economical. In practice, it does not
matter much whether the subtract is done here or during conversion to nsec.
The other changes in this set matter more: TSC counter vs system call, and
128 bits vs 64 bits of storage (for memory footprint). Further optimizations
are always possible, but this level of change is needed for the current API
to use the HW counter and pack timespec into 64 bits. For example, a next
step could be to inline these functions.
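
To illustrate why the placement of the subtract matters little, here is a
minimal sketch, not the actual ODP code. The names time_global_t,
count_since_start(), count_to_ns() and raw_to_ns() are hypothetical, and the
raw counter value is passed in as a parameter instead of being read from HW.
Both variants perform one 64-bit subtract and give the same nsec result:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global time state (not the ODP struct). */
typedef struct {
	uint64_t hw_start; /* counter value sampled at time init */
	uint64_t freq_hz;  /* counter frequency in Hz */
} time_global_t;

/* Variant A: subtract when the counter is read, as the patch does. */
static inline uint64_t count_since_start(const time_global_t *g, uint64_t raw)
{
	return raw - g->hw_start;
}

/*
 * Convert a count relative to init into nanoseconds. Whole seconds and the
 * remainder are handled separately so the multiply does not overflow
 * 64 bits (assumes freq_hz is below roughly 18 GHz).
 */
static inline uint64_t count_to_ns(const time_global_t *g, uint64_t count)
{
	assert(g->freq_hz != 0);

	uint64_t sec = count / g->freq_hz;
	uint64_t rem = count % g->freq_hz;

	return sec * 1000000000ULL + (rem * 1000000000ULL) / g->freq_hz;
}

/* Variant B: keep the raw value and subtract only during conversion. */
static inline uint64_t raw_to_ns(const time_global_t *g, uint64_t raw)
{
	return count_to_ns(g, raw - g->hw_start);
}
```

Either way the 64-bit count fits the storage discussed above; the only
difference is which call pays for the single subtract.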

-Petri

