 >>>    ld4.acq r28 = [r29]     // xtime_lock.sequence. Must come first for locking purposes
 >>> +  ;;
 >>>  (p8)      mov r2 = ar.itc         // CPU_TIMER. 36 clocks latency!!!

The .acq only orders the load with respect to other data accesses.  The
read from ar.itc isn't a data access, so it could potentially still float
ahead of the ld4.acq.  Consuming the value loaded into r28 presumably
forces the load to complete first, though.

I'm guessing here ... I haven't cross-checked with the architects.

Does moving the "and r28 = ~1,r28" up into this slot hurt latency
for a single call to gettimeofday()?  Presumably it will if
xtime_lock.sequence is not in the cache.
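
For concreteness, the ordering I have in mind would be something like the
following.  This is only a sketch, untested, using the registers from the
quoted patch; I'm assuming nothing else needs the unmasked r28 in between:

	ld4.acq r28 = [r29]	// xtime_lock.sequence
	;;
	and r28 = ~1,r28	// consumes r28, so has to wait for the load
	;;
 (p8)	mov r2 = ar.itc		// sits in a later group than the "and"

Whether that stall is enough to pin down the ar.itc read is the part I
haven't confirmed.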

-Tony