On Tue, 2010-10-19 at 21:03 +0200, Stephane Eranian wrote: > >> Ok, I missed that. But I don't understand why you need the lock to > >> udpate the time. The lower-level clock is lockless if I recall. Can't you > >> use an atomic ops in update_context_time()? > > > > atomic ops would slow down those code paths, also, I don't think you can > > fully get the ordering between ->tstamp_$foo and ->total_time_$foo just > > right. > > > > I don't get that. Could you give an example?
Take update_context_time(), it has: now = perf_clock(); ctx->time += now - ctx->timestamp; ctx->timestamp = now; If you interleave two of those you get: ctx->timestamp = T0; now = perf_clock(); /* T1 */ ctx->time += now - ctx->timestamp; now = perf_clock(); /* T2 */ ctx->time += now - ctx->timestamp; ctx->timestamp = now; ctx->timestamp = now; So at this point you would expect timestamp = T2 and time += T2-T0. Except that: time += T1 - T0 + T2 - T0 != T2 - T0 and timestamp = T1 You can of course write it as something like x86_perf_event_update(), but then there's trying to keep total_time_running and total_time_enabled in sync. > > Not sure, but barring 64bit atomics for all these, 32bit archs and NMI > > are going to be 'interesting' > > > > Every sample needs to be correct, otherwise you run the risk of introducing > bias. > > I think if the tradeoffs is correctness vs. speed, I'd choose correctness. Well, yes, but it sucks, esp. since its only relevant for PERF_SAMPLE_READ. ------------------------------------------------------------------------------ Download new Adobe(R) Flash(R) Builder(TM) 4 The new Adobe(R) Flex(R) 4 and Flash(R) Builder(TM) 4 (formerly Flex(R) Builder(TM)) enable the development of rich applications that run across multiple browsers and platforms. Download your free trials today! http://p.sf.net/sfu/adobe-dev2dev _______________________________________________ perfmon2-devel mailing list perfmon2-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/perfmon2-devel