Re: [PATCH] [RFC] profiler updates

Amos Jeffries Wed, 06 Jun 2012 18:25:47 -0700

On 07.06.2012 10:29, Alex Rousskov wrote:

On 06/05/2012 06:21 PM, Amos Jeffries wrote:
I have an incomplete patch to fix bug 1961 (CPU profiler crashingsquid
whenever AIO or other threads are used).
I am not sure whether to treat this as a [PATCH] or [RFC] so I will
treat this as [PREVIEW].
So far I have a timer class whose constructor/destructor perform the
timer setup and stats recording. That is attached.
+void
+XProfilerNode::stop()
+{
+    stopped_ = get_tick();
Please make sure that stop() does nothing when called twice. I thinkyou
can use the "running" member for that. Currently, all stop()s will be
called twice.

IIRC, there were a few exceptions where declaring an XProfilerNode
object using old macros did not work because it added two objectswiththe same name. Perhaps there are no more such occurrences but justkeepin mind that sometimes you might need to declare or stopXProfilerNodes
manually.


Added a PROF_restart(X) macro.

+    if (delta < head->best)
+        head->best = delta;
+    if (delta > head->worst)
+        head->worst = delta;
+    head->summ += delta;
+    head->count++;
+    head->overheads += get_tick() - stopped_;

The above XProfilerNode::stop() code is not thread-safe becausemultiple

threads may update the same "head" and you are not using atomics for
*head members.


Ok, for now this is the second major TODO.
I'm torn between GCC __sync_* and C++11 atomic_* types.

There is maybe a wrapper compat layer needed temporarily I think.

+        hrtime_t overheads;
         int64_t summ;
If "summ" is int64_t then perhaps overheads should be as well? Theyboth
seem to accumulate time values.


Done.

+ // init stopped to prevent negative times occuring ifanything goes wrong.
+        stopped_ = started_ = get_tick();
Please use "running_" to detect still-running nodes instead ofassigning
bogus values in hope to avoid other bogus values.


Replaced using started_ as suggested below.

+    hrtime_t stopped_;    ///< when this timer was stopped
+ hrtime_t overheads_; ///< how much time has measurably beenspent on profiler internal code.
Do you actually need these two data members?

stopped_ we don't. overheads_ is to account for timer management cycleswhich need to be reported as they may inflate very-short timer runs.Rather than estimating removals and adjusting the actual details I'mplanning to just report the total running time and the overheadswastage.

 If we find it useless or difficult to use it can be dropped as well.

+ bool running_; ///< whether this timer is still active,or been stopped already
+    hrtime_t started_;    ///< when this timer was started


You may want to declare "running_" last. It should not hurt and might
save a few bytes.

Also, please consider removing running_ altogether and use positive
started_ as an "Am I running?" test instead. Inline that test as a
running() method.


Done.

+// explicitly stop the timer. This is now optional.
+#define PROF_stop(probename) XProfTimer_##probename.stop()
Not only it is optional, it is wasteful and leads to
incomplete/incorrect measurements in many cases.
Please consider removing nearly all PROF_stop() calls from Squidsourcesas a separate followup patch. Please do not do it now though, as itwill
screw up the pending ACL and bump-server-first patches.


Will do after these core changes are finalized.

The old profiler was done in a way which allowed live run-timesnapshot
cachemgr report to be taken without stopping any timers. I've hooked
these classes into that reporting structure, but at present theywill
result in the snapshots only recording *finished* timers, not the
currently active incomplete ones.
Not a big deal in most cases, IMHO.

Ack. The few worrying ones are the "unaccounted" and other long-livedoperations which I'm going to swap to being time in main() and time inmain loop.

I'm thinking keeping the stack/list of timers from the old designbutonly using it to dump a list of "currently incomplete" functioncalls
instead of using it to add incremental bits of timers to the report.
I am not sure I understand: If you are willing to write thread-safecodeto safely iterate a stack of running timers, then why not reporttheirtime spent so far? It does not add complexity or much overhead afteryou
are done with the code that safely lists those timers.

I was wondering if its worth the effort to write that thread-safeiteration, or if anyone has a better idea such as ...

Does anyone have any other inspirational thoughts about how to keepthatkind of detail peeking using these classes in a multi-threaded soupof
code?
If you want to prevent iteration of current timers, then you can make
all start() methods increment a global (Starts) by their start timeandall stop methods subtract their start time from that same global. Atany
time, to approximate the time already spent in all currently running
timers, you can then:

    expected_minimum_contribution_of_all_running_timers =
      Starts - number_of_running_timers*get_tick();

I am not sure that anticipated addition from running timers is the
information you are looking for though.


Hmm. Per-type this is exactly what I'm looking for. Thank you.

Amos

Re: [PATCH] [RFC] profiler updates

Reply via email to