On 07.06.2012 10:29, Alex Rousskov wrote:
On 06/05/2012 06:21 PM, Amos Jeffries wrote:
I have an incomplete patch to fix bug 1961 (CPU profiler crashing squid
whenever AIO or other threads are used).

I am not sure whether to treat this as a [PATCH] or [RFC] so I will
treat this as [PREVIEW].


So far I have a timer class whose constructor/destructor perform the
timer setup and stats recording. That is attached.

+void
+XProfilerNode::stop()
+{
+    stopped_ = get_tick();

Please make sure that stop() does nothing when called twice. I think you
can use the "running" member for that. Currently, all stop()s will be
called twice.

IIRC, there were a few exceptions where declaring an XProfilerNode
object using old macros did not work because it added two objects with the same name. Perhaps there are no more such occurrences but just keep in mind that sometimes you might need to declare or stop XProfilerNodes
manually.


Added a PROF_restart(X) macro.


+    if (delta < head->best)
+        head->best = delta;
+    if (delta > head->worst)
+        head->worst = delta;
+    head->summ += delta;
+    head->count++;
+    head->overheads += get_tick() - stopped_;

The above XProfilerNode::stop() code is not thread-safe because multiple
threads may update the same "head" and you are not using atomics for
*head members.

Ok, for now this is the second major TODO.
I'm torn between GCC __sync_* and C++11 atomic_* types.

There is maybe a wrapper compat layer needed temporarily I think.




+        hrtime_t overheads;
         int64_t summ;

If "summ" is int64_t then perhaps overheads should be as well? They both
seem to accumulate time values.

Done.



+ // init stopped to prevent negative times occuring if anything goes wrong.
+        stopped_ = started_ = get_tick();

Please use "running_" to detect still-running nodes instead of assigning
bogus values in hope to avoid other bogus values.

Replaced using started_ as suggested below.



+    hrtime_t stopped_;    ///< when this timer was stopped
+ hrtime_t overheads_; ///< how much time has measurably been spent on profiler internal code.

Do you actually need these two data members?

stopped_ we don't. overheads_ is to account for timer management cycles which need to be reported as they may inflate very-short timer runs. Rather than estimating removals and adjusting the actual details I'm planning to just report the total running time and the overheads wastage.
 If we find it useless or difficult to use it can be dropped as well.



+ bool running_; ///< whether this timer is still active, or been stopped already
+    hrtime_t started_;    ///< when this timer was started

You may want to declare "running_" last. It should not hurt and might
save a few bytes.

Also, please consider removing running_ altogether and use positive
started_ as an "Am I running?" test instead. Inline that test as a
running() method.


Done.


+// explicitly stop the timer. This is now optional.
+#define PROF_stop(probename) XProfTimer_##probename.stop()

Not only it is optional, it is wasteful and leads to
incomplete/incorrect measurements in many cases.


Please consider removing nearly all PROF_stop() calls from Squid sources as a separate followup patch. Please do not do it now though, as it will
screw up the pending ACL and bump-server-first patches.


Will do after these core changes are finalized.


The old profiler was done in a way which allowed live run-time snapshot
cachemgr report to be taken without stopping any timers. I've hooked
these classes into that reporting structure, but at present they will
result in the snapshots only recording *finished* timers, not the
currently active incomplete ones.

Not a big deal in most cases, IMHO.

Ack. The few worrying ones are the "unaccounted" and other long-lived operations which I'm going to swap to being time in main() and time in main loop.



I'm thinking keeping the stack/list of timers from the old design but only using it to dump a list of "currently incomplete" function calls
instead of using it to add incremental bits of timers to the report.

I am not sure I understand: If you are willing to write thread-safe code to safely iterate a stack of running timers, then why not report their time spent so far? It does not add complexity or much overhead after you
are done with the code that safely lists those timers.


I was wondering if its worth the effort to write that thread-safe iteration, or if anyone has a better idea such as ...


Does anyone have any other inspirational thoughts about how to keep that kind of detail peeking using these classes in a multi-threaded soup of
code?

If you want to prevent iteration of current timers, then you can make
all start() methods increment a global (Starts) by their start time and all stop methods subtract their start time from that same global. At any
time, to approximate the time already spent in all currently running
timers, you can then:

    expected_minimum_contribution_of_all_running_timers =
      Starts - number_of_running_timers*get_tick();

I am not sure that anticipated addition from running timers is the
information you are looking for though.

Hmm. Per-type this is exactly what I'm looking for. Thank you.

Amos

Reply via email to