Re: [bug #40639] GNU Make with profiling information

Eddy Petrișor Tue, 14 Jan 2014 05:49:17 -0800

2014/1/12 Paul Smith <psm...@gnu.org>:
> On Wed, 2013-12-18 at 13:28 +0200, Eddy Petrișor wrote:
>> Could you please confirm if the general direction of the the is OK in
>> the latest patch I sent?
>
> Conceptually it seems OK.  I'm still not jazzed about having any more
> than one output format, and I'd prefer that format to be in a
> more-or-less readable form, more like the "long" form than the others.


If that will be the only format, then it would mean always imposing a lot
more more work on the information processing stage due to necessary
filtering and transformations to fit a format accepted or easily parsed in
a tool such as Oocalc, Gnumeric or Excel.

Although human readable seems nice and understandable, it is the least
machine parseable, hence my choice for the 'simple' format to be default.

>
> I think the output should go in the standard make output format, so
> something like:
>
>   make[<LVL>]: <target>: <details...>
>
> Or, alternatively:
>
>   <target>[<LVL>]: <details>
>
> Also I think it's enough to show the start offset and the elapsed time.
> End offset is not necessary IMO.

Unfortunately, depending on the used tools, when processing the information
the end time is necessary. For instance, in Microsoft Excel the only way to
display graphs for intervals is via a graph designed for visualising
stocks' variation (and it even forces the insertion of an extra field).

Oocalc is fine with start and stop, but for the graphs look awful and are
unusable with absolute values (it scales so the entire 0-timestamp interval
is visible, so a difference of a few seconds is invisible on the graph), so
relative values are better here.

OTOH, relative time stamps or just durations are useful for a human eye
examination, since is easy to spot offenders that way.


These are three different scenarios which I myself encountered and had in
mind when designing the code the way I did.

>
> I'm unsure about the PID.  This is the pid of the make process so I'm
> not sure what the goal is.  Is it to be able to collect all the times
> together maybe?

The goal is to be able to:
 - spot targets evaluated redundantly in a recursive makefile in different
processes but on the same recursion level (these targets could be
candidates to be moved in a parent make invocation)
 - spot wasteful/needless recursive calls (e.g. several targets called in
different make processes when they could be grouped in a single call)
 - be able to analyse a single make invocation or a call sub-tree

>
> Is it necessary to dump all the output times at the end?  Doing so
> requires that we increase the size of the file structure to hold the
> information, and this is already large AND the most common structure in
> memory; there's one for every single target which for non-recursive
> builds can get really big.  I'm trying to keep memory usage under
> control.
>
> If instead of that we print the information after each target is
> complete we can shift the storage of this information out of the file
> structure and into the commands structure or similar.  To me it seems
> more useful to keep the elapsed time info right next to the command
> output rather than dumping it all at the end.

I'm afraid I am missing some details of the implementation so I can't
answer that question in any meaningful way.
I will have to look into the code, but if a single target does NOT have
multiple commands structures, it should work.

Any pointers to the appropriate code area or  suggestions would be welcome.

> Some other comments:
>      1. In general remember that GNU make code must conform to ANSI
>         C89 / ISO C90.  We shouldn't be using newer features of the
>         language or runtime library unless we need to, and most of those
>         require some kind of autoconf test.

I'm sure you had some specific code in mind when you wrote this. I assumed
the build system would have the appropriate compiler options for the
desired compliance level. Should I compile with '-std=c99 -pedantic-errors'
to check, or do you have other options in mind?

>      2. Let's avoid float and double (and struct timeval).  There's no
>         reason why we can't fit enough precision in a uint32 to count
>         elapsed time in milliseconds for a build: that gives 50 days or
>         so.  GNU make still supports running on systems where there is
>         no floating point support (see the NO_FLOAT #define).  Although
>         I haven't tested it in a while.

I was aware of this since your first email. I wanted to know of the general
idea is OK.
I will change this, too.

>      3. The use of "$" tokens in printf() statements is likely
>         problematic from a portability standpoint.  It seems like this
>         should be relatively easy to avoid.

I'll change them into simple format specifiers, in spite of the repetitions.

>      4. If the printed string contains text then it needs to be marked
>         for translation (with the "_(...)" macro).

Since the profiling info should be machine parsable, I think the only
translatable string would be the long format. I will change this, although
the inconsistency rubs me the wrong way.

>      5. We don't want to be using fprintf() here.  All output needs to
>         go through the output.c module, so that it's properly managed
>         via output sync.

I was aware of the output sync issue, I will look into it.

>      6. gettimeofday is not portable.  Also, it's not really the best
>         option for timing things because (due to NTP etc.) it can change
>         (by that I mean that if it reports 10s elapsed it might not be

Hmm, didn't thought of that.

>         that 10s really elapsed).  Using a monotonic clock is better,
>         although that's also not portable.  But if we have to be
>         non-portable maybe we should try to get an accurate accounting.
>         On the gripping hand maybe it's not that important to be
>         absolutely accurate.

Is there a portable timing API that has us resolution? Not sure if ms
resolution is enough given the speed of current system and Moore's Law
predicting even faster systems.

>
> You mentioned something about trying to send the start time through the
> environment but I don't see any code to that effect here; how were you
> doing that?
>

As said in my previous mail, I wanted to avoid confusion regarding what
code to review.

====

In reply to your newest mail:

> Sorry, I was not clear: I wasn't suggesting that it would be better to
> display the absolute time for the start time.  I was wondering why we
> display the start time at all.  Why not just show the elapsed time, and
> nothing else?  That would avoid all of these issues.
>
> However Tim makes a reasonable point in his response so if it can be
> done without too much difficulty it would be good to show a relative
> start time.

Also, the start time is necessary to be able to see which target was
waiting for which in a graph. This way one can see a target starting later
than it should logically start, making it a good avenue for performance
improvement investigations.

-- 
Eddy Petrișor

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Re: [bug #40639] GNU Make with profiling information

Reply via email to