On Thu, Oct 20, 2011 at 7:11 PM, Xinliang David Li <davi...@google.com> wrote:
> On Thu, Oct 20, 2011 at 1:21 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Thu, Oct 20, 2011 at 1:33 AM, Andi Kleen <a...@firstfloor.org> wrote:
>>> x...@google.com (Rong Xu) writes:
>>>
>>>> After some off-line discussion, we decided to use a more general approach
>>>> to control the printing of optimization messages/warnings. We will
>>>> introduce a new option -fopt-info:
>>>>  * -fopt-info=0 or -fno-opt-info: no message will be emitted.
>>>>  * -fopt-info or -fopt-info=1: emit important warnings and optimization
>>>>    messages with large performance impact.
>>>>  * -fopt-info=2: warnings and optimization messages targeting power users.
>>>>  * -fopt-info=3: informational messages for compiler developers.
>>
>> This doesn't look scalable if you consider that each pass would print
>> as much of a mess as -fvectorizer-verbose=5.
>
> What is not scalable? For level 1 dump, only the summary of
> vectorization will be printed just like other loop transformations.
>
>>
>> I think =2 and =3 should be omitted - we do have dump-files for a reason.
>
> Dump files are not easy to use -- they are big and slow, especially for
> people with large distributed build systems.  Having both level 2 and
> 3 is debatable, but it will be useful to have at least one level above
> level 1. Dump files are mainly for compiler developers, while
> -fopt-info is for compiler developers *and* power users who know
> performance tuning.
>>
>> Also the coverage/profile cases you changed do not at all match
>> "... with large performance impact".  In fact the impact is completely
>> unknown (as would usually be the case).
>
> The impact of any transformation is just 'potential'; coverage problems
> are no different.
>
>>
>> I'd rather have a way to make dump-files more structured (so, following
>> some standard reporting scheme) than introducing yet another way
>> of output.  [after making dump-files more consistent it will be easy
>> to revisit patches like this, there would be a natural general central
>> way to implement it]
>
> Yes, I remember we have discussed this before -- currently dump
> files are a big mess -- debug tracing, IR are all mixed up, but as I
> said above, this is a different matter -- it is for compiler
> developers.
>
> For a more structured optimization report, we should use an option
> -fopt-report which dumps optimization information by category --
> the info database can also be shared across modules:
>
> Example:
>
> [Loop Interchange]
> File a, line x,   yyyyyyy
> File b, line xx, yyyyyyy
> ....
> File c, line z,   It is beneficial to interchange the loop, but not
> done because of possible carried dependency (caused by false aliasing
> ...)
>
> [Loop Vectorization]
> ....
>
> [Loop Unroll]
> ...
>
> [SRA]
>
> [Alias summary]
>  [Global Vars]
>   a: addr exposed
>   b: addr not exposed
>   ..
>  [Global Pointers]
>    ..
>  ...

I very well understand the intent.  But I disagree with where you start
to implement this.  Dump files are _not_ only for developers - after
all we don't have anything else.  -fopt-report can get as big and unmanageable
to read as dump files - in fact I argue it will be worse than dump files if
you go beyond very very coarse reporting.

Yes, dump files are a "mess".  So - why not clean them up, and at the
same time annotate dump file pieces so _automatic_ filtering and
redirecting to stdout with something like -fopt-report would do something
sensible?  I don't see why dump files have to stay messy while you at
the same time would need to add _new_ code to dump to stdout for
-fopt-report.
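
To make the idea concrete, here is a minimal sketch of what annotated dump
pieces plus automatic filtering could look like.  It is written in Python
purely for illustration and is not GCC code: the record fields (pass_name,
kind, level, location), the kind names and the select_for_report helper are
all hypothetical.  The point is only that each pass emits one tagged record,
and a generic filter decides what reaches stdout.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class DumpRecord:
    """One annotated piece of a dump file (all fields are hypothetical)."""
    pass_name: str  # pass that emitted the record, e.g. "vect"
    kind: str       # tag such as "optimized", "missed" or "trace"
    level: int      # verbosity: 1 = summary, 3 = developer detail
    location: str   # "file:line" of the source construct
    text: str       # human-readable message


def select_for_report(
    records: Iterable[DumpRecord],
    max_level: int,
    kinds: Tuple[str, ...] = ("optimized", "missed"),
) -> Iterator[str]:
    """Yield formatted lines for records that pass the level and kind filter."""
    for rec in records:
        if rec.level <= max_level and rec.kind in kinds:
            yield f"{rec.location}: [{rec.pass_name}] {rec.kind}: {rec.text}"


# Illustrative records; a real dump would contain far more of them.
RECORDS = [
    DumpRecord("vect", "optimized", 1, "a.c:10", "loop vectorized"),
    DumpRecord("vect", "trace", 3, "a.c:10", "analyzing data refs"),
    DumpRecord("sra", "missed", 2, "b.c:42",
               "aggregate not scalarized: passed to memset"),
]

if __name__ == "__main__":
    # Roughly what a level-1 report would show: summaries only.
    for line in select_for_report(RECORDS, max_level=1):
        print(line)
```

With records tagged like this, the full dump file and a short report on
stdout come from the same emit calls, so no second reporting code path is
needed in each pass.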

So, no, please do it the right way that benefits both compiler developers
and your "power users".

And yes, the right way is not to start adding that -fopt-report switch.
The right way is to make dump-files consumable by mere mortals first.

Thanks,
Richard.

>
> Thanks,
>
> David
>
>>
>> So, please fix dump-files instead.  And for coverage/profiling, fill
>> in stuff in a dump-file!
>>
>> Richard.
>>
>>> It would be interesting to have some warnings about missing SRA
>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>> large speedup.
>>>
>>> Right now a common case that prevents SRA on a structure field
>>> is simply a memset or memcpy.
>>>
>>> -Andi
>>>
>>>
>>> --
>>> a...@linux.intel.com -- Speaking for myself only
>>>
>>
>
