>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>> (Dave Airlie?) provides the same functionality as your array, I
>> suggest we use u_vector instead.
> Let me repeat what you just wrote, because it is unbelievable: You are
> advising the use of non-templated collection types in C++ code.

Are you able to find any templates anywhere in the GLSL compiler?  I
don't think his statement was ambiguous.

>> If you can't use u_vector, you should
>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>> Kenneth Graunke) to use C++ templates.
> - You are talking about coding rules some Mesa developers agreed upon
> and didn't bother writing down for other developers to read

It was mostly written down, but it's not documented in the code base.
It seems impossible to even get current, de facto practices documented.
It's one of the few things in Mesa that really does get bike shedded.

Before the current GLSL compiler, there was no C++ in Mesa at all.
While developing the compiler, I found that I was re-implementing
numerous C++ features by hand in C.  It felt pretty insane.  Why am I
filling out all of these virtual function tables by hand?
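
To make that concrete, here is a minimal sketch of the pattern.  All of
the names (c_node, expr_node, and so on) are made up for illustration
and are not taken from Mesa.  The first half fills out the function
table by hand, the way it has to be done in C; the second half lets the
C++ compiler generate the same table.

```cpp
#include <cassert>

// Hand-rolled dispatch: every "class" needs its table of function
// pointers written out and wired up manually.
struct c_node;

struct c_node_vtable {
    int (*num_children)(const c_node *self);
};

struct c_node {
    const c_node_vtable *vtbl;   // must be set correctly by every "constructor"
    int operand_count;
};

static int c_expr_num_children(const c_node *self) { return self->operand_count; }
static int c_const_num_children(const c_node *) { return 0; }

static const c_node_vtable c_expr_vtbl = { c_expr_num_children };
static const c_node_vtable c_const_vtbl = { c_const_num_children };

int count_children_c(const c_node *n)
{
    return n->vtbl->num_children(n);
}

// The same dispatch with the compiler building the table.
class node {
public:
    virtual ~node() {}
    virtual int num_children() const = 0;
};

class expr_node : public node {
public:
    explicit expr_node(int operands) : operand_count(operands) {}
    virtual int num_children() const { return operand_count; }
private:
    int operand_count;
};

class const_node : public node {
public:
    virtual int num_children() const { return 0; }
};

int count_children_cxx(const node &n)
{
    return n.num_children();
}
```

Both halves do the same work at run time.  The difference is only in
who has to write, and keep correct, the table.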

At the same time, I also observed that almost 100% of shipping,
production-quality compilers were implemented using C++.  The single
exception was GCC.  The need for GCC to bootstrap on minimal, sometimes
dire, C compilers was the one thing keeping C++ out of the GCC code
base.  It wasn't even that long ago that core parts of GCC had to
support pre-C89 compilers.  As far as I am aware, they have since
started using C++ too.  Who am I to be so bold as to declare that
everyone shipping a C compiler is wrong?

In light of that, I opened a discussion about using C++ in the compiler.

Especially at that time (2008-ish), nobody working on Mesa was
particularly skilled at C++.  I had used it some, and, in the mid-90's,
had some really, really bad experiences with the implementations and
side-effects of various language features.  I still have nightmares
about trying to use templates in GCC 2.4.2.  There are quite a few C++
features that are really easy to misuse.  There are also a lot of
subtleties in the language that very few people really understand.

I don't mean this in a pejorative way, but there was and continues to be
a lot of FUD around C++.  I think a lot of this comes from the "Old
Woman Who Swallowed a Fly" nature of solving C++ development problems.
You have a problem.  The only way to solve that problem is to use
another language feature that you may or may not understand how to use
safely.  You use that feature to solve your problem.  Use of that
feature presents a new problem.  The only way to solve the new problem
is to use yet another language feature that you may or may not
understand how to use safely.  Pretty soon nobody knows how anything in
the code works.

After quite a bit of discussion on the mesa-dev list, on #dri-devel, and
face-to-face at XDC, we decided to use C++ with some restrictions.  The
main restriction was that C++ would be limited to the GLSL compiler
stack.  The other restrictions were roughly similar to the embedded C++
subset:

    - No exceptions.

    - No RTTI.

    - No multiple inheritance.

    - No operator overloading.  It could be argued that our use of
      placement new deviates from this.  In the previous metaphor, I
      think this was either the spider or the bird.

    - No templates.

There are other restrictions (e.g., no STL) that come as natural
consequences of these.
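
For anyone who has not run into the placement new pattern mentioned in
the operator overloading item, it looks roughly like the sketch below.
The arena type and arena_alloc() are hypothetical stand-ins for a
memory context, not Mesa's actual allocator, and the sketch assumes a
C++11 compiler for alignas / alignof.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical bump allocator standing in for a real memory context.
struct arena {
    alignas(std::max_align_t) unsigned char storage[256];
    std::size_t used;
    unsigned allocations;
};

void *arena_alloc(arena *a, std::size_t size)
{
    // Round the size up so the next allocation stays suitably aligned.
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t rounded = (size + align - 1) / align * align;

    if (rounded > sizeof(a->storage) - a->used)
        return NULL;

    void *p = a->storage + a->used;
    a->used += rounded;
    a->allocations++;
    return p;
}

class ir_node {
public:
    // Class-specific placement form, used as "new(ctx) ir_node(...)".
    // This is the operator overload that bends the rule.
    static void *operator new(std::size_t size, arena *ctx)
    {
        void *p = arena_alloc(ctx, size);
        if (p == NULL)
            std::abort();   // no exceptions, so there is nothing to throw
        return p;
    }

    // Matching placement delete; the compiler only calls it if a
    // constructor throws, which cannot happen with exceptions banned.
    static void operator delete(void *, arena *) {}

    // The arena owns the memory, so ordinary delete frees nothing.
    static void operator delete(void *) {}

    explicit ir_node(int v) : value(v) {}

    int value;
};

ir_node *make_node(arena *ctx, int v)
{
    return new(ctx) ir_node(v);
}
```

The point of the pattern is that every object is tied to a context at
the allocation site, so whole trees can be released by dropping the
context instead of walking them.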

Our goal was that any existing Mesa developer should be able to read any
piece of new C++ code and know what it was doing.

I feel like, due to our collective ignorance about the language, we may
have been slightly too restrictive.  It seems like we could have used
templates in some very, very restricted ways to enable things like
iterators that would have saved typing, encouraged refactoring, and made
the code more understandable.  Instead we have a proliferation of
foreach macros (or callbacks), and every data structure is a linked
list.  It's difficult to say whether it would have made things strictly
better or led us to swallow a bird, a cat, a dog...
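
A rough sketch of the trade-off, using a made-up intrusive list
(list_node, instruction, foreach_in_list, and list_range are all
hypothetical names, not Mesa's).  The template version assumes C++11
range-based for, which was not available when the rules were set, and
note that it also needs operator overloading to work at all.

```cpp
#include <cstddef>

// Hypothetical intrusive singly linked list.
struct list_node {
    list_node *next;
};

struct instruction {
    list_node link;   // must be the first member for the casts below
    int opcode;
};

// The macro approach: the element type is spelled out at every use.
#define foreach_in_list(type, var, head)                      \
    for (type *var = (type *)(head); var != NULL;             \
         var = (type *)var->link.next)

// A very restricted template: one small range type, nothing else.
template <typename T>
class list_range {
public:
    class iterator {
    public:
        explicit iterator(list_node *n) : node(n) {}
        T &operator*() const { return *reinterpret_cast<T *>(node); }
        iterator &operator++() { node = node->next; return *this; }
        bool operator!=(const iterator &o) const { return node != o.node; }
    private:
        list_node *node;
    };

    explicit list_range(list_node *h) : head(h) {}
    iterator begin() const { return iterator(head); }
    iterator end() const { return iterator(NULL); }

private:
    list_node *head;
};

int sum_with_macro(list_node *head)
{
    int sum = 0;
    foreach_in_list(instruction, inst, head)
        sum += inst->opcode;
    return sum;
}

int sum_with_template(list_node *head)
{
    int sum = 0;
    for (instruction &inst : list_range<instruction>(head))
        sum += inst.opcode;
    return sum;
}
```

The loop bodies end up about the same length.  What changes is that
the template needs one language feature we banned (templates) plus one
we only bent (operator overloading), which is exactly the fly-and-spider
problem.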

I also feel like that ship has sailed.  When NIR was implemented using
pure C, going so far as to re-invent constructors using macros, the
chances of using more C++ faded substantially.  If, and that's a really,
really big if, additional C++ were to be used, it would have to be
preceded by patches to docs/devinfo.html that documented:

    - What features were to be used.

    - Why use of those features benefits the code base.  Specifically,
      why use of the new feature is substantially better than a
      different implementation that does not use the feature.

    - Any restrictions on the use of those features.

Such a discussion may produce additional alternatives.

> - I am not willing to use u_vector in C++ code

Here's the thing... Mesa is a big code base.  Maintenance is a big deal.
Fixing bugs, refactoring code, and tuning performance account for most
of the time people spend working on Mesa.  If a tool exists that fits a
need, it should be used.  Having multiple implementations of similar,
basic functionality is a hassle for everyone involved.  A lot of work
has been done over the last couple of years to move things up into
src/util.  Duplicate implementations of hash tables, sets, math
functions, and other things have all been reduced.  This is a good
trend, and it should continue.

I've never seen any of the u_vector code or interfaces, but here is what
I know.  If you re-invent u_vector now with a single user, it just means
that someone will come along and refactor your code to use it later.  Is
fast_list really substantially better than the thing that already has
users?  If it is, why can those ideas not be applied to u_vector to make
it better for the existing users?

>> I'll repeat some stuff about profiling here but also explain my perspective.
> So far (which may be a year or so), there is no indication that you
> are better at optimizing code than me.
>> Never profile with -O0 or disabled function inlining.
> Seriously?

If you don't let the compiler do its job, you can really only measure
the O() of your algorithm.  The data about malloc and free calls is
useful.  You can't really draw many conclusions about the real
performance with -O0.  It's pretty fundamental to the process.

>> Mesa uses -g -O2
>> with --enable-debug, so that's what you should use too. Don't use any
>> other -O* variants.

At least at one time Fedora built 64-bit packages with

    -O2 -g -pipe -fstack-protector-strong --param=ssp-buffer-size=4 \
    -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -grecord-gcc-switches \
    -m64 -mtune=generic

and 32-bit packages with

    -O2 -g -pipe -fstack-protector-strong --param=ssp-buffer-size=4 \
    -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -grecord-gcc-switches \
    -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables
Those are what I use for performance testing.

> What if I find a case where -O2 prevents me from easily seeing
> information necessary to optimize the source code?

Then you have rediscovered the Heisenberg uncertainty principle.  It's
one of the things that makes real performance work really hard.  By and
large, the difficult performance problems require you to infer things
from various bits of collected data rather than directly observing them.

>> The only profiling tools reporting correct results are perf and
>> sysprof.
> I used perf on Metro 2033 Redux and saw do_dead_code() there. Then I
> used callgrind to see some more code.
>> (both use the same mechanism) If you don't enable dwarf in
>> perf (also sysprof can't use dwarf), you have to build Mesa with
>> -fno-omit-frame-pointer to see call trees. The only reason you would
>> want to enable dwarf-based call trees is when you want to see libc
>> calls. Otherwise, they won't be displayed or counted as part of call
>> trees. For Mesa developers who do profiling often,
>> -fno-omit-frame-pointer should be your default.
>> Callgrind counts calls (that one you can trust), but the reported time
>> is incorrect,
> Are you nuts? You cannot be seriously be assuming that I didn't know about 
> that.
>> because it uses its own virtual model of a CPU. Avoid it
>> if you want to measure time spent in functions.
> I will *NOT* avoid callgrind because I know how to use it to optimize code.
>> Marek
> As usual, I would like to notify reviewers&mergers of this path that I
> am not willing to wait months to learn whether the code will be merged
> or rejected.
> If it isn't merged by Thursday (2016-oct-20) I will mark it as
> rejected (rejected based on personal rather than scientific grounds).
> Jan
