On Thu, Oct 17, 2013 at 6:49 PM, Ben Karel <[email protected]> wrote:

> Last time I checked, LLVM provided *zero* assist for this.
>
> According to my 40 second Internet Archive query, LLVM has had documented
> stack map support since 2007. You knew this once upon a time ;)
> https://groups.google.com/d/msg/llvm-dev/M4HOyteR4J4/VNOQX0Q7YysJ
>

OK. So not quite zero. But if the story is that I can't registerize a lot
of stuff, that's not a story. And the LLVM/alloca trick *is* a variant on
the Henderson method.


>
> My objection isn't to the idea of registers -- I'm quite fond of $ebp
> myself -- but the semantic confusion that comes from the different notion
> of the word "register" in the context of a concrete ISA versus LLVM's IR.
>

In the abstract, we should view each pass of the compiler as emitting
register names drawn from a new namespace. In practice we cheat in a real
compiler design, but that's the theory view.

I've been blurring the terms intentionally, though. When all's done, you
get to the back end and *some* of the IR registers get mapped to hard
registers while others get mapped to memory locations. The point is that
you need to track the "class" of each of these registers across each such
mapping.

But your point is well taken, so maybe I should be more careful. The bottom
line is that when it's all done I need to know, for every location, whether
that location does or does not hold a reference. Hardware registers, for
this purpose, are locations.
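
A minimal sketch of that bookkeeping, under loud assumptions: the names
(Kind, Location, lower, stack_map) and the register assignment are
hypothetical, not LLVM's API, and the model covers a single safepoint only,
ignoring the fact that a real allocator reuses locations across live ranges.
It shows the reference/non-reference class of each IR register being carried
onto whatever location the back end picked, so the set of locations holding
references can be read off at the end.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    REF = "ref"        # holds a GC-managed object reference
    NONREF = "nonref"  # integer, float, raw address, etc.


@dataclass(frozen=True)
class Location:
    space: str  # "hw" for a hardware register, "stack" for a spill slot
    name: str   # e.g. "rbx" or "sp+16" (illustrative names only)


def lower(ir_kinds, assignment):
    """Carry each IR register's kind over to the location it was assigned."""
    location_kinds = {}
    for ir_reg, loc in assignment.items():
        kind = ir_kinds[ir_reg]
        prior = location_kinds.setdefault(loc, kind)
        if prior is not kind:
            # Two live values of different classes in one location would make
            # the map ambiguous at this safepoint.
            raise ValueError(
                f"{loc} would hold both {prior.value} and {kind.value}")
    return location_kinds


def stack_map(location_kinds):
    """List every location (hardware register or slot) holding a reference."""
    return sorted((loc.space, loc.name)
                  for loc, kind in location_kinds.items()
                  if kind is Kind.REF)


# Hypothetical IR registers and the locations a back end assigned to them.
ir_kinds = {"%obj": Kind.REF, "%len": Kind.NONREF, "%tmp": Kind.REF}
assignment = {
    "%obj": Location("hw", "rbx"),
    "%len": Location("hw", "rcx"),
    "%tmp": Location("stack", "sp+16"),
}
roots = stack_map(lower(ir_kinds, assignment))
```

Here roots names both a hardware register and a stack slot, which is the
sense in which hardware registers are just locations for this purpose.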



> So that's why I suggested that saying "register" alone, without being
> clearer about ISA register vs SSA binding, can muddy the waters and lead to
> confusion.
>

For most conversations I agree with you. But in this case what we are
really talking about is the type of a value denoted by either an IR
register or a hardware register. What I'm really arguing for here is that
we want a very low-level type system associated with both kinds of
registers.


> Do you have a specific example of LLVM IR which is sub-optimally compiled
> due to the lack of what you have proposed? Attempting to find such an
> example might be a valuable exercise for the skeptical reader. ;-)
>

No. But I have a fairly long list of (a) optimizations that cannot be done
correctly without this low-level type information, and (b) conventional
optimizations that are likely to be done *incorrectly* without this
low-level type information. Start with the list in the paper by Boehm and
Chase about interactions between optimization and GC:

  A Proposal for Garbage-Collector-Safe C Compilation

In modern compilers, some of the things they worry about have been dealt
with (provided the front end is willing to do enough work) and some have
not.

Let me give just one naive example: there are expression eliminations you
can legally do during loop unrolling with C pointers that you really *don't*
want to do with object references. If you don't know which is which, then you
don't have enough information to apply the optimizations (or not) under the
proper conditions.
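
To make the hazard concrete, here is a toy model of one commonly cited
rewrite of this general kind: folding a constant index offset into the base
address. It is an illustration chosen for this sketch, not necessarily the
transformation meant above. The heap layout, the addresses, and scan_roots
are all hypothetical, and the collector is deliberately generous in that it
accepts interior pointers.

```python
def scan_roots(roots, heap):
    """Return the ids of objects that at least one root address points into.

    An address counts if it lies anywhere inside the object, so interior
    pointers are honoured in this toy collector.
    """
    live = set()
    for addr in roots:
        for obj_id, (base, size) in heap.items():
            if base <= addr < base + size:
                live.add(obj_id)
    return live


# Hypothetical heap: one array of 100 four-byte elements at address 5000.
heap = {"array": (5000, 400)}

# Source form: the loop reads a[i - 1000] for i in 1000..1099, and the only
# root is `a` itself.
a = 5000
before = scan_roots([a], heap)

# After folding the constant into the base: biased = a - 1000 * 4, and the
# loop reads biased[i]. If `a` is dead after the rewrite, `biased` is the
# only root left, and it points outside the array.
biased = a - 1000 * 4
after = scan_roots([biased], heap)
```

With raw addresses and explicit storage management the rewritten loop
touches the same memory and nothing goes wrong. With a collector, before
contains the array and after is empty, so the array could be reclaimed or
moved while the loop is still using it. Telling the two cases apart requires
knowing that `a` is a reference.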

Something else to say here: LLVM could be a lot better than any other safe
language compiler and still not be a good safe language compiler. These
languages haven't been a focus of optimizer research at the same scale as
languages with explicit storage management. There's *lots* of room for
improvement.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
