Gmail is as Gmail does...

On Mon, Apr 11, 2011 at 1:21 AM, Jonathan S. Shapiro <[email protected]>
 wrote:

> On Sun, Apr 10, 2011 at 8:36 PM, Ben Karel <[email protected]> wrote:
>
>  In any case, it's an example of the kind of thing where an optimization
>>> pass written by someone who is thinking in terms of non-managed source
>>> languages will be written in such a way that you end up re-writing the pass
>>> when you have to accommodate the new case. We are then presented with the
>>> difficulty of designing test cases that will let us determine which passes
>>> are currently "relocating GC safe" and which passes are not.
>>>
>>
>> I don't fully understand the example, so I don't know what LLVM winds up
>> doing. But I don't see anything an optimization pass must do for GC
>> correctness beyond preserving the same semantics it would have to preserve
>> in the absence of GC primitives.
>>
>
> Then indeed you don't understand the example. I think it's worth the time
> to get this point clear, so let me work an example. First, let's start with
> the source code. Suppose we have:
>
>   struct S { ... }; // something size > 16 bytes
>
>   ...
>
>   void f(S *s_vec, int vecSize) {
>     for (int i = 0; i < vecSize; i++)
>       s_vec[i].i = 5;
>   }
>
> Note first that because sizeof(S) > 16, we cannot use scaled-index
> addressing modes on hardware that provides them, and we don't want to
> multiply /i/ by /sizeof(S)/ in each iteration (because multiply is
> expensive). In consequence, the *standard* strength-reduction optimization
> will convert this into something that looks like:
>
>   void f(S *s_vec, int vecSize) {
>     int i = 0;
>     while (i < vecSize) {
>       s_vec->i = 5;
>       s_vec++;  // that is: increment by sizeof(S)
>       i++;
>     }
>   }
>
> It will also unroll the loop, but that's not important to explain the issue
> at hand.
>
> Note that when i >= 1, s_vec no longer points to the base of the vector,
> which is the base of the object that we need to mark for GC.
>
> First problem: a typical GC runtime relies on using a negative offset from
> the object base pointer in order to find the object header for mark
> purposes, but there is no live object base pointer when i >= 1.
>

This is, I think, the root of your misunderstanding. There is a live object
base pointer if and only if the front-end declares that it needs it by
passing the SSA variable to llvm.gcwrite. See below...


> And it is altogether possible that the argument passed from above the call
> to f() is dead on return from the call, and therefore does not appear in any
> GC register map above this one. The simple solution in this case is to save
> a base pointer and use that for mark purposes, but a conventional optimizer
> will see no need to do this.
>
> Second problem: even if we save a base pointer for mark purposes, we still
> need to ensure that s_vec gets updated by the proper delta whenever the
> vector is relocated by GC. In order to do this, we need to know that s_vec
> is an interior reference into the object named by s_vec_save.
>
> Either that, or we need to do interior pointer lookup, which is (a)
> *extremely* expensive, and (b) based on heuristic assumptions about things
> the optimizer will not do in the course of loop unrolling - when loop
> unrolling is applied aggressively, it's very possible that s_vec will end
> up pointing substantially *past* the body of the vector.
>
>
> Is that expansion enough to see the problem? Also, do you see why an
> unrolling algorithm written for an unmanaged language has no need to record
> this information?
>

So the example you're showing is what I originally understood it to be, but
I fail to see why you think it's a problem. LLVM IR is SSA-based: regardless
of whether the first or second implementation is used, the original value of
the s_vec parameter is available by definition, and the front end is
responsible for passing it to llvm.gcwrite if the language's GC runtime
needs it. So I don't see why LLVM-to-LLVM IR optimizations would be in any
special danger of violating the semantics the GC relies on. And if machine
lowerings of the optimized LLVM IR do not respect the IR's semantics, that
is a miscompilation that has nothing to do with GC support.

As for the second problem: until LLVM gets liveness for stack roots, that's
handled by the front-end emitting code to reload GC roots after potential GC
points. Again, this is a question of respecting IR semantics: SSA registers
are never mutated; it's the stack slot that is mutated by the relocating GC.

The first and second variants generate the exact same X86 code, despite
having different LLVM IR representations (when compiled with clang++, at
least; llvm-gcc emits the same IR for both).

For illustration, try pasting the first snippet into http://llvm.org/demo/,
with and without optimization. Your example corresponds to this optimized
LLVM IR:

%struct.S = type { i32, i32, i32, i32, i32, i32, i32 }

define void @_Z1fP1Si(%struct.S* %s_vec, i32 %vecSize) {
entry:
  %0 = icmp sgt i32 %vecSize, 0
  br i1 %0, label %bb.nph, label %return

bb.nph:
  %tmp = zext i32 %vecSize to i64
  br label %bb

bb:
  %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %bb ]
  %scevgep = getelementptr %struct.S* %s_vec, i64 %indvar, i32 5
  store i32 5, i32* %scevgep, align 4
  %indvar.next = add i64 %indvar, 1
  %exitcond = icmp eq i64 %indvar.next, %tmp
  br i1 %exitcond, label %return, label %bb

return:
  ret void
}


Note that the s_vec pointer is not destructively updated, and is available
to pass to gcwrite (in place of store) if the GCed language front-end deems
it necessary.

> I'm not clear whether the IR needs to somehow distinguish between "anchor"
> pointers and "derived" pointers or not - I haven't thought about it enough,
> and there may be an easier way. But looking at how C# handles "foreign"
> objects, there are clearly two low-level pointer types that need to be in
> play at the IR level. Adding a second type isn't that hard, but it probably
> touches a lot of code in LLVM.
>
>  It's also worth noting that there are neither required nor default
>>>> optimization passes, at either the LLVM or MC levels. It's all opt-in...
>>>>
>>>
>>> I agree, but ultimately that is a cop-out. The point of going to LLVM is
>>> that we get to take advantage of existing infrastructure. If we are
>>> forced to turn the existing infrastructure off, then what's the point of
>>> using LLVM? I do understand that it isn't that black and white, but you
>>> see my point.
>>>
>>
>> More importantly, giving clients complete control over optimization passes
>> helps diagnose whatever miscompilations might occur.
>>
>
> Oh absolutely! I definitely agree that being in control of the passes is a
> good thing. But what you seemed to be suggesting was that we could, in
> effect, turn off the optimizer in order to avoid GC hazards in production.
> Which we could, but if we are forced to do that, the value of LLVM starts to
> decay quickly...
>
>  Having said that, Chris is a very smart guy, so maybe he will find a way.
>>>
>>
>> Out of curiosity, would you have predicted that they could build a full
>> C++ compiler, from scratch, in less than 4 years? I certainly wouldn't
>> have...
>>
>
> I would definitely have predicted that. Both of my C++ compiler groups did
> that, but I grant that C++ was a simpler language at the time.
>

Several hundred pages simpler, I should wager! ;-)



>  But the permanent vs. transitional distinction is quite real.
>>>
>>
>> There's also the difference that you're building a specific compiler,
>> while they're building a library for compiler writers. By building in C++
>> with C bindings, they maximize their potential client base.
>>
>
> Not quite. They are building an infrastructure for compiler writers who
> choose to implement their compilers in C/C++. It does not follow that this
> maximizes the client base. It may or may not. What it *does* do is maximize
> the immediate utility to Apple, which is important.
>

The only functionality that must be implemented in C++ to use LLVM consists
of GC plugins and analysis or optimization passes. The heavy lifting of
generating LLVM IR can be done via serialization or via the C bindings.

The decision to use C++ was made well before Apple got involved with LLVM.
And in any case, the reasons that C++ maximizes utility to Apple are the
exact same reasons it maximizes utility to the general population! It's
quite possible that LLVM would be less buggy if it were implemented in
Haskell instead of C++, but I think it goes without saying that that
decision would have crippled the project's adoption.
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev