hfinkel added inline comments.

================
Comment at: lib/CodeGen/CGDecl.cpp:1291
+// where the pointer to the local variable is the key in the map.
+void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission &emission) {
+  assert(emission.Variable && "emission was not valid!");
----------------
rjmccall wrote:
> This should really be a subroutine of EmitAutoVarAlloca; that will implicitly 
> pick this logic up for a bunch of less standard places in IRGen that create 
> local variables.  Please make it a static function (if possible) called 
> EmitNoAliasScope and call it immediately after EmitVarAnnotations.
Done (I didn't make it static, which is certainly possible, but didn't look 
much cleaner to me - let me know if you want it static anyway).


================
Comment at: lib/CodeGen/CGStmt.cpp:537
+                                      llvm::LLVMContext::MD_noalias),
+                                      NewScopeList));
+
----------------
rjmccall wrote:
> rjmccall wrote:
> > hfinkel wrote:
> > > rjmccall wrote:
> > > > This is a very strange representation.  Every memory operation in the 
> > > > lexical block is annotated with a list of all of the scopes that were 
> > > > entered within the block, even if they were entered after the 
> > > > operation.  But for some reason, not with nested scopes?
> > > > 
> > > > What's the right patch for me to read about this representation?
> > > Perhaps unfortunately, this is an artifact of the way that restrict is 
> > > defined in C. It applies to all accesses in the block in which the 
> > > variable is declared, even those before the declaration of the 
> > > restrict-qualified local itself.
> > > 
> > > It should work with nested scopes, in the sense that we add these things 
> > > as we complete each scope. So we add things to the inner scope, and then 
> > > when we complete the outer scope, we go back over the instructions 
> > > (including those in the inner scope because the scope recording recurses 
> > > up the scope hierarchy), and adds the outer scopes - it concatenates them 
> > > to any added by the inner (nested) scopes.
> > > 
> > > The noalias intrinsic's LangRef updates are in D9375.
> > Ah, I see the recursion now.  Please add a comment explaining that 
> > expectation here.
> I would still like this comment.  It should be enough to explain how 
> MemoryInsts actually gets filled: there's a callback from inserting an 
> instruction which adds all memory instructions to every active lexical scope 
> that's recording them.
Added.


================
Comment at: lib/CodeGen/CodeGenFunction.cpp:1986
+  if (I->mayReadOrWriteMemory())
+    recordMemoryInstruction(I);
 }
----------------
rjmccall wrote:
> This condition will include calls, which I assume is unnecessary (except 
> maybe memory intrinsics?).  More seriously, it will also include loads and 
> stores into allocas, which will sweep in all sorts of bookkeeping temporaries 
> that IRGen uses in more complex code-generation situations.  You might want 
> to consider ways to avoid recording this kind of instruction that are very 
> likely to just get immediately optimized away by e.g. mem2reg.
> 
> Please also modify LexicalScope so that it records whether there's any active 
> recording scope in the CodeGenFunction.  Lexical scopes are nested, so that 
> should be as easy as saving the current state when you enter a scope and 
> restoring it when you leave.  That will allow you to optimize this code by 
> completely bypassing the recursion and even the check for whether the 
> instruction is a memory instruction in the extremely common case of a 
> function with no restrict variables.
> 
> In general, a lot of things about this approach seem to have worryingly poor 
> asymptotic performance in the face of large functions or (especially) many 
> restrict variables.  (You could, for example, have chosen to have each memory 
> instruction link to its enclosing noalias scope, which would contain a link 
> to its enclosing scope and a list of the variables it directly declares.  
> This would be a more complex representation to consume, but it would not 
> require inherently quadratic frontend work building all these lists.)  The 
> only saving grace is that we expect very little code to using restrict, so it 
> becomes vital to ensure that your code is immediately short-circuited when 
> nothing is using restrict.
> This condition will include calls, which I assume is unnecessary (except 
> maybe memory intrinsics?). 

No, getting calls here was intentional. The AA implementation can specifically 
deal with calls (especially those that are known only to access memory given by 
their arguments).

> More seriously, it will also include loads and stores into allocas, which 
> will sweep in all sorts of bookkeeping temporaries that IRGen uses in more 
> complex code-generation situations. You might want to consider ways to avoid 
> recording this kind of instruction that are very likely to just get 
> immediately optimized away by e.g. mem2reg.

I agree, but this might be difficult to do in general. We should be able to 
avoid annotating accesses to allocas that don't escape. Do we have a sound way 
to determine at this stage whether a local variable has had its address taken?

> Please also modify LexicalScope so that it records whether there's any active 
> recording scope in the CodeGenFunction. Lexical scopes are nested, so that 
> should be as easy as saving the current state when you enter a scope and 
> restoring it when you leave. That will allow you to optimize this code by 
> completely bypassing the recursion and even the check for whether the 
> instruction is a memory instruction in the extremely common case of a 
> function with no restrict variables.

I added a variable to the scope to cache whether or not anything is being 
recorded. This avoids the quadratic recording-check behavior (unless some scope 
is actually recording something).

> In general, a lot of things about this approach seem to have worryingly poor 
> asymptotic performance in the face of large functions or (especially) many 
> restrict variables. (You could, for example, have chosen to have each memory 
> instruction link to its enclosing noalias scope, which would contain a link 
> to its enclosing scope and a list of the variables it directly declares. This 
> would be a more complex representation to consume, but it would not require 
> inherently quadratic frontend work building all these lists.) The only saving 
> grace is that we expect very little code to using restrict, so it becomes 
> vital to ensure that your code is immediately short-circuited when nothing is 
> using restrict.

We're definitely on the same page here. The reasons I didn't worry about this 
too much is that, as you note, we expect this to be rare, and moreover, the 
underlying metadata representation being created is itself quadratic 
(#restricts x #accesses). The good news is that this new representation is 
actually less verbose than the existing noalias metadata. The bad news is that, 
if this turns out to be a problem, and it could, then we'll need to either add 
cutoffs or design some different representation (or both).



================
Comment at: lib/CodeGen/CodeGenFunction.h:597
+    // those blocks) so that we can later add the appropriate metadata. Record
+    // this instruction and so the same in any parent scopes.
+    void recordMemoryInstruction(llvm::Instruction *I) {
----------------
rjmccall wrote:
> I would suggest this wording:
> 
>   /// Record the given memory access instruction in this and all of its 
> enclosing scopes.
>   /// When we close this scope, we'll add the list of all the 
> restrict-qualified local variables
>   /// it declared to all the memory accesses which occurred within it.  
> Recording is only
>   /// enabled under optimization, and it's disabled in a scope unless it 
> actually declares
>   /// any local restrict variables.
Sounds good (updated).


https://reviews.llvm.org/D9403



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
  • [PATCH] D9403: llvm.noalias - C... Hal Finkel via Phabricator via cfe-commits

Reply via email to