hfinkel added inline comments.
================ Comment at: lib/CodeGen/CGDecl.cpp:1291 +// where the pointer to the local variable is the key in the map. +void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission &emission) { + assert(emission.Variable && "emission was not valid!"); ---------------- rjmccall wrote: > This should really be a subroutine of EmitAutoVarAlloca; that will implicitly > pick this logic up for a bunch of less standard places in IRGen that create > local variables. Please make it a static function (if possible) called > EmitNoAliasScope and call it immediately after EmitVarAnnotations. Done (I didn't make it static, which is certainly possible, but didn't look much cleaner to me - let me know if you want it static anyway). ================ Comment at: lib/CodeGen/CGStmt.cpp:537 + llvm::LLVMContext::MD_noalias), + NewScopeList)); + ---------------- rjmccall wrote: > rjmccall wrote: > > hfinkel wrote: > > > rjmccall wrote: > > > > This is a very strange representation. Every memory operation in the > > > > lexical block is annotated with a list of all of the scopes that were > > > > entered within the block, even if they were entered after the > > > > operation. But for some reason, not with nested scopes? > > > > > > > > What's the right patch for me to read about this representation? > > > Perhaps unfortunately, this is an artifact of the way that restrict is > > > defined in C. It applies to all accesses in the block in which the > > > variable is declared, even those before the declaration of the > > > restrict-qualified local itself. > > > > > > It should work with nested scopes, in the sense that we add these things > > > as we complete each scope. So we add things to the inner scope, and then > > > when we complete the outer scope, we go back over the instructions > > > (including those in the inner scope because the scope recording recurses > > > up the scope hierarchy), and adds the outer scopes - it concatenates them > > > to any added by the inner (nested) scopes. > > > > > > The noalias intrinsic's LangRef updates are in D9375. > > Ah, I see the recursion now. Please add a comment explaining that > > expectation here. > I would still like this comment. It should be enough to explain how > MemoryInsts actually gets filled: there's a callback from inserting an > instruction which adds all memory instructions to every active lexical scope > that's recording them. Added. ================ Comment at: lib/CodeGen/CodeGenFunction.cpp:1986 + if (I->mayReadOrWriteMemory()) + recordMemoryInstruction(I); } ---------------- rjmccall wrote: > This condition will include calls, which I assume is unnecessary (except > maybe memory intrinsics?). More seriously, it will also include loads and > stores into allocas, which will sweep in all sorts of bookkeeping temporaries > that IRGen uses in more complex code-generation situations. You might want > to consider ways to avoid recording this kind of instruction that are very > likely to just get immediately optimized away by e.g. mem2reg. > > Please also modify LexicalScope so that it records whether there's any active > recording scope in the CodeGenFunction. Lexical scopes are nested, so that > should be as easy as saving the current state when you enter a scope and > restoring it when you leave. That will allow you to optimize this code by > completely bypassing the recursion and even the check for whether the > instruction is a memory instruction in the extremely common case of a > function with no restrict variables. > > In general, a lot of things about this approach seem to have worryingly poor > asymptotic performance in the face of large functions or (especially) many > restrict variables. (You could, for example, have chosen to have each memory > instruction link to its enclosing noalias scope, which would contain a link > to its enclosing scope and a list of the variables it directly declares. > This would be a more complex representation to consume, but it would not > require inherently quadratic frontend work building all these lists.) The > only saving grace is that we expect very little code to using restrict, so it > becomes vital to ensure that your code is immediately short-circuited when > nothing is using restrict. > This condition will include calls, which I assume is unnecessary (except > maybe memory intrinsics?). No, getting calls here was intentional. The AA implementation can specifically deal with calls (especially those that are known only to access memory given by their arguments). > More seriously, it will also include loads and stores into allocas, which > will sweep in all sorts of bookkeeping temporaries that IRGen uses in more > complex code-generation situations. You might want to consider ways to avoid > recording this kind of instruction that are very likely to just get > immediately optimized away by e.g. mem2reg. I agree, but this might be difficult to do in general. We should be able to avoid annotating accesses to allocas that don't escape. Do we have a sound way to determine at this stage whether a local variable has had its address taken? > Please also modify LexicalScope so that it records whether there's any active > recording scope in the CodeGenFunction. Lexical scopes are nested, so that > should be as easy as saving the current state when you enter a scope and > restoring it when you leave. That will allow you to optimize this code by > completely bypassing the recursion and even the check for whether the > instruction is a memory instruction in the extremely common case of a > function with no restrict variables. I added a variable to the scope to cache whether or not anything is being recorded. This avoids the quadratic recording-check behavior (unless some scope is actually recording something). > In general, a lot of things about this approach seem to have worryingly poor > asymptotic performance in the face of large functions or (especially) many > restrict variables. (You could, for example, have chosen to have each memory > instruction link to its enclosing noalias scope, which would contain a link > to its enclosing scope and a list of the variables it directly declares. This > would be a more complex representation to consume, but it would not require > inherently quadratic frontend work building all these lists.) The only saving > grace is that we expect very little code to using restrict, so it becomes > vital to ensure that your code is immediately short-circuited when nothing is > using restrict. We're definitely on the same page here. The reasons I didn't worry about this too much is that, as you note, we expect this to be rare, and moreover, the underlying metadata representation being created is itself quadratic (#restricts x #accesses). The good news is that this new representation is actually less verbose than the existing noalias metadata. The bad news is that, if this turns out to be a problem, and it could, then we'll need to either add cutoffs or design some different representation (or both). ================ Comment at: lib/CodeGen/CodeGenFunction.h:597 + // those blocks) so that we can later add the appropriate metadata. Record + // this instruction and so the same in any parent scopes. + void recordMemoryInstruction(llvm::Instruction *I) { ---------------- rjmccall wrote: > I would suggest this wording: > > /// Record the given memory access instruction in this and all of its > enclosing scopes. > /// When we close this scope, we'll add the list of all the > restrict-qualified local variables > /// it declared to all the memory accesses which occurred within it. > Recording is only > /// enabled under optimization, and it's disabled in a scope unless it > actually declares > /// any local restrict variables. Sounds good (updated). https://reviews.llvm.org/D9403 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits