bader added inline comments.

================
Comment at: clang/lib/Sema/Sema.cpp:1514
+  void visitUsedDecl(SourceLocation Loc, Decl *D) {
+    if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) {
+      for (auto *DD : TD->decls()) {
----------------
yaxunl wrote:
> rjmccall wrote:
> > yaxunl wrote:
> > > rjmccall wrote:
> > > > erichkeane wrote:
> > > > > rjmccall wrote:
> > > > > > erichkeane wrote:
> > > > > > > Note that when recommitting this (if you choose to), this needs 
> > > > > > > to also handle NamespaceDecl.  We're a downstream and discovered 
> > > > > > > that this doesn't properly handle functions or records handled in 
> > > > > > > a namespace.
> > > > > > > 
> > > > > > > It can be implemented identically to TranslationUnitDecl.
> > > > > > Wait, what?  We shouldn't be doing this for TranslationUnitDecl 
> > > > > > either.   I don't even know how we're "using" a 
> > > > > > TranslationUnitDecl, but neither this case not the case for 
> > > > > > `NamespaceDecl` should be recursively using every declaration 
> > > > > > declared inside it.  If there's a declaration in a namespace that's 
> > > > > > being used, it should be getting visited as part of the actual use 
> > > > > > of it.
> > > > > > 
> > > > > > The logic for `RecordDecl` has the same problem.  
> > > > > Despite the name, this seems to be more of a home-written ast walking 
> > > > > class.  The entry point is the 'translation unit' which seems to walk 
> > > > > through everything in an attempt to find all the functions (including 
> > > > > those that are 'marked' as used by an attribute).
> > > > > 
> > > > > You'll see the FunctionDecl section makes this assumption as well 
> > > > > (not necessarily that we got to a function via a call). IMO, this 
> > > > > approach is strange, and we should register entry points in some 
> > > > > manner (functions marked as emitted to the device in some fashion), 
> > > > > then just follow its call-graph (via the clang::CallGraph?) to emit 
> > > > > all of these functions.
> > > > > 
> > > > > It seemed really odd to see this approach here, but it seemed well 
> > > > > reviewed by the time I noticed it (via a downstream bug) so I figured 
> > > > > I'd lost my chance to disagree with the approach.
> > > > > 
> > > > > 
> > > > Sure, but `visitUsedDecl` isn't the right place to be entering the 
> > > > walk.  `visitUsedDecl` is supposed to be the *callback* from the walk.  
> > > > If they need to walk all the global declarations to find kernels 
> > > > instead of tracking the kernels as they're encountered (which would be 
> > > > a *much* better approach), it should be done as a separate function.
> > > > 
> > > > I just missed this in the review.
> > > The deferred diagnostics could be initiated by non-kernel functions or 
> > > even host functions.
> > > 
> > > Let's consider a device code library where no kernels are defined. A 
> > > device function is emitted, which calls a host device function which has 
> > > a deferred diagnostic. All device functions that are emitted need to be 
> > > checked.
> > > 
> > > Same with host functions that are emitted, which may call a host device 
> > > function which has deferred diagnostic.
> > > 
> > > Also not just function calls need to be checked. A function address may 
> > > be taken then called through function pointer. Therefore any reference to 
> > > a function needs to be followed.
> > > 
> > > In the case of OpenMP, the initialization of a global function pointer 
> > > which refers a function may trigger a deferred diangostic. There are 
> > > tests for that.
> > Right, I get that emitting deferred diagnostics for a declaration D needs 
> > to trigger any deferred diagnostics in declarations used by D, recursively. 
> >  You essentially have a graph of lazily-emitted declarations (which may or 
> > may not have deferred diagnostics) and a number of eagerly-emitted "root" 
> > declarations with use-edges leading into that graph.  Any declaration 
> > that's reachable from a root will need to be emitted and so needs to have 
> > any deferred diagnostics emitted as well.  My question is why you're 
> > finding these roots with a retroactive walk of the entire translation unit 
> > instead of either building a list of roots as you go or (better yet) 
> > building a list of lazily-emitted declarations that are used by those 
> > roots.  You can unambiguously identify at the point of declaration whether 
> > an entity will be eagerly or lazily emitted, right?  If you just store 
> > those initial edges into the lazily-emitted declarations graph and then 
> > initiate the recursive walk from them at the end of the translation unit, 
> > you'll only end up walking declarations that are actually relevant to your 
> > compilation, so you'll have much better locality and (if this matters to 
> > you) you'll naturally work a lot better with PCH and modules.
> I will try the approach you suggested. Basically I will record the emitted 
> functions and variables during parsing and use them as starting point for the 
> final traversal.
> 
> This should work for CUDA/HIP. However it may be tricky for OpenMP since the 
> emission of some entities depending on pragmas. Still it may be doable. If I 
> encounter difficulty I will come back for discussion.
> 
> I will post the change for review.
> 
> Thanks.
FYI: SYCL is also using deferred diagnostics engine to emit device side 
diagnostics, although this part hasn't been up-streamed yet, but we are 
tracking changes in this area.
SYCL support implementation should be quite similar to CUDA/HIP.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70172/new/

https://reviews.llvm.org/D70172



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to