bader added inline comments.
================ Comment at: clang/lib/Sema/Sema.cpp:1514 + void visitUsedDecl(SourceLocation Loc, Decl *D) { + if (auto *TD = dyn_cast<TranslationUnitDecl>(D)) { + for (auto *DD : TD->decls()) { ---------------- yaxunl wrote: > rjmccall wrote: > > yaxunl wrote: > > > rjmccall wrote: > > > > erichkeane wrote: > > > > > rjmccall wrote: > > > > > > erichkeane wrote: > > > > > > > Note that when recommitting this (if you choose to), this needs > > > > > > > to also handle NamespaceDecl. We're a downstream and discovered > > > > > > > that this doesn't properly handle functions or records handled in > > > > > > > a namespace. > > > > > > > > > > > > > > It can be implemented identically to TranslationUnitDecl. > > > > > > Wait, what? We shouldn't be doing this for TranslationUnitDecl > > > > > > either. I don't even know how we're "using" a > > > > > > TranslationUnitDecl, but neither this case not the case for > > > > > > `NamespaceDecl` should be recursively using every declaration > > > > > > declared inside it. If there's a declaration in a namespace that's > > > > > > being used, it should be getting visited as part of the actual use > > > > > > of it. > > > > > > > > > > > > The logic for `RecordDecl` has the same problem. > > > > > Despite the name, this seems to be more of a home-written ast walking > > > > > class. The entry point is the 'translation unit' which seems to walk > > > > > through everything in an attempt to find all the functions (including > > > > > those that are 'marked' as used by an attribute). > > > > > > > > > > You'll see the FunctionDecl section makes this assumption as well > > > > > (not necessarily that we got to a function via a call). IMO, this > > > > > approach is strange, and we should register entry points in some > > > > > manner (functions marked as emitted to the device in some fashion), > > > > > then just follow its call-graph (via the clang::CallGraph?) to emit > > > > > all of these functions. > > > > > > > > > > It seemed really odd to see this approach here, but it seemed well > > > > > reviewed by the time I noticed it (via a downstream bug) so I figured > > > > > I'd lost my chance to disagree with the approach. > > > > > > > > > > > > > > Sure, but `visitUsedDecl` isn't the right place to be entering the > > > > walk. `visitUsedDecl` is supposed to be the *callback* from the walk. > > > > If they need to walk all the global declarations to find kernels > > > > instead of tracking the kernels as they're encountered (which would be > > > > a *much* better approach), it should be done as a separate function. > > > > > > > > I just missed this in the review. > > > The deferred diagnostics could be initiated by non-kernel functions or > > > even host functions. > > > > > > Let's consider a device code library where no kernels are defined. A > > > device function is emitted, which calls a host device function which has > > > a deferred diagnostic. All device functions that are emitted need to be > > > checked. > > > > > > Same with host functions that are emitted, which may call a host device > > > function which has deferred diagnostic. > > > > > > Also not just function calls need to be checked. A function address may > > > be taken then called through function pointer. Therefore any reference to > > > a function needs to be followed. > > > > > > In the case of OpenMP, the initialization of a global function pointer > > > which refers a function may trigger a deferred diangostic. There are > > > tests for that. > > Right, I get that emitting deferred diagnostics for a declaration D needs > > to trigger any deferred diagnostics in declarations used by D, recursively. > > You essentially have a graph of lazily-emitted declarations (which may or > > may not have deferred diagnostics) and a number of eagerly-emitted "root" > > declarations with use-edges leading into that graph. Any declaration > > that's reachable from a root will need to be emitted and so needs to have > > any deferred diagnostics emitted as well. My question is why you're > > finding these roots with a retroactive walk of the entire translation unit > > instead of either building a list of roots as you go or (better yet) > > building a list of lazily-emitted declarations that are used by those > > roots. You can unambiguously identify at the point of declaration whether > > an entity will be eagerly or lazily emitted, right? If you just store > > those initial edges into the lazily-emitted declarations graph and then > > initiate the recursive walk from them at the end of the translation unit, > > you'll only end up walking declarations that are actually relevant to your > > compilation, so you'll have much better locality and (if this matters to > > you) you'll naturally work a lot better with PCH and modules. > I will try the approach you suggested. Basically I will record the emitted > functions and variables during parsing and use them as starting point for the > final traversal. > > This should work for CUDA/HIP. However it may be tricky for OpenMP since the > emission of some entities depending on pragmas. Still it may be doable. If I > encounter difficulty I will come back for discussion. > > I will post the change for review. > > Thanks. FYI: SYCL is also using deferred diagnostics engine to emit device side diagnostics, although this part hasn't been up-streamed yet, but we are tracking changes in this area. SYCL support implementation should be quite similar to CUDA/HIP. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D70172/new/ https://reviews.llvm.org/D70172 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits