Hahnfeld added inline comments.

================
Comment at: include/clang/Basic/TargetInfo.h:944
+  /// \brief Whether target supports variable-length arrays.
+  bool isVLASupported() const { return VLASupported; }
+
----------------
rjmccall wrote:
> Hahnfeld wrote:
> > rjmccall wrote:
> > > Hahnfeld wrote:
> > > > rjmccall wrote:
> > > > > ABataev wrote:
> > > > > > ABataev wrote:
> > > > > > > Hahnfeld wrote:
> > > > > > > > rjmccall wrote:
> > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > The way you've written this makes it sound like "does 
> > > > > > > > > > > > > the target support VLAs?", but the actual semantic 
> > > > > > > > > > > > > checks treat it as "do OpenMP devices on this target 
> > > > > > > > > > > > > support VLAs?"  Maybe there should be a more specific 
> > > > > > > > > > > > > way to query things about OpenMP devices instead of 
> > > > > > > > > > > > > setting a global flag for the target?
> > > > > > > > > > > > Actually, the NVPTX and SPIR targets never support 
> > > > > > > > > > > > VLAs. So I felt like it would be more correct to make 
> > > > > > > > > > > > this a global property of the target.
> > > > > > > > > > > > 
> > > > > > > > > > > > The difference is that the other programming models 
> > > > > > > > > > > > (OpenCL and CUDA) error out immediatelyand regardless 
> > > > > > > > > > > > of the target because this limitation is reflected in 
> > > > > > > > > > > > the standards that disallow VLAs (see SemaType.cpp). 
> > > > > > > > > > > > For OpenMP we might have target devices that support 
> > > > > > > > > > > > VLA so we shouldn't error out for those.
> > > > > > > > > > > If you want to make it a global property of the target, 
> > > > > > > > > > > that's fine, but then I don't understand why your 
> > > > > > > > > > > diagnostic only fires when 
> > > > > > > > > > > (S.isInOpenMPDeclareTargetContext() || 
> > > > > > > > > > > S.isInOpenMPTargetExecutionDirective()).
> > > > > > > > > > That is because of how OpenMP offloading works and how it 
> > > > > > > > > > is implemented in Clang. Consider the following snippet 
> > > > > > > > > > from the added test case:
> > > > > > > > > > ```lang=c
> > > > > > > > > > int vla[arg];
> > > > > > > > > > 
> > > > > > > > > > #pragma omp target map(vla[0:arg])
> > > > > > > > > > {
> > > > > > > > > >    // more code here...
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > Clang will take the following steps to compile this into a 
> > > > > > > > > > working binary for a GPU:
> > > > > > > > > > 1. Parse and (semantically) analyze the code as-is for the 
> > > > > > > > > > host and produce LLVM Bitcode.
> > > > > > > > > > 2. Parse and analyze again the code as-is and generate code 
> > > > > > > > > > for the offloading target, the GPU in this case.
> > > > > > > > > > 3. Take LLVM Bitcode from 1., generate host binary and 
> > > > > > > > > > embed target binary from 3.
> > > > > > > > > > 
> > > > > > > > > > `OpenMPIsDevice` will be true for 2., but the complete 
> > > > > > > > > > source code is analyzed. So to not throw errors for the 
> > > > > > > > > > host code, we have to make sure that we are actually 
> > > > > > > > > > generating code for the target device. This is either in a 
> > > > > > > > > > `target` directive or in a `declare target` region.
> > > > > > > > > > Note that this is quite similar to what CUDA does, only 
> > > > > > > > > > they have `CUDADiagIfDeviceCode` for this logic. If you 
> > > > > > > > > > want me to add something of that kind for OpenMP target 
> > > > > > > > > > devices, I'm fine with that. However for the given case, 
> > > > > > > > > > it's a bit different because this error should only be 
> > > > > > > > > > thrown for target devices that don't support VLAs...
> > > > > > > > > I see.  So the entire translation unit is re-parsed and 
> > > > > > > > > re-Sema'ed from scratch for the target?  Which means you need 
> > > > > > > > > to avoid generating errors about things in the outer 
> > > > > > > > > translation unit that aren't part of the target directive 
> > > > > > > > > that you actually want to compile.  I would've expected there 
> > > > > > > > > to be some existing mechanism for that, to be honest, as 
> > > > > > > > > opposed to explicitly trying to suppress target-specific 
> > > > > > > > > diagnostics one by one.
> > > > > > > > Yes, that is my understanding. For errors, we don't need to 
> > > > > > > > take anything special as the first `cc1` invocation will exit 
> > > > > > > > with a non-zero status so that the driver stops the 
> > > > > > > > compilation. For warnings, there seems to be no mechanism in 
> > > > > > > > place as I see them duplicated, even in code that is not 
> > > > > > > > generate for the target device (verified with an unused 
> > > > > > > > variable).
> > > > > > > > 
> > > > > > > > @ABataev @gtbercea Do I miss something here?
> > > > > > > I'm not aware of any.
> > > > > > John, target-specific checks require some special flags (like 
> > > > > > LangOpts.Cuda) that are not set when we re-compile the code for 
> > > > > > OpenMP devices. That's why errors are not emitted for the 
> > > > > > non-target code. But also because of that, we need some special 
> > > > > > OpenMP checks for target-specific code inside the target regions. 
> > > > > > For example, code in lib/Sema/SemaType.cpp, lines 2184, 2185 (see 
> > > > > > this file in this patch) checks for Cuda compilation and prohibits 
> > > > > > using of VLAs in Cuda mode. We also should prohibit using of VLAs 
> > > > > > in target code for NVPTX devices or other devices that do not 
> > > > > > support VLAs in OpenMP mode.
> > > > > I think it would be cleaner here, and better for our OpenMP support 
> > > > > overall, if we found a more general way to suppress unwanted 
> > > > > diagnostics in the second invocation for code outside of the target 
> > > > > directive.  This check (and several others) would then just implement 
> > > > > a more general target feature disabling VLA support instead of being 
> > > > > awkwardly OpenMP-specific.
> > > > I think to get this we would need to make `Diag` a no-op `if 
> > > > (Context.getLangOpts().OpenMPIsDevice && 
> > > > !(isInOpenMPDeclareTargetContext() || 
> > > > isInOpenMPTargetExecutionDirective()))`. This would ignore all 
> > > > diagnostics outside of the code is really generated in the end...
> > > I mean, the danger of this approach is that you don't really want to 
> > > suppress diagnostics for top-level declarations: it can leave you with an 
> > > invalid AST, and it is not valid to generate IR from an invalid AST.
> > > 
> > > Sorry for the ignorant questions that follow, but I assume the OpenMP 
> > > spec must bless this double-translation somehow, and I'd like to 
> > > understand more about that in order to advise how to proceed.  How does 
> > > OpenMP handle the possibility that the code will be processed 
> > > substantially differently for different targets?  Is there some rule in 
> > > the spec saying that the code has to expand "the same" in both targets?  
> > > How does that work when e.g. size_t might have a different size or use a 
> > > completely different type?  More generally, how do expect that this 
> > > feature will work in the more complicated language modes, like OpenMP + 
> > > C++?
> > So if there is an error, the analysis will already fail on the host. I 
> > think that guarantees that we don't end up with an invalid AST and will at 
> > most suppress duplicate warnings.
> > 
> > Regarding the OpenMP spec: I think the unsatisfying answer is that the spec 
> > doesn't say what it expects on that questions. So I think the compiler has 
> > to do what seems reasonable...
> Well, what I'm worried about is the possibility that something changes about 
> the translation unit when it's reprocessed for the target — e.g. there's a 
> target-dependent #if that causes an error in the target TU, but not in the 
> original TU, so that the suppressed error is the only reason that the build 
> fails.
> 
> If the spec is unclear about this, then we just have to muddle through.  Is 
> this "reparse the whole translation unit for the target" the prevailing 
> implementation technique for target directives?
If we are worried about that scenario we have the preserve the current state: 
Do nothing, diagnose everything and let the user figure out if there is an 
error in the code.

I can't really comment on what other compilers (GCC, Intel) do, but at least 
for GCC you compile a complete compiler for the target, so I suppose they kind 
of do the same...


https://reviews.llvm.org/D39505



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to