darkbuck wrote:

> > > > We added sema check @ 
> > > > https://github.com/llvm/llvm-project/blob/8378a6fa4f5c83298fb0b5e240bb7f254f7b1137/clang/lib/Sema/SemaCUDA.cpp#L83
> > > > 
> > > > to generate an error message on HIP based on Sam's request as HIP 
> > > > currently doesn't support device-side kernel calls. I don't follow how 
> > > > we could have `CUDAKernelCallExpr` in the device compilation. Could you 
> > > > elaborate in detail?
> > > 
> > > 
> > > The sema check doesn't work as is for `hipstdpar`, because it's gated on 
> > > the current target being either a `__global__` function or a `__device__` 
> > > function. What happens is that we do the parsing on a normal function, 
> > > the `<<<>>>` expression is semantically valid, and then we try to 
> > > `EmitCUDAKernelCallExpr`, because at CodeGen that is gated on whether the 
> > > entire compilation is host or device, not on whether or not the caller is 
> > > `__global__` or `__device__`. So either the latter check should actually 
> > > establish the caller's context, or we should bypass this altogether when 
> > > compiling for hipstdpar. This is the simplest NFC workaround to unbreak 
> > > things.
> > 
> > 
> > Why not add `getLangOpts().HIPStdPar` check in sema to skip generating 
> > device-side kernel call? So that we have a central place to make that 
> > decision?
> 
> Because, as far as I can ascertain, the `Sema` check is insufficient / the 
> separate assert in `EmitCUDAKernelCallExpr` is disjoint. Here's what would 
> happen:
> 
> 1. In Sema what we see is that `IsDeviceKernelCall` is false - this is fine, 
> but we would still emit a `CUDAKernelCallExpr` for the `<<<>>>` callsite, 
> which was the case anyway before this change;

So you mean that, until now, we could generate a `CUDAKernelCallExpr` in the 
device compilation that is not a device-side kernel call? I don't follow how 
that could happen. Are you saying that, under hipstdpar, `<<<>>>` can be used 
on the device side without being treated as a device-side kernel call? What 
are the semantics of that?

> 2. Later on, when we get to `CodeGen`, we see the `CUDAKernelCallExpr` and 
> try to handle it, except now the assumption is that if we're compiling for 
> device and we see that, it must be a device-side launch, so we go look up a 
> non-existent symbol and run into the bug.
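
As a rough illustration of the mismatch described in the two quoted steps, here is a toy C++ model. It is not clang code: `CallerKind`, `Compilation`, and all three functions are hypothetical stand-ins. It only assumes what the thread states, namely that the Sema-side check is keyed on the caller being `__global__` or `__device__`, while the CodeGen-side check is keyed on whether the whole compilation is for the device.

```cpp
#include <cassert>

// Toy model only: none of these names are clang APIs.

// How the function containing the <<<>>> call site is annotated.
enum class CallerKind { Host, Device, Global };

// Stand-in for "is this the device-side compilation?".
struct Compilation {
  bool isDeviceCompilation;
};

// Sema-like decision: keyed on the caller's own attributes.
bool semaTreatsAsDeviceLaunch(CallerKind caller) {
  return caller == CallerKind::Device || caller == CallerKind::Global;
}

// CodeGen-like decision: keyed on the whole compilation, not on the caller.
bool codegenTreatsAsDeviceLaunch(const Compilation &c) {
  return c.isDeviceCompilation;
}

// True when the two stages classify the same call site differently.
bool checksDisagree(CallerKind caller, const Compilation &c) {
  return semaTreatsAsDeviceLaunch(caller) != codegenTreatsAsDeviceLaunch(c);
}
```

In this model the two checks disagree only when an unannotated caller is seen during device compilation, e.g. `checksDisagree(CallerKind::Host, Compilation{true})`, which is the hipstdpar situation described above.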



https://github.com/llvm/llvm-project/pull/171043