darkbuck wrote:

> > > > We added sema check @
> > > > https://github.com/llvm/llvm-project/blob/8378a6fa4f5c83298fb0b5e240bb7f254f7b1137/clang/lib/Sema/SemaCUDA.cpp#L83
> > > >
> > > > to generate error message on HIP based on Sam's request as HIP
> > > > currently doesn't support device-side kernel calls. I don't follow how
> > > > we could have `CUDAKernelCallExpr` in the device compilation. Could you
> > > > elaborate in detail?
> > >
> > > The sema check doesn't work as is for `hipstdpar`, because it's gated on
> > > the current target being either a `__global__` function or a `__device__`
> > > function. What happens is that we do the parsing on a normal function,
> > > the `<<<>>>` expression is semantically valid, and then we try to
> > > `EmitCUDAKernelCallExpr`, because at CodeGen that is gated on whether the
> > > entire compilation is host or device, not on whether or not the caller is
> > > `__global__` or `__device__`. So either the latter check should actually
> > > establish the caller's context, or we should bypass this altogether when
> > > compiling for hipstdpar. This is the simplest NFC workaround to unbreak
> > > things.
> >
> > Why not add `getLangOpts().HIPStdPar` check in sema to skip generating
> > device-side kernel call? So that we have a central place to make that
> > decision?
>
> Because, as far as I can ascertain, the `Sema` check is insufficient / the
> separate assert in `EmitCUDAKernelCallExpr` is disjoint. Here's what would
> happen:
>
> 1. In Sema what we see is that `IsDeviceKernelCall` is false - this is fine,
> but we still would emit a `CUDAKernelCallExpr` for the `<<<>>>` callsite,
> which was the case anyways before this change;

You mean that, so far, we could generate a `CUDAKernelCallExpr` in the device compilation without it being a device-side kernel call? I don't follow how that could happen. Do you mean that, under hipstdpar, `<<<>>>` could be used on the device side without being treated as a device kernel call? What are the semantics of that?

> 2. Later on, when we get to `CodeGen`, we see the `CUDAKernelCallExpr`, and
> try to handle it, except now the assumption is that if we're compiling for
> device and we see that, it must be a device side launch, and go look up a
> non-existent symbol, and run into the bug.

https://github.com/llvm/llvm-project/pull/171043
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
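
For readers following the thread, the disagreement between the two checks described in the quoted points can be modeled in a few lines. This is only a standalone sketch, not Clang code: `Caller`, `Options`, and every function below are hypothetical names invented for illustration, and the "bypass" variant merely mirrors the option mentioned in the quote (skipping the device-launch path when compiling for hipstdpar), not necessarily what the PR implements.

```cpp
// Standalone model of the two checks discussed in the thread.
// Every name here is hypothetical; none of this is Clang's actual code.

enum class Caller { Unannotated, Host, Device, Global };

struct Options {
  bool DeviceCompilation;  // the whole translation unit is compiled for device
  bool StdPar;             // models compiling in hipstdpar mode
};

// Sema-side model: the decision depends on the caller's attributes.
constexpr bool semaSeesDeviceKernelCall(Caller C) {
  return C == Caller::Device || C == Caller::Global;
}

// CodeGen-side model: the decision depends only on the compilation mode.
constexpr bool codegenAssumesDeviceLaunch(Options O) {
  return O.DeviceCompilation;
}

// CodeGen-side model with the bypass option: never take the
// device-launch path when the hipstdpar flag is set.
constexpr bool codegenAssumesDeviceLaunchWithBypass(Options O) {
  return O.DeviceCompilation && !O.StdPar;
}

// True when CodeGen would take the device-launch path for a call that
// Sema did not classify as a device-side kernel call.
constexpr bool mismatch(Caller C, Options O) {
  return codegenAssumesDeviceLaunch(O) && !semaSeesDeviceKernelCall(C);
}

// An unannotated caller in a device compilation is the case where the
// two models disagree.
static_assert(mismatch(Caller::Unannotated, Options{true, true}),
              "unannotated caller + device compilation should disagree");
```

In this model the two checks agree whenever the caller is `__device__` or `__global__`, and they diverge only for an unannotated caller seen during a device compilation, which is the situation the quoted points attribute to hipstdpar.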
