Anastasia added a comment.

>> I feel it would be much simpler if we could always generate the kernel 
>> metadata for blocks. A lot of special case code would be removed if we do 
>> this. OpenCL doesn't prevent kernel functions to be used just as normal 
>> functions (6.7.1) so it should be a perfectly valid thing to do. Do you seen 
>> any issues with that?
> 
> The special cases in metadata generation code is due to the first argument of 
> LLVM block invoke function is not defined in BlockDecl. Emitting metadata for 
> all block invoke functions does not help here.

To be more specific. I am just wondering what do we need for blocks to be used 
as kernels pragmatically. I feel it is essentially kernel calling convention 
and kernel metadata? The kernel arguments metadata however can be omitted 
because their type is fixed to be `local void*` and the number of arguments is 
passed into `enqueue_kernel` call so it is known at the enqueueing side too. 
The block descriptor can be passed as a generic pointer `generic void*` as it 
is cast to the right struct type inside the block invoke function anyway. So if 
we do this we can avoid adding a lot of extra code. Because blocks have reduced 
functionality compared to kernel functions. Also OpenCL allows kernel functions 
to be called just as normal functions so this way we can support second use 
case for blocks too. What do you think about it?



================
Comment at: lib/CodeGen/CGOpenCLRuntime.cpp:113
+
+llvm::Value *CGOpenCLRuntime::emitOpenCLEnqueuedBlock(CodeGenFunction &CGF,
+                                                      const Expr *E) {
----------------
yaxunl wrote:
> Anastasia wrote:
> > I am not particularly in favour of duplicating CodeGen functionality as it 
> > typically has so many special cases that are hard to catch. Is this logic 
> > needed in order to pass to block literal information  that the block is 
> > enqueued?
> This code is needed to emit separate functions for a block which is directly 
> called and also enqueued as a kernel. Since the kernel needs to have proper 
> calling convention and ABI, it cannot be emitted as the same function as when 
> the block is called directly. Since it is OpenCL specific code, I found it is 
> cleaner to separate this code as member of CGOpenCLRuntime instead of fitting 
> it into CGF.EmitBlockLiteral.
This part is replacing standard `EmitScalarExpr` call which is doing several 
things before calling into block generation. That's why I am a bit worried we 
are covering all the corner cases here.

So if we transform all blocks into kernels unconditionally we won't need this 
special handling then?

Do we generate two versions of the blocks now: one for enqueue and one for call?


================
Comment at: lib/CodeGen/CodeGenFunction.cpp:535
+    if (i == 0 && IsBlock) {
+      ty = CGF.CGM.getTargetCodeGenInfo().getEnqueuedBlockArgumentType(
+          ASTCtx, *CGF.BlockInfo);
----------------
yaxunl wrote:
> Anastasia wrote:
> > I don't quite understand why we need to special case this? As far as I 
> > undertsnad block argument is a `generic void* ` type but it's being cast to 
> > a concrete block struct inside the block function. Do we gain anything from 
> > having it a specific type here?
> This argument is not part of BlockDecl. BlockDecl only has arguments shown up 
> in the source code. The first argument in the LLVM block invoke function is 
> generated by codegen and there is no correspondence in AST, so it has to be 
> handled as a special case.
Considering that enqueued kernel always takes the same type of the arguments 
(`local void*`) and # of args is specified in `enqueue_kernel`, I was wondering 
whether we need to generate the information on the kernel parameters at all? 
The enqueueing side will have the information provided in the `enqueue_kernel` 
code.

As for the block itself it can be passed as `generic void*` and then cast to 
the block struct type inside the block itself.


================
Comment at: lib/CodeGen/CodeGenFunction.cpp:667
                   llvm::MDNode::get(Context, argTypeQuals));
-  if (CGM.getCodeGenOpts().EmitOpenCLArgMetadata)
+  if (CGF.CGM.getCodeGenOpts().EmitOpenCLArgMetadata)
     Fn->setMetadata("kernel_arg_name",
----------------
Why this change?


================
Comment at: test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl:3
+
+// CHECK: %[[S1:struct.__amdgpu_block_arg_t.*]] = type { [3 x i64], [1 x i8] }
+// CHECK: %[[S2:struct.__amdgpu_block_arg_t.*]] = type { [5 x i64], [1 x i8] }
----------------
This struct is not identical to block literal struct?


https://reviews.llvm.org/D38134



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to