jdoerfert marked an inline comment as not done.
jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h:73
+/// Note: Only the team master is allowed to call non-const functions!
+struct shared_bytes_buffer {
+
----------------
> What is this buffer used for? Transferring pointers to the shared variables 
> to the parallel regions? If so, it must be handled by the compiler. There are 
> several reasons to do this:
> 1) You're using malloc/free functions for large buffers. The size of this 
> buffer is known at compile time, so the compiler can generate a fixed-size 
> buffer in global memory if required. We already have a similar 
> implementation for target regions, globalized variables, etc. You can take 
> a look and adapt it for your purpose.
> 2) Malloc/free are not very fast on the GPU, so preallocated buffers will 
> give an additional performance gain.
> 3) Another problem with malloc/free is that they use preallocated memory, 
> and the size of this memory is limited to 8 MB (if I recall correctly). 
> This memory is required for the correct support of local variable 
> globalization, and we already ran into a situation where malloc could not 
> allocate enough memory for it with some previous implementations.
> 4) You can reuse the shared memory buffers already generated by the compiler 
> and save shared memory.

[Quote by ABataev copied from 
https://reviews.llvm.org/D59319?id=190767#inline-525900 after the patch was 
split.]


This buffer is meant to communicate variables in shared and firstprivate 
clauses between threads in a team. In this patch it is simply used to 
implement the old `void**` buffer. How, when, and whether we use it is part of 
the interface implementation. For now, this buffer simply serves the users of 
the `omptarget_nvptx_globalArgs` global.
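
To make the idea concrete, here is a rough host-side sketch of such a byte 
buffer. It is not the struct from the patch: `SharedBytesBufferSketch`, 
`InlineBytes`, `reserve`, `release`, and `roundTripTwoPointers` are 
hypothetical names, the 64-byte inline size is an arbitrary assumption, and 
plain `malloc`/`free` stand in for whatever the device runtime would use. It 
only illustrates the convention that the team master calls the non-const 
functions while other threads read through the const ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical host-side model of a team-shared byte buffer.
// Convention: only the team master calls the non-const functions.
struct SharedBytesBufferSketch {
  static constexpr std::size_t InlineBytes = 64; // assumed inline capacity

  SharedBytesBufferSketch() = default;
  SharedBytesBufferSketch(const SharedBytesBufferSketch &) = delete;
  SharedBytesBufferSketch &operator=(const SharedBytesBufferSketch &) = delete;
  ~SharedBytesBufferSketch() { release(); }

  // Master only: make room for Bytes bytes and return writable storage.
  // Small requests use the inline array, larger ones fall back to malloc.
  void *reserve(std::size_t Bytes) {
    release();
    if (Bytes > InlineBytes) {
      Heap = std::malloc(Bytes);
      if (!Heap)
        return nullptr;
    }
    Size = Bytes;
    return Heap ? Heap : static_cast<void *>(Inline);
  }

  // Master only: drop the heap fallback (if any) and reset the size.
  void release() {
    std::free(Heap); // free(nullptr) is a no-op
    Heap = nullptr;
    Size = 0;
  }

  // Any thread (after the team has synchronized): read-only access.
  const void *data() const {
    return Heap ? Heap : static_cast<const void *>(Inline);
  }
  std::size_t size() const { return Size; }

private:
  alignas(std::max_align_t) unsigned char Inline[InlineBytes];
  void *Heap = nullptr;
  std::size_t Size = 0;
};

// Pack two pointers the way a `void**` argument array would be packed,
// then read them back through the const interface.
inline bool roundTripTwoPointers() {
  int A = 1;
  double B = 2.0;
  void *Args[2] = {&A, &B};

  SharedBytesBufferSketch Buf;
  void *Dst = Buf.reserve(sizeof(Args));
  if (!Dst)
    return false;
  std::memcpy(Dst, Args, sizeof(Args));

  void *Read[2];
  std::memcpy(Read, Buf.data(), sizeof(Read));
  return Read[0] == &A && Read[1] == &B;
}
```

On a real device the reads would of course have to be separated from the 
master's writes by a barrier; the sketch leaves that out.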

If you want to provide compiler-allocated memory to avoid using the buffer, no 
problem: the `__kmpc_target_region_kernel_parallel` function allows you to do 
so, see the `SharedMemPointers` flag. I wouldn't want to put the logic to 
generate these buffers in the front-end, though.
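
As a rough illustration of that choice (not the actual runtime code): the 
helper below either hands back the caller's memory untouched or stages a copy 
in a runtime-owned buffer. `publishArgs` and `RuntimeStaging` are hypothetical 
names, the 256-byte size is arbitrary, and the meaning given to the flag here 
is my assumption about how such a switch would behave.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a runtime-owned staging buffer.
static unsigned char RuntimeStaging[256];

// Returns the address the worker threads should read the arguments from.
// Assumption: when SharedMemPointers is true, Vars already points to memory
// that every thread in the team can see, so no copy is needed.
inline const void *publishArgs(const void *Vars, std::size_t Bytes,
                               bool SharedMemPointers) {
  if (SharedMemPointers)
    return Vars; // caller-provided (e.g. compiler-allocated) memory
  if (Bytes > sizeof(RuntimeStaging))
    return nullptr; // a real runtime would need a fallback here
  std::memcpy(RuntimeStaging, Vars, Bytes);
  return RuntimeStaging;
}
```

With the flag set, the runtime buffer is never touched, which is the case the 
quoted comment asks for.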


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59424/new/

https://reviews.llvm.org/D59424



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
