Hi John

Thanks for looking into this patch!

Sure, let me expand on the host-target communication. Just a little bit of 
context before I do that:

During code generation, the target frontend has to decide whether a given 
declaration or target region has to be emitted or not. Without any information 
communicated from the host frontend, this decision become complicated for cases 
like:

- #pragma omp target regions in static functions or class members;
- static declarations delimitted by #pragma omp declare target regions that end 
up not being used;
- #pragma omp target in template functions

In order for the target frontend to correctly identify all the declarations 
that need to be emitted it would have to, somehow, emulate the actions done by 
the host frontend which would turn the code generation messy in places that do 
not even relate with OpenMP.

On top of that, in order to have an efficient mapping between host and target 
entries (global declarations/target regions)
 table (this is discussed in the document, in section 5.1, where 
__tgt_offload_entry is introduced) the compiler would have to emit the 
corresponding entries in the host and target side in the same order. This is 
useful for devices whose toolchain maintain the order of the symbols given that 
the order of the entries in the host and target tables will be the same after 
linking. So knowing an index would be enough to do the mapping. In order for 
that to happen, the target frontend would have to know that order, which would 
be also hard to extract if no information is communicated form the host.

So, the information that needs to be propagated to make what I described above 
possible is basically i) declaration mangled names and ii) order they were 
emitted. This information could be communicated in the form of metadata that is 
emitted by the host frontend when the module is released and loaded by the 
target frontend when CGOpenMPRuntime is created. This information has however 
to be coded in slightly different ways for different kinds of declarations. Let 
me explain this with an example:

//######################################
#pragma omp declare target
struct MyClass{

  ...
  
  MyClass &add(MyClass &op){...}
  
  MyClass &add(int (&op)[N]){...}
  
  bool eq(MyClass &op){...}
  
  MyClass() {...}
  
  ~MyClass() {...}

};

MyClass C;
MyClass D;
#pragma omp end declare target

void foo(){

  int AA[N];
  MyClass H, T;
  MyClass HC;
  
  ...
  
  #pragma omp target
  {
    MyClass TC;
    T.add(AA).add(HC);
  }
  
  if (H.eq(T)) {...}
  
  #pragma omp target
  {
    T.add(AA);
  } 

}
//######################################

I was planning the metadata for this example to look more or less like this:

; Named metadata that encloses all the offloading information
!openmp.offloading.info = !{!1, !2, !3, !4, !5, !6, !7, !8, !9, !10}

; Global variables that require a map between host and target:
; Entry 0 -> ID for this type of metadata (0)
; Entry 1 -> Mangled name of the variable
; Entry 2 -> Order it was emitted
!1 = !{i32 0, !"C", i32 0}
!2 = !{i32 0, !"D", i32 2}

; Functions with target regions
; Entry 0 -> ID for this type of metadata (1)
; Entry 1 -> Mangled name of the function that was emitted for the host and 
encloses target regions
; Entry 2-n -> Order the target regions in the functions (in the same sequence 
the statements are found) are emitted 
!3 = !{i32 1, !"_Z3foov", i32 4, i32 5}

; Global initializers
; Entry 0 -> ID for this type of metadata (2)
; Entry 1-n -> Order the initializers are emitted in descending order of 
priority (we will require a target region per set of initializers with the same 
priority)
!4 = !{i32 2, i32 6}

; Global Dtors
; Entry 0 -> ID for this type of metadata (3)
; Entry 1 -> Mangled name of the variable to be destructed 
; Entry 2 -> Order the destructor was emitted (we will have a target region per 
variable being destroyed - this can probably be optimized)
!5 = !{i32 3, !"C", i32 1}
!6 = !{i32 3, !"D", i32 3}

; Other functions that should be emitted in the target but do not require to be 
mapped to the host
; Entry 0 -> ID for this type of metadata (4)
; Entry 1 -> Mangled name of the function that has to be emitted.
!7 = !{i32 4, !"_ZN7MyClass3addERA64_i"}
!8 = !{i32 4, !"_ZN7MyClass3addERS_"}
!9 = !{i32 4, !"_ZN7MyClassC2Ev"}
!10 = !{i32 4, !"_ZN7MyClassD2Ev"}

I realize this is the kind of information I should propose as a patch to the 
codegen part of offloading, but I think it makes sense to discuss it now as the 
driver has to enable it.

I also foresee the communication between target and host to be useful for other 
cases, like the propagation of alias information from host to target. I don’t 
have have however a proposal for that at this moment.

Hope I haven’t been either too brief or too exhaustive! Let me know if I can 
clarify anything else for you.

Thanks!
Samuel


http://reviews.llvm.org/D9888

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to