grokos created this revision.
grokos added reviewers: ABataev, vzakhari.
grokos added projects: OpenMP, clang.
Herald added a reviewer: jdoerfert.

Currently, the offload-wrapper tool inserts `__tgt_register_lib` into the list of
global ctors of a target module with `Priority=0`. This means that it has the
same priority as `__tgt_register_requires`, and the order in which these two
functions are called is not guaranteed. Ideally, we'd like to call
`__tgt_register_requires` BEFORE loading a libomptarget plugin (which is one of
the actions happening inside `__tgt_register_lib`). The reason is that we want
to know which requirements the user has asked for so that, upon loading the
plugin, libomptarget can report how many devices there are that can satisfy the
requirements.
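
As a rough illustration of why the priority matters, here is a minimal model of
how a list of global ctors is ordered. It is only a sketch: `CtorEntry` and
`runOrder` are hypothetical names, not part of LLVM or libomptarget, and the
stable sort is an assumption of the model (for equal priorities the real order
is not defined, which is exactly the problem).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of @llvm.global_ctors.
struct CtorEntry {
  int Priority;
  std::string Name;
};

// Returns ctor names in the order this model would run them: ascending
// priority, lowest value first. Ties keep list order here only because the
// model uses a stable sort; real toolchains make no such promise.
std::vector<std::string> runOrder(std::vector<CtorEntry> Ctors) {
  std::stable_sort(Ctors.begin(), Ctors.end(),
                   [](const CtorEntry &A, const CtorEntry &B) {
                     return A.Priority < B.Priority;
                   });
  std::vector<std::string> Order;
  for (const CtorEntry &C : Ctors)
    Order.push_back(C.Name);
  return Order;
}
```

With `__tgt_register_lib` at priority 1 and `__tgt_register_requires` at
priority 0, `runOrder` puts `__tgt_register_requires` first regardless of the
order in which the two entries were appended.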

E.g. with the current implementation we can run into the following problem:

1. The user requests `unified_shared_memory` but the available devices on the 
system do not support this feature.
2. Initially, the offload policy is set to `tgt_default`.
3. `__tgt_register_lib` is called and the plugin for the specific target device 
reports there are N>0 available devices.
4. Consequently, the offload policy is set to `tgt_mandatory`.
5. `__tgt_register_requires` is called and we find out that the 
`unified_shared_memory` requirement cannot be satisfied.
6. Offload fails and, because the offload policy had been set to mandatory,
libomptarget terminates the application.

With the proposed change things will proceed as follows:

1. The user requests `unified_shared_memory` but the available devices on the 
system do not support this feature.
2. Initially, the offload policy is set to `tgt_default`.
3. `__tgt_register_requires` is called and registers the 
`unified_shared_memory` requirement with libomptarget.
4. `__tgt_register_lib` is called and the plugin for the specific target device 
reports that the `unified_shared_memory` requirement cannot be satisfied, so 
there are N=0 available devices.
5. Consequently, the offload policy is set to `tgt_disabled`.
6. Execution falls back on the host instead of terminating the application.
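
The two sequences above can be sketched as a small state model. This is not
libomptarget code: `RuntimeModel`, `simulate`, and the enum names are
hypothetical and only mirror the `tgt_default`, `tgt_mandatory`, and
`tgt_disabled` policies described in the steps, assuming one physical device
that lacks `unified_shared_memory` support.

```cpp
#include <cassert>

enum class OffloadPolicy { Default, Mandatory, Disabled };
enum class Outcome { Offloaded, HostFallback, Terminated };

// Hypothetical model of the runtime state touched by the two ctors.
struct RuntimeModel {
  OffloadPolicy Policy = OffloadPolicy::Default;
  bool RequirementsKnown = false;
  bool NeedsUSM = false;

  // Models __tgt_register_requires: record what the user asked for.
  void registerRequires(bool UserNeedsUSM) {
    NeedsUSM = UserNeedsUSM;
    RequirementsKnown = true;
  }

  // Models __tgt_register_lib: count usable devices and pick a policy.
  // Devices can only be filtered by requirements that are already known.
  void registerLib(int PhysicalDevices, bool DevicesSupportUSM) {
    int Usable = PhysicalDevices;
    if (RequirementsKnown && NeedsUSM && !DevicesSupportUSM)
      Usable = 0;
    Policy = Usable > 0 ? OffloadPolicy::Mandatory : OffloadPolicy::Disabled;
  }

  // Models the first offload attempt after both ctors have run.
  Outcome offload(bool DevicesSupportUSM) const {
    if (Policy == OffloadPolicy::Disabled)
      return Outcome::HostFallback;
    if (!NeedsUSM || DevicesSupportUSM)
      return Outcome::Offloaded;
    return Policy == OffloadPolicy::Mandatory ? Outcome::Terminated
                                              : Outcome::HostFallback;
  }
};

// Runs the scenario with the two ctors in either order.
Outcome simulate(bool RequiresFirst) {
  const bool DevicesSupportUSM = false; // assumption: no USM-capable device
  const int PhysicalDevices = 1;        // assumption: N = 1 before filtering
  RuntimeModel RT;
  if (RequiresFirst) {
    RT.registerRequires(/*UserNeedsUSM=*/true);
    RT.registerLib(PhysicalDevices, DevicesSupportUSM);
  } else {
    RT.registerLib(PhysicalDevices, DevicesSupportUSM);
    RT.registerRequires(/*UserNeedsUSM=*/true);
  }
  return RT.offload(DevicesSupportUSM);
}
```

In this model `simulate(false)` ends in `Outcome::Terminated` (the current
behaviour) and `simulate(true)` ends in `Outcome::HostFallback` (the behaviour
with the proposed change).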


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D75223

Files:
  clang/test/Driver/clang-offload-wrapper.c
  clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp


Index: clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
===================================================================
--- clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
+++ clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
@@ -262,7 +262,12 @@
     Builder.CreateRetVoid();
 
     // Add this function to constructors.
-    appendToGlobalCtors(M, Func, 0);
+    // Set priority to 1 so that __tgt_register_lib is executed AFTER
+    // __tgt_register_requires (we want to know what requirements have been
+    // asked for before we load a libomptarget plugin so that by the time the
+    // plugin is loaded it can report how many devices there are which can
+    // satisfy these requirements).
+    appendToGlobalCtors(M, Func, /*Priority*/ 1);
   }
 
   void createUnregisterFunction(GlobalVariable *BinDesc) {
Index: clang/test/Driver/clang-offload-wrapper.c
===================================================================
--- clang/test/Driver/clang-offload-wrapper.c
+++ clang/test/Driver/clang-offload-wrapper.c
@@ -39,7 +39,7 @@
 
 // CHECK-IR: [[DESC:@.+]] = internal constant [[DESCTY]] { i32 1, [[IMAGETY]]* getelementptr inbounds ([1 x [[IMAGETY]]], [1 x [[IMAGETY]]]* [[IMAGES]], i64 0, i64 0), [[ENTTY]]* [[ENTBEGIN]], [[ENTTY]]* [[ENTEND]] }
 
-// CHECK-IR: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 0, void ()* [[REGFN:@.+]], i8* null }]
+// CHECK-IR: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* [[REGFN:@.+]], i8* null }]
 // CHECK-IR: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 0, void ()* [[UNREGFN:@.+]], i8* null }]
 
 // CHECK-IR: define internal void [[REGFN]]()

