jhuber6 added a comment.

In D141717#4052514 <https://reviews.llvm.org/D141717#4052514>, @tra wrote:

> Textual output for "-S -emit-llvm" is the canonical behavior, so I would 
> prefer it working that way in as many cases as possible and only override it 
> when necessary.
>
> Would it be possible to enforce binary IR generation in cases you need it? Or 
> to prove that this is equivalent to what the patch does now?

You'll still get textual output for the host module; only the device code 
embedded in it will be bitcode. So the final output from the compiler is 
still textual IR. It just won't contain some weird global like this:

  @llvm.embedded.object = private constant [138032 x i8]
  c"\10\FF\10\AD\01\00\00\000\1B\02\00\00\00\00\00 \00\00\00\00\00\00\00(\00\00\00\00\00\00\00
  \02\00\01\00\00\00\00\00H\00\00\00\00\00\00\00\02\00\00\00\00\00\00\00\90\00\00\00\00\00\00\00\9D\1A\02\00\00\00\00\00
  n\00\00\00\00\00\00\00u\00\00\00\00\00\00\00i\00\00\00\00\00\00\00\87\00\00\00\00\00\00\00\00arch\00triple\00amdgcn-amd-amdhsa\00gfx90a\00\00\00
  ; ModuleID = 'tl2.c'\0A
  source_filename = \22tl2.c\22\0A
  target datalayout = \22e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7\22\0A
  target triple = \22amdgcn-amd-amdhsa\22\0A\0A
  %struct.ident_t = type { i32, i32, i32, i32, ptr }\0A
  %struct.DeviceEnvironmentTy = type { i32, i32, i32, i32 }\0A
  %\22struct.ompx::state::TeamStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, i32, i32, ptr }\0A
  %\22struct.ompx::state::ICVStateTy\22 = type { i32, i32, i32, i32, i32, i32 }\0A
  %\22struct.(anonymous namespace)::SharedMemorySmartStackTy\22 = type { [512 x i8], [1024 x i8] }\0A
  %\22struct.ompx::state::ThreadStateTy\22 = type { %\22struct.ompx::state::ICVStateTy\22, ptr }\0A\0A
  @__omp_rtl_assume_teams_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A
  @__omp_rtl_assume_threads_oversubscription = weak_odr hidden addrspace(1) constant i32 0\0A
  @0 = private unnamed_addr constant [23 x i8] c\22;unknown;unknown;0;0;;\\00\22, align 1\0A
  @1 = private unnamed_addr addrspace(1) constant %struct.ident_t { i32 0, i32 2, i32 0, i32 22, ptr @0 }, align 8\0A
  @__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode = weak protected addrspace(1) constant i8 1\0A
  @__omp_offloading_16_5d6227e_thread_limit_l5_exec_mode = weak protected addrspace(1) constant i8 1\0A
  @__omp_offloading_16_5d6227e_thread_limit_l8_exec_mode = weak protected addrspace(1) constant i8 1\0...@llvm.compiler.used = appending addrspace(1) global [3 x ptr] [ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_thread_limit_l2_exec_mode to ptr), ptr addrspacecast (ptr addrspace(1) @__omp_offloading_16_5d6227e_

Embedding textual IR like this is bad because it can't be handled by LTO or 
by anything else that consumes the embedded image. It makes the resulting IR 
file difficult to use for its intended purpose.
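
To make the distinction concrete, here is a rough sketch (not part of the 
patch) of how one could tell whether an already-extracted device image is 
bitcode or textual IR. The helper name `classify_device_image` is 
hypothetical, and the textual-IR check is only a heuristic I'm assuming for 
illustration; the two magic values are the raw bitcode magic ('BC' 0xC0DE) 
and the bitcode wrapper magic (0x0B17C0DE, little-endian). Parsing the 
offload binary header that precedes the image is not shown.

```python
# Sketch only: classify an extracted device image by its leading bytes.
RAW_BITCODE_MAGIC = b"BC\xc0\xde"            # 'B' 'C' 0xC0 0xDE
WRAPPED_BITCODE_MAGIC = b"\xde\xc0\x17\x0b"  # 0x0B17C0DE stored little-endian


def classify_device_image(image: bytes) -> str:
    """Return 'bitcode', 'textual-ir', or 'unknown' (hypothetical helper)."""
    if image.startswith((RAW_BITCODE_MAGIC, WRAPPED_BITCODE_MAGIC)):
        return "bitcode"
    # Heuristic (assumption): textual IR emitted by clang usually begins
    # with a "; ModuleID" comment, a source_filename, or a target line.
    text = image.lstrip()
    if text.startswith((b"; ModuleID", b"source_filename", b"target ")):
        return "textual-ir"
    return "unknown"
```

An image that starts with `; ModuleID = 'tl2.c'`, like the one in the dump 
above, would come back as textual IR, which is exactly the case that LTO 
can't consume.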


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141717/new/

https://reviews.llvm.org/D141717

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
