[PATCH] D94355: [Passes] Add relative lookup table generator pass

2021-02-24 Thread Gulfem Savrun Yeniceri via Phabricator via cfe-commits
gulfem updated this revision to Diff 326265.
gulfem added a comment.
Herald added subscribers: wenlei, nikic, kerbowa, steven_wu, nhaehnle, jvesely.

- Rename the pass to RelLookupTableConverter to be consistent
- Addressed reviewers' feedback
- Added tests for user-defined lookup tables and hidden visibility


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94355/new/

https://reviews.llvm.org/D94355

Files:
  clang/test/CodeGen/switch-to-lookup-table.c
  llvm/docs/Passes.rst
  llvm/include/llvm/InitializePasses.h
  llvm/include/llvm/Transforms/Scalar.h
  llvm/include/llvm/Transforms/Utils/RelLookupTableConverter.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Passes/PassRegistry.def
  llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
  llvm/lib/Transforms/Utils/CMakeLists.txt
  llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
  llvm/lib/Transforms/Utils/Utils.cpp
  llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
  llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll
  llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn

Index: llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
===
--- llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
+++ llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
@@ -60,6 +60,7 @@
 "NameAnonGlobals.cpp",
 "PredicateInfo.cpp",
 "PromoteMemoryToRegister.cpp",
+"RelLookupTableConverter.cpp"
 "SSAUpdater.cpp",
 "SSAUpdaterBulk.cpp",
 "SampleProfileLoaderBaseUtil.cpp",
Index: llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll
===
--- /dev/null
+++ llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll
@@ -0,0 +1,310 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -rel-lookup-table-converter -S | FileCheck %s
+; RUN: opt < %s -passes=rel-lookup-table-converter -S | FileCheck %s
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@.str = private unnamed_addr constant [5 x i8] c"zero\00", align 1
+@.str.1 = private unnamed_addr constant [4 x i8] c"one\00", align 1
+@.str.2 = private unnamed_addr constant [4 x i8] c"two\00", align 1
+@.str.3 = private unnamed_addr constant [8 x i8] c"default\00", align 1
+@.str.4 = private unnamed_addr constant [6 x i8] c"three\00", align 1
+@.str.5 = private unnamed_addr constant [5 x i8] c"str1\00", align 1
+@.str.6 = private unnamed_addr constant [5 x i8] c"str2\00", align 1
+@.str.7 = private unnamed_addr constant [12 x i8] c"singlevalue\00", align 1
+
+@a1 = external global i32, align 4
+@b1 = external global i32, align 4
+@c1 = external global i32, align 4
+@d1 = external global i32, align 4
+
+@a2 = internal global i32 0, align 4
+@b2 = internal global i32 0, align 4
+@c2 = internal global i32 0, align 4
+@d2 = internal global i32 0, align 4
+
+@hidden0 = external hidden global i32, align 8
+@hidden1 = external hidden global i32, align 8
+@hidden2 = external hidden global i32, align 8
+@hidden3 = external hidden global i32, align 8
+
+@switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* @a1, i32* @b1, i32* @c1], align 8
+
+@switch.table.dso_local = private unnamed_addr constant [3 x i32*] [i32* @a2, i32* @b2, i32* @c2], align 8
+
+@switch.table.hidden = private unnamed_addr constant [3 x i32*] [i32* @hidden0, i32* @hidden1, i32* @hidden2], align 8
+
+@switch.table.string_table = private unnamed_addr constant [3 x i8*]
+ [
+  i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
+  i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
+  i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
+ ], align 8
+
+@switch.table.string_table_holes = private unnamed_addr constant [4 x i8*]
+   [
+i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
+i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0),
+i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0),
+i8* getelementptr inbounds ([6 x i8], [6 x i8]* 

[PATCH] D94355: [Passes] Add relative lookup table generator pass

2021-02-18 Thread Leonard Chan via Phabricator via cfe-commits
leonardchan added a comment.

It looks like you have everything setup for running on the new PM, but it 
doesn't look like the pass is added anywhere in the new PM pipeline. 
Unfortunately, I don't *think* there's anything in the new PM that acts as a 
"default pipeline that gets added to all other pipelines" similar to 
`PassManagerBuilder::populateModulePassManager`, so we'll need to manually 
include this somewhere in `PassBuilder::buildO0DefaultPipeline` and 
`PassBuilder::buildPerModuleDefaultPipeline`.

Depending on if we also want to support this in [thin]LTO, we may need to add 
this to more pipelines.




Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:46
+
+bool shouldGenerateRelLookupTableForGlobal(GlobalVariable ) {
+  if (!GlobalVar.hasInitializer() ||

We should also check if the switch table itself is dso_local since the right 
relocation won't be generated if it isn't.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:51
+
+  ConstantArray *Array = dyn_cast(GlobalVar.getInitializer());
+  // If values are not pointers, do not generate a relative lookup table.

`cast` since we checked before this is a ConstantArray. Alternatively, what you 
could have is:

```
if (!GlobalVar.hasInitializer())
  return false;

ConstantArray *Array = dyn_cast(GlobalVar.getInitializer());
if (!Array || !Array->getType()->getElementType()->isPointerTy())
  return false;
```



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:61
+if (GlobalVarOp &&
+!(GlobalVarOp->isDSOLocal() || GlobalVarOp->hasLocalLinkage()))
+  return false;

Rather than checking the linkage explicitly, you can use `isImplicitDSOLocal` 
which also has some visibility checks.

```
GlobalVarOp->isDSOLocal() || GlobalVarOp->isImplicitDSOLocal()
```



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:64-70
+ConstantExpr *CE = dyn_cast(Operand);
+if (!CE || CE->getOpcode() != Instruction::GetElementPtr)
+  return false;
+
+GlobalValue *Pointer = dyn_cast(CE->getOperand(0));
+if (!Pointer)
+  return false;

It seems that with this, we're limiting this to only arrays with GEPs with 
globals as the base, but I think this will return false if the array element is 
just a dso_local global. We definitely should still be taking into account GEPs 
though.

I'm thinking `IsConstantOffsetFromGlobal` might be more useful here since it 
already contains a bunch of logic for handling ConstantExpr GEPs, then you can 
check if the global found by that is dso_local.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:76-77
+
+  /// If lookup table has more than one user,
+  /// do not generate a relative lookup table.
+  if (!GlobalVar.hasOneUser())

What's the reason for why the number of users matters?



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:92-95
+  GlobalVariable *RelLookupTable =
+  new GlobalVariable(Mod, IntArrayTy,
+ /*isConstant=*/true, GlobalVariable::PrivateLinkage,
+ nullptr, "reltable." + Func.getName());

I think the visibility and linkage should be set the same as those of the 
original lookup table.

I think to avoid many changes to the original lookup table and only focus on 
the new layout, we should also propagate any properties of the original table 
to the new relative table. That is, visibility, linkage, attributes, 
unnamed_addr, etc. should be copied from the original.

(For copying attributes, you can use `copyAttributesFrom`.)



Comment at: llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll:31
+
+@switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* 
@a, i32* @b, i32* @c], align 8
+

We should also have cases that cover other linkages/visibilities:
- If the table elements are `extern dso_local` or `extern hidden`, we should 
still expect a relative lookup table
- If the switch table is not dso_local/hidden, we shouldn't expect a relative 
lookup table


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94355/new/

https://reviews.llvm.org/D94355

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94355: [Passes] Add relative lookup table generator pass

2021-02-18 Thread Hans Wennborg via Phabricator via cfe-commits
hans added a comment.

Thanks for pushing this forward! I think this will be a nice transformation 
once all the details are worked out.




Comment at: clang/test/CodeGen/switch-to-lookup-table.c:2
+// Check switch to lookup optimization in fPIC and fno-PIC mode
+// RUN: %clang_cc1 %s -triple=aarch64-unknown-fuchsia -O2 -fno-PIC -S 
-emit-llvm -o - | FileCheck %s --check-prefix=CHECK --check-prefix=FNOPIC
+// RUN: %clang_cc1 %s -triple=aarch64-unknown-fuchsia -O2 -fPIC -S -emit-llvm 
-o - | FileCheck %s --check-prefix=CHECK --check-prefix=FPIC

gulfem wrote:
> hans wrote:
> > Clang codegen tests are not normally used to test LLVM optimizations. I 
> > think the tests for this should all live in LLVM, not Clang.
> > 
> > (Also aarch64 is not guaranteed to be included as a target in the LLVM 
> > build, so this test would not necessarily run.)
> I'm not able to use -fPIC and -fno-PIC options in the `opt` tool.
> I am setting the `PIC Level` flag to enable -fPIC in `opt.
> I thought that testing -fPIC and -fno-PIC in the same file seems easier and 
> more readable  in CodeGen tests.
> Please let me know if you have a good suggestion how to do that with `opt`.
> 
> I changed the target to `x86_64-linux` in this test.
Buildbots may not have x86_64 as a registered target, so this test will break 
some buildbots.

I think the opt flags -relocation-model=pic and -relocation-model=static will 
do the right thing (see for example 
llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll



Comment at: llvm/include/llvm/InitializePasses.h:321
 void initializeNameAnonGlobalLegacyPassPass(PassRegistry&);
+void initializeRelLookupTableConverterPassPass(PassRegistry &);
 void initializeUniqueInternalLinkageNamesLegacyPassPass(PassRegistry &);

In some places the pass is referred to as a generator and here it's a 
converter. I think converter is a better name, and it should be consistent.



Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:10
+// This file implements relative lookup table generator that converts
+// lookup tables to relative lookup tables to make them PIC-friendly.
+//

For this kind of pass, I think it would be helpful to have a small example in 
the comment that shows what kind of IR it's looking for, and what it will be 
transformed to. (Either here or in the .cpp file.)



Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:22
+
+// Simple pass that converts lookup tables to relative lookup tables.
+class RelLookupTableGeneratorPass

No need to call it simple :)



Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:33
+
+#endif // LLVM_TRANSFORMS_RELLOOKUPTABLEGENERATOR_H

As clang-tidy points out, the comment doesn't match the actual macro name.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:25
+
+namespace llvm {
+

Since the function below are only used within this file, they should be static 
or in an anonymous namespace. Since you're already using an anonymous namespace 
for the RelLookupTableConverterPass class, I'd suggest using that for these 
functions too.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:28
+bool shouldGenerateRelLookupTables(Module ) {
+  // If not in x86 or aarch64 mode, do not generate a relative lookup table.
+  Triple TargetTriple(M.getTargetTriple());

This explains the "what" of what the code does, but not the "why". Why should 
the transformation only run for these two targets?



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:34
+
+  // If not tiny or small code model, do not generate a relative lookup table.
+  Optional CodeModel = M.getCodeModel();

This also needs the "why".



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:39
+
+  // If not in PIC mode, do not generate a relative lookup table.
+  if (M.getPICLevel() == PICLevel::NotPIC)

Again, this needs the "why".
And perhaps it would make sense to put this check first.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:58
+GlobalVariable *GlobalVarOp = dyn_cast(Operand);
+/// If any of the pointer values in the lookup table is not a global value
+/// or dso_local, do not generate a relative lookup table.

The comment doesn't match the code exactly I think, since further down you also 
allow GetElementPtr expressions. Maybe the comment could be clearer.



Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:78
+  /// do not generate a relative lookup table.
+  if (!GlobalVar.hasOneUser())
+return false;

Should it be hasOneUse() or hasOneUser()?

Also maybe this