[PATCH] D94355: [Passes] Add relative lookup table generator pass
gulfem updated this revision to Diff 326265. gulfem added a comment. Herald added subscribers: wenlei, nikic, kerbowa, steven_wu, nhaehnle, jvesely. - Rename the pass to RelLookupTableConverter to be consistent - Addressed reviewers' feedback - Added tests for user-defined lookup tables and hidden visibility Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D94355/new/ https://reviews.llvm.org/D94355 Files: clang/test/CodeGen/switch-to-lookup-table.c llvm/docs/Passes.rst llvm/include/llvm/InitializePasses.h llvm/include/llvm/Transforms/Scalar.h llvm/include/llvm/Transforms/Utils/RelLookupTableConverter.h llvm/lib/Passes/PassBuilder.cpp llvm/lib/Passes/PassRegistry.def llvm/lib/Transforms/IPO/PassManagerBuilder.cpp llvm/lib/Transforms/Utils/CMakeLists.txt llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp llvm/lib/Transforms/Utils/Utils.cpp llvm/test/CodeGen/AMDGPU/opt-pipeline.ll llvm/test/Other/new-pm-defaults.ll llvm/test/Other/new-pm-thinlto-defaults.ll llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll llvm/test/Other/opt-O2-pipeline.ll llvm/test/Other/opt-O3-pipeline-enable-matrix.ll llvm/test/Other/opt-O3-pipeline.ll llvm/test/Other/opt-Os-pipeline.ll llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn Index: llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn === --- llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn +++ llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn @@ -60,6 +60,7 @@ "NameAnonGlobals.cpp", "PredicateInfo.cpp", "PromoteMemoryToRegister.cpp", +"RelLookupTableConverter.cpp" "SSAUpdater.cpp", "SSAUpdaterBulk.cpp", "SampleProfileLoaderBaseUtil.cpp", Index: llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll === --- /dev/null +++ llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll @@ -0,0 +1,310 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -rel-lookup-table-converter -S | FileCheck %s +; RUN: opt < %s -passes=rel-lookup-table-converter -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +@.str = private unnamed_addr constant [5 x i8] c"zero\00", align 1 +@.str.1 = private unnamed_addr constant [4 x i8] c"one\00", align 1 +@.str.2 = private unnamed_addr constant [4 x i8] c"two\00", align 1 +@.str.3 = private unnamed_addr constant [8 x i8] c"default\00", align 1 +@.str.4 = private unnamed_addr constant [6 x i8] c"three\00", align 1 +@.str.5 = private unnamed_addr constant [5 x i8] c"str1\00", align 1 +@.str.6 = private unnamed_addr constant [5 x i8] c"str2\00", align 1 +@.str.7 = private unnamed_addr constant [12 x i8] c"singlevalue\00", align 1 + +@a1 = external global i32, align 4 +@b1 = external global i32, align 4 +@c1 = external global i32, align 4 +@d1 = external global i32, align 4 + +@a2 = internal global i32 0, align 4 +@b2 = internal global i32 0, align 4 +@c2 = internal global i32 0, align 4 +@d2 = internal global i32 0, align 4 + +@hidden0 = external hidden global i32, align 8 +@hidden1 = external hidden global i32, align 8 +@hidden2 = external hidden global i32, align 8 +@hidden3 = external hidden global i32, align 8 + +@switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* @a1, i32* @b1, i32* @c1], align 8 + +@switch.table.dso_local = private unnamed_addr constant [3 x i32*] [i32* @a2, i32* @b2, i32* @c2], align 8 + +@switch.table.hidden = private unnamed_addr constant [3 x i32*] [i32* @hidden0, i32* @hidden1, i32* @hidden2], align 8 + +@switch.table.string_table = private unnamed_addr constant [3 x i8*] + [ + i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), + i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0), + i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0) + ], align 8 + +@switch.table.string_table_holes = private unnamed_addr constant [4 x i8*] + [ +i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), +i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0), +i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0), +i8* getelementptr inbounds ([6 x i8], [6 x i8]*
[PATCH] D94355: [Passes] Add relative lookup table generator pass
leonardchan added a comment. It looks like you have everything setup for running on the new PM, but it doesn't look like the pass is added anywhere in the new PM pipeline. Unfortunately, I don't *think* there's anything in the new PM that acts as a "default pipeline that gets added to all other pipelines" similar to `PassManagerBuilder::populateModulePassManager`, so we'll need to manually include this somewhere in `PassBuilder::buildO0DefaultPipeline` and `PassBuilder::buildPerModuleDefaultPipeline`. Depending on if we also want to support this in [thin]LTO, we may need to add this to more pipelines. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:46 + +bool shouldGenerateRelLookupTableForGlobal(GlobalVariable ) { + if (!GlobalVar.hasInitializer() || We should also check if the switch table itself is dso_local since the right relocation won't be generated if it isn't. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:51 + + ConstantArray *Array = dyn_cast(GlobalVar.getInitializer()); + // If values are not pointers, do not generate a relative lookup table. `cast` since we checked before this is a ConstantArray. Alternatively, what you could have is: ``` if (!GlobalVar.hasInitializer()) return false; ConstantArray *Array = dyn_cast(GlobalVar.getInitializer()); if (!Array || !Array->getType()->getElementType()->isPointerTy()) return false; ``` Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:61 +if (GlobalVarOp && +!(GlobalVarOp->isDSOLocal() || GlobalVarOp->hasLocalLinkage())) + return false; Rather than checking the linkage explicitly, you can use `isImplicitDSOLocal` which also has some visibility checks. ``` GlobalVarOp->isDSOLocal() || GlobalVarOp->isImplicitDSOLocal() ``` Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:64-70 +ConstantExpr *CE = dyn_cast(Operand); +if (!CE || CE->getOpcode() != Instruction::GetElementPtr) + return false; + +GlobalValue *Pointer = dyn_cast(CE->getOperand(0)); +if (!Pointer) + return false; It seems that with this, we're limiting this to only arrays with GEPs with globals as the base, but I think this will return false if the array element is just a dso_local global. We definitely should still be taking into account GEPs though. I'm thinking `IsConstantOffsetFromGlobal` might be more useful here since it already contains a bunch of logic for handling ConstantExpr GEPs, then you can check if the global found by that is dso_local. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:76-77 + + /// If lookup table has more than one user, + /// do not generate a relative lookup table. + if (!GlobalVar.hasOneUser()) What's the reason for why the number of users matters? Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:92-95 + GlobalVariable *RelLookupTable = + new GlobalVariable(Mod, IntArrayTy, + /*isConstant=*/true, GlobalVariable::PrivateLinkage, + nullptr, "reltable." + Func.getName()); I think the visibility and linkage should be set the same as those of the original lookup table. I think to avoid many changes to the original lookup table and only focus on the new layout, we should also propagate any properties of the original table to the new relative table. That is, visibility, linkage, attributes, unnamed_addr, etc. should be copied from the original. (For copying attributes, you can use `copyAttributesFrom`.) Comment at: llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll:31 + +@switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* @a, i32* @b, i32* @c], align 8 + We should also have cases that cover other linkages/visibilities: - If the table elements are `extern dso_local` or `extern hidden`, we should still expect a relative lookup table - If the switch table is not dso_local/hidden, we shouldn't expect a relative lookup table Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D94355/new/ https://reviews.llvm.org/D94355 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D94355: [Passes] Add relative lookup table generator pass
hans added a comment. Thanks for pushing this forward! I think this will be a nice transformation once all the details are worked out. Comment at: clang/test/CodeGen/switch-to-lookup-table.c:2 +// Check switch to lookup optimization in fPIC and fno-PIC mode +// RUN: %clang_cc1 %s -triple=aarch64-unknown-fuchsia -O2 -fno-PIC -S -emit-llvm -o - | FileCheck %s --check-prefix=CHECK --check-prefix=FNOPIC +// RUN: %clang_cc1 %s -triple=aarch64-unknown-fuchsia -O2 -fPIC -S -emit-llvm -o - | FileCheck %s --check-prefix=CHECK --check-prefix=FPIC gulfem wrote: > hans wrote: > > Clang codegen tests are not normally used to test LLVM optimizations. I > > think the tests for this should all live in LLVM, not Clang. > > > > (Also aarch64 is not guaranteed to be included as a target in the LLVM > > build, so this test would not necessarily run.) > I'm not able to use -fPIC and -fno-PIC options in the `opt` tool. > I am setting the `PIC Level` flag to enable -fPIC in `opt. > I thought that testing -fPIC and -fno-PIC in the same file seems easier and > more readable in CodeGen tests. > Please let me know if you have a good suggestion how to do that with `opt`. > > I changed the target to `x86_64-linux` in this test. Buildbots may not have x86_64 as a registered target, so this test will break some buildbots. I think the opt flags -relocation-model=pic and -relocation-model=static will do the right thing (see for example llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll Comment at: llvm/include/llvm/InitializePasses.h:321 void initializeNameAnonGlobalLegacyPassPass(PassRegistry&); +void initializeRelLookupTableConverterPassPass(PassRegistry &); void initializeUniqueInternalLinkageNamesLegacyPassPass(PassRegistry &); In some places the pass is referred to as a generator and here it's a converter. I think converter is a better name, and it should be consistent. Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:10 +// This file implements relative lookup table generator that converts +// lookup tables to relative lookup tables to make them PIC-friendly. +// For this kind of pass, I think it would be helpful to have a small example in the comment that shows what kind of IR it's looking for, and what it will be transformed to. (Either here or in the .cpp file.) Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:22 + +// Simple pass that converts lookup tables to relative lookup tables. +class RelLookupTableGeneratorPass No need to call it simple :) Comment at: llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h:33 + +#endif // LLVM_TRANSFORMS_RELLOOKUPTABLEGENERATOR_H As clang-tidy points out, the comment doesn't match the actual macro name. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:25 + +namespace llvm { + Since the function below are only used within this file, they should be static or in an anonymous namespace. Since you're already using an anonymous namespace for the RelLookupTableConverterPass class, I'd suggest using that for these functions too. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:28 +bool shouldGenerateRelLookupTables(Module ) { + // If not in x86 or aarch64 mode, do not generate a relative lookup table. + Triple TargetTriple(M.getTargetTriple()); This explains the "what" of what the code does, but not the "why". Why should the transformation only run for these two targets? Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:34 + + // If not tiny or small code model, do not generate a relative lookup table. + Optional CodeModel = M.getCodeModel(); This also needs the "why". Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:39 + + // If not in PIC mode, do not generate a relative lookup table. + if (M.getPICLevel() == PICLevel::NotPIC) Again, this needs the "why". And perhaps it would make sense to put this check first. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:58 +GlobalVariable *GlobalVarOp = dyn_cast(Operand); +/// If any of the pointer values in the lookup table is not a global value +/// or dso_local, do not generate a relative lookup table. The comment doesn't match the code exactly I think, since further down you also allow GetElementPtr expressions. Maybe the comment could be clearer. Comment at: llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp:78 + /// do not generate a relative lookup table. + if (!GlobalVar.hasOneUser()) +return false; Should it be hasOneUse() or hasOneUser()? Also maybe this