[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

2023-10-12 Thread Hongtao Yu via cfe-commits

htyu wrote:


> The performance win depends a lot on value distribution. For large copies, 
> using SIMD with nontemporal hint is the way to go.

Right, and the dominating single-range distribution is also important for our 
approach, similar to how speculative indirect call promotion works.

Recently we have also found that rep mov has started to outperform nontemporal 
SIMD for very large copies, e.g., above 1KB.

https://github.com/llvm/llvm-project/pull/66825
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

2023-10-12 Thread Hongtao Yu via cfe-commits

htyu wrote:

> @htyu Pretty much a drive-by question if it's convenient for you to share 
> more, how ranges are selected out of sampled values? For example, are ranges 
> the same for all workloads or dynamically generated based on the distribution 
> of size from the per-workload profile data?

We group sampled values by ranges pretty much based on the current 
`folly::memcpy` implementation:

  ```
  0 
  1 
  [2,3] 
  [4,7] 
  [8,16] 
  [17,32] 
  [33,64] 
  [65,128] 
  [129,inf] 
  ```

Values in a range share the same memcpy code, i.e., a pair of forward and 
backward copies. 

The range layout is fixed and hardcoded in LLVM. The compiler can choose to 
prioritize a specific range based on the value profile it sees. The profile is 
just like an LBR profile, which is collected per service. 
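
As a rough illustration of the bucketing described above, here is a minimal 
C++ sketch. It assumes only the range list quoted in this message; the names 
`SizeRange`, `kSizeRanges`, `rangeIndexForSize` and `dominantRange` are 
hypothetical and are not taken from LLVM or `folly`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Inclusive size range [Lo, Hi]; the bounds mirror the list quoted above.
struct SizeRange {
  std::uint64_t Lo;
  std::uint64_t Hi;
};

constexpr std::array<SizeRange, 9> kSizeRanges = {{
    {0, 0}, {1, 1}, {2, 3}, {4, 7}, {8, 16}, {17, 32}, {33, 64}, {65, 128},
    {129, std::numeric_limits<std::uint64_t>::max()},
}};

// Hypothetical helper: index of the fixed range that contains Size.
inline std::size_t rangeIndexForSize(std::uint64_t Size) {
  for (std::size_t I = 0; I < kSizeRanges.size(); ++I)
    if (Size >= kSizeRanges[I].Lo && Size <= kSizeRanges[I].Hi)
      return I;
  assert(false && "the ranges cover every uint64_t value");
  return kSizeRanges.size() - 1;
}

// Hypothetical helper: bucket sampled sizes into per-range counts and return
// the index of the most frequently hit range (ties go to the lowest index).
template <typename It>
std::size_t dominantRange(It First, It Last) {
  std::array<std::uint64_t, kSizeRanges.size()> Counts{};
  for (; First != Last; ++First)
    ++Counts[rangeIndexForSize(*First)];
  std::size_t Best = 0;
  for (std::size_t I = 1; I < Counts.size(); ++I)
    if (Counts[I] > Counts[Best])
      Best = I;
  return Best;
}
```

A real implementation would presumably also require the dominant range to 
exceed some hotness threshold before specializing; that policy is not 
described here, so the sketch leaves it out.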

So at compile time we may have a transformation such as:

memcpy(dst, src, size)

=> 

```
if (33 <= size && size <= 64)
  vmovups  (%rsi), %ymm0
  vmovups  -32(%rsi,%rdx), %ymm1
  vmovups  %ymm0, (%rdi)
  vmovups  %ymm1, -32(%rdi,%rdx)
else
  call memcpy(dst, src, size)
```
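
For readers less used to the assembly, the same idea can be written as a 
minimal C++ sketch. It assumes the [33,64] range is the hot one; the function 
name `copyPreferring33To64` is hypothetical, and the fixed-size `std::memcpy` 
calls merely stand in for the 32-byte vector loads and stores shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical size-specialized copy: sizes in [33, 64] are handled by two
// 32-byte chunks that may overlap in the middle, one anchored at the start
// and one anchored at the end. Every other size takes the generic path.
inline void *copyPreferring33To64(void *Dst, const void *Src,
                                  std::size_t Size) {
  assert((Size == 0 || (Dst && Src)) && "null buffer with nonzero size");
  if (Size >= 33 && Size <= 64) {
    const unsigned char *S = static_cast<const unsigned char *>(Src);
    unsigned char *D = static_cast<unsigned char *>(Dst);
    unsigned char Head[32];
    unsigned char Tail[32];
    // Load both chunks before storing either, as in the assembly above.
    std::memcpy(Head, S, 32);
    std::memcpy(Tail, S + Size - 32, 32);
    std::memcpy(D, Head, 32);
    std::memcpy(D + Size - 32, Tail, 32);
    return Dst;
  }
  return std::memcpy(Dst, Src, Size);
}
```

Because the two chunks together always cover the whole buffer when the size 
is between 33 and 64, no loop or tail handling is needed on the fast path.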




https://github.com/llvm/llvm-project/pull/66825


[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

2023-10-11 Thread Hongtao Yu via cfe-commits

htyu wrote:

> The AutoFDO support Mingming mentioned is the vtable profiling part using 
> the MEM_INST_RETIRED event, which captures data addresses. This data access 
> profiling will/can also be used for global variable layout. However, this is 
> currently Intel-only, so having a branch-profiling-based method can be useful 
> overall.

@david-xl  It's interesting to know this. How is that going on your end? We've 
been using a similar technique to do memcpy size optimization. We can currently 
generate a value profile that fits into the existing LBR profile format and is 
consumed together with it by the compiler. We haven't upstreamed this work yet 
since we are still evaluating the effectiveness of the optimization. But a 
common problem here could be how to generalize the AutoFDO profile format to 
incorporate indirect call targets, callsite parameter values and other types of 
values. Do you have a plan for that? Maybe we can work together on this.

https://github.com/llvm/llvm-project/pull/66825


[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

2023-09-19 Thread Hongtao Yu via cfe-commits

htyu wrote:

> > The work sounds interesting. Can you provide a bit more context about it? 
> > Will it be used to improve ICP when it's sufficient to just compare the 
> > vtable address instead of the vfunc address?
> 
> Yes. It can not only eliminate the vtable load, but also enable target check 
> combining.
> 
> What is more important is that it can be combined with more aggressive 
> interprocedural type propagation that enables full (unconditional) 
> devirtualization. Example:
> 
> base->foo(); base->bar(); ==>
> if (base->vptr == Derived) {
>   // base type is known, so virtual calls in foo and bar can be
>   // devirtualized further.
>   Derived::foo(base);
>   Derived::bar(base);
> } else {.. }

Thanks for the illustration! Have you enabled this in your fleet, and how much 
performance improvement have you seen?

We've also been thinking about similar work based on sample PGO, in both the 
compiler and BOLT.
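
The quoted transformation can be approximated at source level with the 
following minimal C++ sketch. The class and function names are hypothetical, 
and `typeid` equality is used only as a portable stand-in for the vtable 
address comparison, which cannot be written in standard C++.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
  virtual ~Base() = default;
  virtual int foo() { return 1; }
  virtual int bar() { return 2; }
};

struct Derived final : Base {
  int foo() override { return 10; }
  int bar() override { return 20; }
};

// Unoptimized form: two virtual calls, each dispatched through the vtable.
inline int callBoth(Base *B) { return B->foo() + B->bar(); }

// Promoted form: a single dynamic-type check guards both calls.
inline int callBothPromoted(Base *B) {
  assert(B && "null object");
  if (typeid(*B) == typeid(Derived)) {
    Derived *D = static_cast<Derived *>(B);
    // Qualified names make these direct, non-virtual calls, so they can be
    // inlined and any virtual calls inside them devirtualized in turn.
    return D->Derived::foo() + D->Derived::bar();
  }
  // Fallback: ordinary virtual dispatch for every other dynamic type.
  return B->foo() + B->bar();
}
```

With function-address-based promotion each call site would need its own 
comparison; checking the type once is what lets the two checks be combined.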

https://github.com/llvm/llvm-project/pull/66825


[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)

2023-09-19 Thread Hongtao Yu via cfe-commits

htyu wrote:

The work sounds interesting. Can you provide a bit more context about it? Will 
it be used to improve ICP when it's sufficient to just compare the vtable 
address instead of the vfunc address? 

https://github.com/llvm/llvm-project/pull/66825


[clang] 56e36e4 - [UsersManual] Add llvm-progen as an alternative tool for AutoFDO profile generation.

2023-06-20 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2023-06-20T13:26:30-07:00
New Revision: 56e36e47f50446f519dfa2132cf5a49cc0816866

URL: 
https://github.com/llvm/llvm-project/commit/56e36e47f50446f519dfa2132cf5a49cc0816866
DIFF: 
https://github.com/llvm/llvm-project/commit/56e36e47f50446f519dfa2132cf5a49cc0816866.diff

LOG: [UsersManual] Add llvm-progen as an alternative tool for AutoFDO profile 
generation.

I'm adding llvm-profgen as an alternative AutoFDO profile generator to the user 
manual. llvm-profgen is widely used and tested by Meta as their default profile 
generator.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D142994

Added: 


Modified: 
clang/docs/UsersManual.rst

Removed: 




diff  --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index f8dbf819d6ff7..6a644eed6d32b 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2400,6 +2400,14 @@ usual build cycle when using sample profilers for 
optimization:
without the ``-b`` flag, you need to use ``--use_lbr=false`` when
calling ``create_llvm_prof``.
 
+   Alternatively, the LLVM tool ``llvm-profgen`` can also be used to generate
+   the LLVM sample profile:
+
+   .. code-block:: console
+
+ $ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
+
+
 4. Build the code again using the collected profile. This step feeds
the profile back to the optimizers. This should result in a binary
that executes faster than the original one. Note that you are not





[clang] 4d75717 - [NFC] Add split-file as runtime test dependency

2023-02-02 Thread Hongtao Yu via cfe-commits

Author: YongKang Zhu
Date: 2023-02-02T11:22:26-08:00
New Revision: 4d757177df473691640095ea8b0e2104a97af83f

URL: 
https://github.com/llvm/llvm-project/commit/4d757177df473691640095ea8b0e2104a97af83f
DIFF: 
https://github.com/llvm/llvm-project/commit/4d757177df473691640095ea8b0e2104a97af83f.diff

LOG: [NFC] Add split-file as runtime test dependency

Here is a similar change that adds `split-file` as compiler-rt test dependency: 
https://reviews.llvm.org/rG0eb01a9c4581a24c163f3464cebdb20534fbda35

Reviewed By: thevinster

Differential Revision: https://reviews.llvm.org/D143123

Added: 


Modified: 
clang/runtime/CMakeLists.txt
llvm/runtimes/CMakeLists.txt

Removed: 




diff  --git a/clang/runtime/CMakeLists.txt b/clang/runtime/CMakeLists.txt
index 0cccf730e417d..94b5d783ce361 100644
--- a/clang/runtime/CMakeLists.txt
+++ b/clang/runtime/CMakeLists.txt
@@ -132,7 +132,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS 
${COMPILER_RT_SRC_ROOT}/)
   if(LLVM_INCLUDE_TESTS)
 # Add binaries that compiler-rt tests depend on.
 set(COMPILER_RT_TEST_DEPENDENCIES
-  FileCheck count not llvm-nm llvm-objdump llvm-symbolizer llvm-jitlink 
lli)
+  FileCheck count not llvm-nm llvm-objdump llvm-symbolizer llvm-jitlink 
lli split-file)
 
 # Add top-level targets for various compiler-rt test suites.
 set(COMPILER_RT_TEST_SUITES check-fuzzer check-asan check-hwasan 
check-asan-dynamic check-dfsan

diff  --git a/llvm/runtimes/CMakeLists.txt b/llvm/runtimes/CMakeLists.txt
index 2d0689d1ed4ec..5bb890452c215 100644
--- a/llvm/runtimes/CMakeLists.txt
+++ b/llvm/runtimes/CMakeLists.txt
@@ -503,6 +503,7 @@ if(runtimes)
 sanstats
 llvm_gtest_main
 llvm_gtest
+split-file
   )
 foreach(target ${test_targets} ${SUB_CHECK_TARGETS})
   add_dependencies(${target} ${RUNTIMES_TEST_DEPENDS})





[clang] 72acd04 - Pass split-machine-functions to code generator when flto is used

2022-03-23 Thread Hongtao Yu via cfe-commits

Author: Junfeng Dong
Date: 2022-03-23T08:55:30-07:00
New Revision: 72acd042bad35f78232f17addc02196a7af1a6e9

URL: 
https://github.com/llvm/llvm-project/commit/72acd042bad35f78232f17addc02196a7af1a6e9
DIFF: 
https://github.com/llvm/llvm-project/commit/72acd042bad35f78232f17addc02196a7af1a6e9.diff

LOG: Pass split-machine-functions to code generator when flto is used

-fsplit-machine-functions is an optimization in the codegen phase. When -flto 
is used, clang generates IR bitcode in .o files, and the linker calls into 
these codegen optimization passes. The current clang driver doesn't pass this 
option to the linker when both -fsplit-machine-functions and -flto are used, so 
the optimization is silently ignored. My fix generates the linker option 
-plugin-opt=-split-machine-functions for this case. It allows the linker to 
pass "split-machine-functions" to the code generator to turn on that 
optimization. It works for both gold and lld.
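
The intended "last flag wins" behaviour can be modelled with a minimal, 
self-contained C++ sketch. This is not the driver code from the patch: 
`lastOf` and `ltoLinkerOptions` are hypothetical names, plain strings replace 
clang's option classes, and LTO detection is simplified to looking for `-flto` 
or `-flto=...` (it ignores `-fno-lto`).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: pointer to the last occurrence of Pos or Neg in Args,
// or nullptr if neither spelling is present.
inline const std::string *lastOf(const std::vector<std::string> &Args,
                                 const std::string &Pos,
                                 const std::string &Neg) {
  assert(Pos != Neg && "positive and negative spellings must differ");
  const std::string *Last = nullptr;
  for (const std::string &A : Args)
    if (A == Pos || A == Neg)
      Last = &A;
  return Last;
}

// Hypothetical model of the linker options added for an LTO link.
inline std::vector<std::string>
ltoLinkerOptions(const std::vector<std::string> &Args) {
  std::vector<std::string> CmdArgs;
  bool IsLTO = false;
  for (const std::string &A : Args)
    if (A == "-flto" || A.compare(0, 6, "-flto=") == 0)
      IsLTO = true;
  if (!IsLTO)
    return CmdArgs; // Without LTO nothing is forwarded to the linker.

  const std::string Pos = "-fsplit-machine-functions";
  const std::string Neg = "-fno-split-machine-functions";
  const std::string *Last = lastOf(Args, Pos, Neg);
  if (Last && *Last == Pos)
    CmdArgs.push_back("-plugin-opt=-split-machine-functions");
  return CmdArgs;
}
```

The four cases exercised by the new driver test (LTO with the flag, no LTO, 
and the two orderings of the positive and negative flag) map directly onto 
calls to `ltoLinkerOptions`.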

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D121969

Added: 
clang/test/Driver/fsplit-machine-functions2.c

Modified: 
clang/lib/Driver/ToolChains/CommonArgs.cpp

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 2f3dc86eaad1d..156821a6e7854 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -574,6 +574,13 @@ void tools::addLTOOptions(const ToolChain , 
const ArgList ,
 CmdArgs.push_back("-plugin-opt=-data-sections");
   }
 
+  // Pass an option to enable split machine functions.
+  if (auto *A = Args.getLastArg(options::OPT_fsplit_machine_functions,
+options::OPT_fno_split_machine_functions)) {
+if (A->getOption().matches(options::OPT_fsplit_machine_functions))
+  CmdArgs.push_back("-plugin-opt=-split-machine-functions");
+  }
+
   if (Arg *A = getLastProfileSampleUseArg(Args)) {
 StringRef FName = A->getValue();
 if (!llvm::sys::fs::exists(FName))

diff  --git a/clang/test/Driver/fsplit-machine-functions2.c 
b/clang/test/Driver/fsplit-machine-functions2.c
new file mode 100644
index 0..1b81be084eff9
--- /dev/null
+++ b/clang/test/Driver/fsplit-machine-functions2.c
@@ -0,0 +1,12 @@
+// Test -fsplit-machine-functions option pass-through with lto
+// RUN: %clang -### -target x86_64-unknown-linux -flto 
-fsplit-machine-functions %s 2>&1 | FileCheck %s -check-prefix=CHECK-PASS
+
+// Test no pass-through to ld without lto
+// RUN: %clang -### -target x86_64-unknown-linux -fsplit-machine-functions %s 
2>&1 | FileCheck %s -check-prefix=CHECK-NOPASS
+
+// Test the mix of -fsplit-machine-functions and -fno-split-machine-functions
+// RUN: %clang -### -target x86_64-unknown-linux -flto 
-fsplit-machine-functions -fno-split-machine-functions %s 2>&1 | FileCheck %s 
-check-prefix=CHECK-NOPASS
+// RUN: %clang -### -target x86_64-unknown-linux -flto 
-fno-split-machine-functions -fsplit-machine-functions %s 2>&1 | FileCheck %s 
-check-prefix=CHECK-PASS
+
+// CHECK-PASS:  "-plugin-opt=-split-machine-functions"
+// CHECK-NOPASS-NOT:"-plugin-opt=-split-machine-functions"





[clang] e9d1a67 - [CSSPGO] Do not pass -fpseudo-probe-for-profiling to the linker.

2021-09-23 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-09-23T15:50:40-07:00
New Revision: e9d1a679a1c9cb309aea8c5d944e55865d38b867

URL: 
https://github.com/llvm/llvm-project/commit/e9d1a679a1c9cb309aea8c5d944e55865d38b867
DIFF: 
https://github.com/llvm/llvm-project/commit/e9d1a679a1c9cb309aea8c5d944e55865d38b867.diff

LOG: [CSSPGO] Do not pass -fpseudo-probe-for-profiling to the linker.

The corresponding linker switch has been removed by 
https://reviews.llvm.org/D110209, so do not pass it in clang.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110371

Added: 


Modified: 
clang/lib/Driver/ToolChains/CommonArgs.cpp

Removed: 
clang/test/Driver/pseudo-probe-lto.c



diff  --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index f440fd4ca33d3..9f1895466c98d 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -620,11 +620,6 @@ void tools::addLTOOptions(const ToolChain , 
const ArgList ,
   CmdArgs.push_back("-plugin-opt=new-pass-manager");
   }
 
-  // Pass an option to enable pseudo probe emission.
-  if (Args.hasFlag(options::OPT_fpseudo_probe_for_profiling,
-   options::OPT_fno_pseudo_probe_for_profiling, false))
-CmdArgs.push_back("-plugin-opt=pseudo-probe-for-profiling");
-
   // Setup statistics file output.
   SmallString<128> StatsFile = getStatsFileName(Args, Output, Input, D);
   if (!StatsFile.empty())

diff  --git a/clang/test/Driver/pseudo-probe-lto.c 
b/clang/test/Driver/pseudo-probe-lto.c
deleted file mode 100644
index e319b8c0098bf..0
--- a/clang/test/Driver/pseudo-probe-lto.c
+++ /dev/null
@@ -1,10 +0,0 @@
-// RUN: touch %t.o
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 
-fpseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=PROBE
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto=thin 
-fpseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=PROBE
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 
-fno-pseudo-probe-for-profiling -fpseudo-probe-for-profiling 2>&1 | FileCheck 
%s --check-prefix=PROBE
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 2>&1 | FileCheck 
%s --check-prefix=NOPROBE
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 
-fno-pseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=NOPROBE
-// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 
-fpseudo-probe-for-profiling -fno-pseudo-probe-for-profiling 2>&1 | FileCheck 
%s --check-prefix=NOPROBE
-
-// PROBE: -plugin-opt=pseudo-probe-for-profiling
-// NOPROBE-NOT: -plugin-opt=pseudo-probe-for-profiling





[clang] d9b511d - [CSSPGO] Set PseudoProbeInserter as a default pass.

2021-09-22 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-09-22T09:09:48-07:00
New Revision: d9b511d8e8c43f79e0e277be287656693dd6563f

URL: 
https://github.com/llvm/llvm-project/commit/d9b511d8e8c43f79e0e277be287656693dd6563f
DIFF: 
https://github.com/llvm/llvm-project/commit/d9b511d8e8c43f79e0e277be287656693dd6563f.diff

LOG: [CSSPGO] Set PseudoProbeInserter as a default pass.

Currently PseudoProbeInserter is a pass conditioned on a target switch. It 
works well with a single clang invocation. It doesn't work so well when the 
backend is called separately (i.e., through the linker or llc), where the user 
always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass 
a default pass that requires no command line arg to trigger, but that actually 
runs only when the CU comes with `llvm.pseudo_probe_desc` metadata.
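
The "always scheduled, but gated on metadata" idea can be shown with a toy C++ 
sketch. `ToyModule` and `ToyProbeInserter` are hypothetical stand-ins and are 
not LLVM's `Module` or pass classes; only the metadata name 
`llvm.pseudo_probe_desc` comes from the description above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a module: just the names of its named metadata nodes.
struct ToyModule {
  std::set<std::string> NamedMetadata;
};

// Toy stand-in for a pass that is always in the pipeline but only does work
// when the module carries pseudo probe descriptors.
class ToyProbeInserter {
  bool Initialized = false;
  bool ShouldRun = false;

public:
  // Decide once per module whether there is anything to do.
  void doInitialization(const ToyModule &M) {
    ShouldRun = M.NamedMetadata.count("llvm.pseudo_probe_desc") != 0;
    Initialized = true;
  }

  // Returns true when the pass did its work, false when it was a no-op.
  bool runOnFunction() const {
    assert(Initialized && "doInitialization must be called first");
    return ShouldRun;
  }
};
```

Gating on the module contents rather than on a command-line switch means the 
linker and llc need no extra option: the information travels with the IR.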

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209

Added: 


Modified: 
clang/lib/CodeGen/BackendUtil.cpp
lld/ELF/Config.h
lld/ELF/Driver.cpp
lld/ELF/LTO.cpp
lld/ELF/Options.td
lld/test/ELF/lto/pseudo-probe-lto.ll
llvm/include/llvm/CodeGen/CommandFlags.h
llvm/include/llvm/Target/TargetOptions.h
llvm/lib/CodeGen/CommandFlags.cpp
llvm/lib/CodeGen/PseudoProbeInserter.cpp
llvm/lib/CodeGen/TargetPassConfig.cpp
llvm/lib/Target/X86/X86TargetMachine.cpp
llvm/test/CodeGen/X86/O0-pipeline.ll
llvm/test/CodeGen/X86/opt-pipeline.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-dangle.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-emit-inline.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-instsched.ll
llvm/test/tools/llvm-profgen/truncated-pseudoprobe.test

Removed: 




diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index e31fa3f9f94de..99e33b227f792 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -576,7 +576,6 @@ static bool initTargetOptions(DiagnosticsEngine ,
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = 
CodeGenOpts.EnableAIXExtendedAltivecABI;
-  Options.PseudoProbeForProfiling = CodeGenOpts.PseudoProbeForProfiling;
   Options.ValueTrackingVariableLocations =
   CodeGenOpts.ValueTrackingVariableLocations;
   Options.XRayOmitFunctionIndex = CodeGenOpts.XRayOmitFunctionIndex;

diff  --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index f9851d03e78bf..65101d29136e2 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -183,7 +183,6 @@ struct Configuration {
   bool ltoDebugPassManager;
   bool ltoEmitAsm;
   bool ltoNewPassManager;
-  bool ltoPseudoProbeForProfiling;
   bool ltoUniqueBasicBlockSectionNames;
   bool ltoWholeProgramVisibility;
   bool mergeArmExidx;

diff  --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6607c0fe15a4b..8cb81987163fc 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1084,8 +1084,6 @@ static void readConfigs(opt::InputArgList ) {
   config->ltoo = args::getInteger(args, OPT_lto_O, 2);
   config->ltoObjPath = args.getLastArgValue(OPT_lto_obj_path_eq);
   config->ltoPartitions = args::getInteger(args, OPT_lto_partitions, 1);
-  config->ltoPseudoProbeForProfiling =
-  args.hasArg(OPT_lto_pseudo_probe_for_profiling);
   config->ltoSampleProfile = args.getLastArgValue(OPT_lto_sample_profile);
   config->ltoBasicBlockSections =
   args.getLastArgValue(OPT_lto_basic_block_sections);

diff  --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 1f60e1e8a395c..fb354f81d49d6 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -112,7 +112,6 @@ static lto::Config createConfig() {
 }
   }
 
-  c.Options.PseudoProbeForProfiling = config->ltoPseudoProbeForProfiling;
   c.Options.UniqueBasicBlockSectionNames =
   config->ltoUniqueBasicBlockSectionNames;
 

diff  --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 874399d5f41f2..852a27d62812b 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -574,8 +574,6 @@ def lto_sample_profile: JJ<"lto-sample-profile=">,
 defm lto_whole_program_visibility: BB<"lto-whole-program-visibility",
   "Asserts that the LTO link has whole program visibility",
   "Asserts that the LTO link does not have whole program visibility">;
-def lto_pseudo_probe_for_profiling: F<"lto-pseudo-probe-for-profiling">,
-  HelpText<"Emit pseudo probes for sample profiling">;
 def disable_verify: F<"disable-verify">;
 defm mllvm: Eq<"mllvm", "Additional arguments to forward to LLVM's option 
processing">;
 def opt_remarks_filename: Separate<["--"], "opt-remarks-filename">,
@@ -651,8 +649,6 @@ def: F<"plugin-opt=opt-remarks-with-hotness">,
 def: J<"plugin-opt=opt-remarks-hotness-threshold=">,
   Alias,
   HelpText<"Alias for --opt-remarks-hotness-threshold">;
-def: 

[clang] 299b5d4 - [CSSPGO] Enable pseudo probe instrumentation in O0 mode.

2021-09-14 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-09-14T18:13:29-07:00
New Revision: 299b5d420df15fafc9936bc24995f6cd6ad325be

URL: 
https://github.com/llvm/llvm-project/commit/299b5d420df15fafc9936bc24995f6cd6ad325be
DIFF: 
https://github.com/llvm/llvm-project/commit/299b5d420df15fafc9936bc24995f6cd6ad325be.diff

LOG: [CSSPGO] Enable pseudo probe instrumentation in O0 mode.

Pseudo probe instrumentation was missing from O0 builds. It is needed in cases 
where some source files are built at O0 while the others are built in optimized 
mode.

Reviewed By: wenlei, wlei, wmi

Differential Revision: https://reviews.llvm.org/D109531

Added: 


Modified: 
clang/test/CodeGen/pseudo-probe-emit.c
llvm/lib/Passes/PassBuilder.cpp

Removed: 




diff  --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index 5fe1d23846763..b4e6a014d474b 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -1,3 +1,4 @@
+// RUN: %clang_cc1 -O0 -fno-legacy-pass-manager -fpseudo-probe-for-profiling 
-debug-info-kind=limited -emit-llvm -o - %s | FileCheck %s
 // RUN: %clang_cc1 -O2 -fno-legacy-pass-manager -fpseudo-probe-for-profiling 
-debug-info-kind=limited -emit-llvm -o - %s | FileCheck %s
 
 // Check the generation of pseudoprobe intrinsic call

diff  --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 6c3241ba3e52c..076ff95cb8d21 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -1924,6 +1924,13 @@ ModulePassManager 
PassBuilder::buildO0DefaultPipeline(OptimizationLevel Level,
 
   ModulePassManager MPM;
 
+  // Perform pseudo probe instrumentation in O0 mode. This is for the
+  // consistency between different build modes. For example, a LTO build can be
+  // mixed with an O0 prelink and an O2 postlink. Loading a sample profile in
+  // the postlink will require pseudo probe instrumentation in the prelink.
+  if (PGOOpt && PGOOpt->PseudoProbeForProfiling)
+MPM.addPass(SampleProfileProbePass(TM));
+
   if (PGOOpt && (PGOOpt->Action == PGOOptions::IRInstr ||
  PGOOpt->Action == PGOOptions::IRUse))
 addPGOInstrPassesForO0(





[clang] ccb5b9b - [CSSPGO] Allow the use of debug-info-for-profiling and pseudo-probe-for-profiling together

2021-08-12 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-08-12T08:52:49-07:00
New Revision: ccb5b9bbfb5cef1aa2982481894f30c8f81d5253

URL: 
https://github.com/llvm/llvm-project/commit/ccb5b9bbfb5cef1aa2982481894f30c8f81d5253
DIFF: 
https://github.com/llvm/llvm-project/commit/ccb5b9bbfb5cef1aa2982481894f30c8f81d5253.diff

LOG: [CSSPGO] Allow the use of debug-info-for-profiling and 
pseudo-probe-for-profiling together

Previously debug-info-for-profiling and pseudo-probe-for-profiling were 
mutually exclusive because they compete for the DWARF discriminator of 
callsites on the IR. This change allows the two switches to be used together. 
The side effect is that callsite discriminators will be taken by pseudo probes, 
while discriminators for other instructions are still available for AutoFDO 
use. This is less than ideal; however, it still gives us a chance to transition 
smoothly from AutoFDO to CSSPGO, by collecting both profiles from a CSSPGO 
binary.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D107876

Added: 
llvm/test/Transforms/SampleProfile/pseudo-probe-discriminator.ll

Modified: 
clang/lib/Driver/ToolChains/Clang.cpp
clang/test/CodeGenCXX/fdebug-info-for-profiling.cpp
clang/test/Driver/pseudo-probe.c
llvm/include/llvm/Passes/PassBuilder.h
llvm/tools/opt/NewPMDriver.cpp

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index ceeae94e56678..e19e1222f702e 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -3892,12 +3892,6 @@ static void renderDebugOptions(const ToolChain , 
const Driver ,
ArgStringList ,
codegenoptions::DebugInfoKind ,
DwarfFissionKind ) {
-  // These two forms of profiling info can't be used together.
-  if (const Arg *A1 = 
Args.getLastArg(options::OPT_fpseudo_probe_for_profiling))
-if (const Arg *A2 = 
Args.getLastArg(options::OPT_fdebug_info_for_profiling))
-  D.Diag(diag::err_drv_argument_not_allowed_with)
-  << A1->getAsString(Args) << A2->getAsString(Args);
-
   if (Args.hasFlag(options::OPT_fdebug_info_for_profiling,
options::OPT_fno_debug_info_for_profiling, false) &&
   checkDebugInfoOption(

diff  --git a/clang/test/CodeGenCXX/fdebug-info-for-profiling.cpp 
b/clang/test/CodeGenCXX/fdebug-info-for-profiling.cpp
index 0a66818b23be3..421195db83408 100644
--- a/clang/test/CodeGenCXX/fdebug-info-for-profiling.cpp
+++ b/clang/test/CodeGenCXX/fdebug-info-for-profiling.cpp
@@ -14,8 +14,11 @@
 // RUN: echo > %t.proftext
 // RUN: llvm-profdata merge %t.proftext -o %t.profdata
 // RUN: %clang_cc1 -emit-llvm -fno-legacy-pass-manager -fdebug-pass-manager 
-O1 -fprofile-instrument-use-path=%t.profdata -fdebug-info-for-profiling %s -o 
- 2>&1 | FileCheck %s --check-prefix=DISCR
+// RUN: %clang_cc1 -emit-llvm -fno-legacy-pass-manager -fdebug-pass-manager 
-O1 -fdebug-info-for-profiling -fpseudo-probe-for-profiling %s -o - 2>&1 | 
FileCheck %s --check-prefix=PROBE
 
 // NODISCR-NOT: Running pass: AddDiscriminatorsPass
 // DISCR:   Running pass: AddDiscriminatorsPass on {{.*}}
+// PROBE:   Running pass: AddDiscriminatorsPass on {{.*}}
+// PROBE:   Running pass: SampleProfileProbePass on {{.*}}
 
 void foo() {}

diff  --git a/clang/test/Driver/pseudo-probe.c 
b/clang/test/Driver/pseudo-probe.c
index 79b23df557a6c..76c4364e609d0 100644
--- a/clang/test/Driver/pseudo-probe.c
+++ b/clang/test/Driver/pseudo-probe.c
@@ -1,13 +1,13 @@
 // RUN: %clang -### -fpseudo-probe-for-profiling %s 2>&1 | FileCheck %s 
--check-prefix=YESPROBE
 // RUN: %clang -### -fno-pseudo-probe-for-profiling %s 2>&1 | FileCheck %s 
--check-prefix=NOPROBE
-// RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=YESPROBE --check-prefix=YESDEBUG  
 // RUN: %clang -### -fpseudo-probe-for-profiling 
-funique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=YESPROBE
 // RUN: %clang -### -fpseudo-probe-for-profiling 
-fno-unique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=NONAME
 
+// YESDEBUG: -fdebug-info-for-profiling
 // YESPROBE: -fpseudo-probe-for-profiling
 // YESPROBE: -funique-internal-linkage-names
 // NOPROBE-NOT: -fpseudo-probe-for-profiling
 // NOPROBE-NOT: -funique-internal-linkage-names
 // NONAME: -fpseudo-probe-for-profiling
 // NONAME-NOT: -funique-internal-linkage-names
-// CONFLICT: invalid argument

diff  --git a/llvm/include/llvm/Passes/PassBuilder.h 
b/llvm/include/llvm/Passes/PassBuilder.h
index 35f791f9c2609..9ab7bd4664f59 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -65,14 +65,6 @@ struct 

[clang] 77aec97 - [CSSPGO] Turn on unique linkage name by default for pseudo probe.

2021-07-16 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-07-16T16:43:23-07:00
New Revision: 77aec978a911254299640f9b10bdf1933986b96e

URL: 
https://github.com/llvm/llvm-project/commit/77aec978a911254299640f9b10bdf1933986b96e
DIFF: 
https://github.com/llvm/llvm-project/commit/77aec978a911254299640f9b10bdf1933986b96e.diff

LOG: [CSSPGO] Turn on unique linkage name by default for pseudo probe.

Turning on -funique-internal-linkage-names when -fpseudo-probe-for-profiling is 
on, unless -fno-unique-internal-linkage-names is specified.
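
The default-on-unless-disabled rule can be modelled with a minimal, 
self-contained C++ sketch. It is not the driver code: `hasFlag` here is a 
hypothetical string-based helper that only imitates a "last occurrence wins, 
otherwise use the default" query, and `renderProbeArgs` is likewise a 
hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: the last occurrence of Pos or Neg decides the result;
// if neither is present, Default is returned.
inline bool hasFlag(const std::vector<std::string> &Args,
                    const std::string &Pos, const std::string &Neg,
                    bool Default) {
  assert(Pos != Neg && "positive and negative spellings must differ");
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;
    else if (A == Neg)
      Result = false;
  }
  return Result;
}

// Hypothetical model of the flags forwarded to the compiler proper.
inline std::vector<std::string>
renderProbeArgs(const std::vector<std::string> &Args) {
  std::vector<std::string> CmdArgs;
  if (hasFlag(Args, "-fpseudo-probe-for-profiling",
              "-fno-pseudo-probe-for-profiling", /*Default=*/false)) {
    CmdArgs.push_back("-fpseudo-probe-for-profiling");
    // Unique internal linkage names default to on here, so only an explicit
    // -fno-unique-internal-linkage-names turns them off.
    if (hasFlag(Args, "-funique-internal-linkage-names",
                "-fno-unique-internal-linkage-names", /*Default=*/true))
      CmdArgs.push_back("-funique-internal-linkage-names");
  }
  return CmdArgs;
}
```

The only difference between the two queries is the default value, which is 
what makes the second flag implied by the first.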

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D106193

Added: 


Modified: 
clang/lib/Driver/ToolChains/Clang.cpp
clang/test/Driver/pseudo-probe.c

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 4336a25f091c4..0720ed4bb94a8 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5972,8 +5972,14 @@ void Clang::ConstructJob(Compilation , const JobAction 
,
 Args.AddLastArg(CmdArgs, options::OPT_fprofile_remapping_file_EQ);
 
 if (Args.hasFlag(options::OPT_fpseudo_probe_for_profiling,
- options::OPT_fno_pseudo_probe_for_profiling, false))
+ options::OPT_fno_pseudo_probe_for_profiling, false)) {
   CmdArgs.push_back("-fpseudo-probe-for-profiling");
+  // Enforce -funique-internal-linkage-names if it's not explicitly turned
+  // off.
+  if (Args.hasFlag(options::OPT_funique_internal_linkage_names,
+   options::OPT_fno_unique_internal_linkage_names, true))
+CmdArgs.push_back("-funique-internal-linkage-names");
+}
   }
   RenderBuiltinOptions(TC, RawTriple, Args, CmdArgs);
 

diff  --git a/clang/test/Driver/pseudo-probe.c 
b/clang/test/Driver/pseudo-probe.c
index 297992cfd1a15..79b23df557a6c 100644
--- a/clang/test/Driver/pseudo-probe.c
+++ b/clang/test/Driver/pseudo-probe.c
@@ -1,7 +1,13 @@
 // RUN: %clang -### -fpseudo-probe-for-profiling %s 2>&1 | FileCheck %s 
--check-prefix=YESPROBE
 // RUN: %clang -### -fno-pseudo-probe-for-profiling %s 2>&1 | FileCheck %s 
--check-prefix=NOPROBE
 // RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-funique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=YESPROBE
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-fno-unique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=NONAME
 
 // YESPROBE: -fpseudo-probe-for-profiling
+// YESPROBE: -funique-internal-linkage-names
 // NOPROBE-NOT: -fpseudo-probe-for-profiling
+// NOPROBE-NOT: -funique-internal-linkage-names
+// NONAME: -fpseudo-probe-for-profiling
+// NONAME-NOT: -funique-internal-linkage-names
 // CONFLICT: invalid argument





[clang] 633ca3f - [UniqueLinkageName] Use exsiting GlobalDecl object instead of reconstructing one.

2021-06-28 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-06-28T14:50:41-07:00
New Revision: 633ca3ff2f8fc2e2b69001d17abc43f302578fc1

URL: 
https://github.com/llvm/llvm-project/commit/633ca3ff2f8fc2e2b69001d17abc43f302578fc1
DIFF: 
https://github.com/llvm/llvm-project/commit/633ca3ff2f8fc2e2b69001d17abc43f302578fc1.diff

LOG: [UniqueLinkageName] Use exsiting GlobalDecl object instead of 
reconstructing one.

C++ constructors/destructors need to go through a different constructor to 
construct a GlobalDecl object in order to retrieve their linkage type. This 
causes an assert failure in the default constructor of GlobalDecl. I'm changing 
it to use the existing GlobalDecl object.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102356

Added: 


Modified: 
clang/lib/CodeGen/CGCall.cpp
clang/test/CodeGen/unique-internal-linkage-names.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 1cd972f32f3ff..35b34179cc231 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2174,7 +2174,8 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
   // functions with -funique-internal-linkage-names.
   if (TargetDecl && CodeGenOpts.UniqueInternalLinkageNames) {
     if (auto *Fn = dyn_cast<FunctionDecl>(TargetDecl)) {
-  if (this->getFunctionLinkage(Fn) == llvm::GlobalValue::InternalLinkage)
+  if (this->getFunctionLinkage(CalleeInfo.getCalleeDecl()) ==
+  llvm::GlobalValue::InternalLinkage)
 FuncAttrs.addAttribute("sample-profile-suffix-elision-policy",
"selected");
 }

diff  --git a/clang/test/CodeGen/unique-internal-linkage-names.cpp 
b/clang/test/CodeGen/unique-internal-linkage-names.cpp
index c567bcde45a84..95591de308d37 100644
--- a/clang/test/CodeGen/unique-internal-linkage-names.cpp
+++ b/clang/test/CodeGen/unique-internal-linkage-names.cpp
@@ -42,12 +42,26 @@ int mver_call() {
   return mver();
 }
 
+namespace {
+class A {
+public:
+  A() {}
+  ~A() {}
+};
+}
+
+void test() {
+  A a;
+}
+
 // PLAIN: @_ZL4glob = internal global
 // PLAIN: @_ZZ8retAnonMvE5fGlob = internal global
 // PLAIN: @_ZN12_GLOBAL__N_16anon_mE = internal global
 // PLAIN: define internal i32 @_ZL3foov()
 // PLAIN: define internal i32 @_ZN12_GLOBAL__N_14getMEv
 // PLAIN: define weak_odr i32 ()* @_ZL4mverv.resolver()
+// PLAIN: define internal void @_ZN12_GLOBAL__N_11AC1Ev
+// PLAIN: define internal void @_ZN12_GLOBAL__N_11AD1Ev
 // PLAIN: define internal i32 @_ZL4mverv()
 // PLAIN: define internal i32 @_ZL4mverv.sse4.2()
 // PLAIN-NOT: "sample-profile-suffix-elision-policy"
@@ -57,6 +71,8 @@ int mver_call() {
 // UNIQUE: define internal i32 @_ZL3foov.[[MODHASH:__uniq.[0-9]+]]() #[[#ATTR:]] {
 // UNIQUE: define internal i32 @_ZN12_GLOBAL__N_14getMEv.[[MODHASH]]
 // UNIQUE: define weak_odr i32 ()* @_ZL4mverv.[[MODHASH]].resolver()
+// UNIQUE: define internal void @_ZN12_GLOBAL__N_11AC1Ev.__uniq.68358509610070717889884130747296293671
+// UNIQUE: define internal void @_ZN12_GLOBAL__N_11AD1Ev.__uniq.68358509610070717889884130747296293671
 // UNIQUE: define internal i32 @_ZL4mverv.[[MODHASH]]()
 // UNIQUE: define internal i32 @_ZL4mverv.[[MODHASH]].sse4.2
 // UNIQUE: attributes #[[#ATTR]] = { {{.*}}"sample-profile-suffix-elision-policy"{{.*}} }





[clang] 64b2fb7 - [CSSPGO] Emit mangled dwarf names for line tables debug option under -fpseudo-probe-for-profiling

2021-06-09 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-06-09T10:46:03-07:00
New Revision: 64b2fb7967a749b83f59656f0cd2f4d00501efaa

URL: 
https://github.com/llvm/llvm-project/commit/64b2fb7967a749b83f59656f0cd2f4d00501efaa
DIFF: 
https://github.com/llvm/llvm-project/commit/64b2fb7967a749b83f59656f0cd2f4d00501efaa.diff

LOG: [CSSPGO] Emit mangled dwarf names for line tables debug option under 
-fpseudo-probe-for-profiling

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D103909

Added: 
clang/test/CodeGen/debug-info-pseudo-probe.cpp

Modified: 
clang/lib/CodeGen/CGDebugInfo.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGDebugInfo.cpp 
b/clang/lib/CodeGen/CGDebugInfo.cpp
index 1367ef46d85d..080d494a2830 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -3551,6 +3551,7 @@ void CGDebugInfo::collectFunctionDeclProps(GlobalDecl GD, 
llvm::DIFile *Unit,
   if (LinkageName == Name || (!CGM.getCodeGenOpts().EmitGcovArcs &&
   !CGM.getCodeGenOpts().EmitGcovNotes &&
   !CGM.getCodeGenOpts().DebugInfoForProfiling &&
+  !CGM.getCodeGenOpts().PseudoProbeForProfiling &&
   DebugKind <= 
codegenoptions::DebugLineTablesOnly))
 LinkageName = StringRef();
 

diff  --git a/clang/test/CodeGen/debug-info-pseudo-probe.cpp 
b/clang/test/CodeGen/debug-info-pseudo-probe.cpp
new file mode 100644
index ..78a684cd1f39
--- /dev/null
+++ b/clang/test/CodeGen/debug-info-pseudo-probe.cpp
@@ -0,0 +1,12 @@
+// This test checks if a symbol gets mangled dwarf names with -fpseudo-probe-for-profiling option.
+// RUN: %clang_cc1 -triple x86_64 -x c++ -S -emit-llvm -debug-info-kind=line-tables-only -o - < %s | FileCheck %s --check-prefix=PLAIN
+// RUN: %clang_cc1 -triple x86_64 -x c++  -S -emit-llvm -debug-info-kind=line-tables-only -fpseudo-probe-for-profiling -o - < %s | FileCheck %s --check-prefix=MANGLE
+
+int foo() {
+  return 0;
+}
+
+// PLAIN: define dso_local i32 @_Z3foov()
+// PLAIN: distinct !DISubprogram(name: "foo", scope:
+// MANGLE: define dso_local i32 @_Z3foov()
+// MANGLE: distinct !DISubprogram(name: "foo", linkageName: "_Z3foov"





[clang] fc1812a - [UniqueLinkageName] Use consistent checks when mangling symbol linkage name and debug linkage name.

2021-03-18 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-03-18T22:11:16-07:00
New Revision: fc1812a0ad757838b66aab57e1df720ec205a16a

URL: 
https://github.com/llvm/llvm-project/commit/fc1812a0ad757838b66aab57e1df720ec205a16a
DIFF: 
https://github.com/llvm/llvm-project/commit/fc1812a0ad757838b66aab57e1df720ec205a16a.diff

LOG: [UniqueLinkageName] Use consistent checks when mangling symbol linkage name 
and debug linkage name.

A C function may be declared with a prototype but defined without one (K&R 
style), as in the example below. This patch unifies the checks for mangling 
names in symbol linkage name emission and debug linkage name emission so that 
the two names are consistent.

static int go(int);

static int go(a) int a;
{
  return a;
}

Test Plan:

Differential Revision: https://reviews.llvm.org/D98799

Added: 


Modified: 
clang/lib/AST/ItaniumMangle.cpp
clang/lib/CodeGen/CGDebugInfo.cpp
clang/test/CodeGen/unique-internal-linkage-names-dwarf.c

Removed: 




diff  --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index ba96fda6cd57..3e6e29207f08 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -640,7 +640,7 @@ bool ItaniumMangleContextImpl::isUniqueInternalLinkageDecl(
 
   // For C functions without prototypes, return false as their
   // names should not be mangled.
-  if (!FD->getType()->getAs<FunctionProtoType>())
+  if (!FD->hasPrototype())
 return false;
 
   if (isInternalLinkageDecl(ND))

diff  --git a/clang/lib/CodeGen/CGDebugInfo.cpp 
b/clang/lib/CodeGen/CGDebugInfo.cpp
index 468c2b78b488..c80249a9c9fc 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -3522,7 +3522,7 @@ void CGDebugInfo::collectFunctionDeclProps(GlobalDecl GD, 
llvm::DIFile *Unit,
llvm::DIScope *,
llvm::DINodeArray ,
llvm::DINode::DIFlags ) {
-  const auto *FD = cast<FunctionDecl>(GD.getDecl());
+  const auto *FD = cast<FunctionDecl>(GD.getCanonicalDecl().getDecl());
   Name = getFunctionName(FD);
   // Use mangled name as linkage name for C/C++ functions.
   if (FD->hasPrototype()) {

diff  --git a/clang/test/CodeGen/unique-internal-linkage-names-dwarf.c 
b/clang/test/CodeGen/unique-internal-linkage-names-dwarf.c
index a3583426de79..e5d507e154ae 100644
--- a/clang/test/CodeGen/unique-internal-linkage-names-dwarf.c
+++ b/clang/test/CodeGen/unique-internal-linkage-names-dwarf.c
@@ -8,21 +8,48 @@
 // RUN: %clang_cc1 -triple x86_64-unknown-linux -debug-info-kind=limited -dwarf-version=5 -funique-internal-linkage-names -emit-llvm -o -  %s | FileCheck %s --check-prefix=UNIQUE
 
 static int glob;
+// foo should be given a uniquefied name under -funique-internal-linkage-names.
 static int foo(void) {
   return glob;
 }
 
+// bar should not be given a uniquefied name under -funique-internal-linkage-names, 
+// since it doesn't come with valid prototype.
+static int bar(a) int a;
+{
+  return glob + a;
+}
+
+// go should be given a uniquefied name under -funique-internal-linkage-names, even 
+// if its definition doesn't come with a valid prototype, but the declaration here
+// has a prototype.
+static int go(int);
+
 void baz() {
   foo();
+  bar(1);
+  go(2);
 }
 
+static int go(a) int a;
+{
+  return glob + a;
+}
+
+
 // PLAIN: @glob = internal global i32
 // PLAIN: define internal i32 @foo()
+// PLAIN: define internal i32 @bar(i32 %a)
 // PLAIN: distinct !DIGlobalVariable(name: "glob"{{.*}})
 // PLAIN: distinct !DISubprogram(name: "foo"{{.*}})
+// PLAIN: distinct !DISubprogram(name: "bar"{{.*}})
+// PLAIN: distinct !DISubprogram(name: "go"{{.*}})
 // PLAIN-NOT: linkageName:
 //
 // UNIQUE: @glob = internal global i32
 // UNIQUE: define internal i32 @_ZL3foov.[[MODHASH:__uniq.[0-9]+]]()
+// UNIQUE: define internal i32 @bar(i32 %a)
+// UNIQUE: define internal i32 @_ZL2goi.[[MODHASH]](i32 %a)
 // UNIQUE: distinct !DIGlobalVariable(name: "glob"{{.*}})
 // UNIQUE: distinct !DISubprogram(name: "foo", linkageName: "_ZL3foov.[[MODHASH]]"{{.*}})
+// UNIQUE: distinct !DISubprogram(name: "go", linkageName: "_ZL2goi.[[MODHASH]]"{{.*}})





[clang] 3d89b3c - [CSSPGO] Introducing distribution factor for pseudo probe.

2021-02-02 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-02-02T11:55:01-08:00
New Revision: 3d89b3cbec230633e8228787819b15116c1a1730

URL: 
https://github.com/llvm/llvm-project/commit/3d89b3cbec230633e8228787819b15116c1a1730
DIFF: 
https://github.com/llvm/llvm-project/commit/3d89b3cbec230633e8228787819b15116c1a1730.diff

LOG: [CSSPGO] Introducing distribution factor for pseudo probe.

Sample re-annotation is required at LTO time to achieve reasonable post-inline 
profile quality. However, we have seen that such LTO-time re-annotation 
degrades profile quality. This is mainly caused by pre-LTO code duplication 
done by passes such as loop unrolling, jump threading, and indirect call 
promotion, where samples corresponding to a source location are aggregated 
multiple times because of the duplicates. This change introduces the concept 
of a distribution factor for pseudo probes, so that samples can be distributed 
across duplicated probes, scaled by a factor. Distribution factors are 
calculated from block frequency information (BFI), so we rely on 
code-duplicating optimizations maintaining BFI well. Distribution factors are 
updated at the end of the pre-LTO pipeline to reflect an estimated portion of 
the real execution count.

This change also introduces a pseudo probe verifier that can be run after each 
IR pass to detect duplicated pseudo probes.

A saturated distribution factor stands for 1.0. A pseudo probe carries a 
factor with a value ranging from 0.0 to 1.0. A 64-bit integral distribution 
factor field that represents [0.0, 1.0] is associated with each block probe. 
Unfortunately this cannot be done for callsite probes due to the size 
limitation of a 32-bit DWARF discriminator. A 7-bit distribution factor is used 
instead.
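
As a rough illustration of the two factor widths described above, here is a 
minimal sketch in C++. It is not the code from this change. The saturated 
64-bit value is taken to be all ones, which is consistent with the `i64 -1` 
operand in the updated pseudo-probe-emit.c checks. The percentage scale for the 
7-bit callsite factor, and all names (`kFullBlockFactor`, `encodeBlockFactor`, 
`encodeCallsiteFactor`, and so on), are assumptions made for this example.

```cpp
#include <cstdint>
#include <limits>

// Saturated 64-bit block factor, standing for 1.0.
constexpr uint64_t kFullBlockFactor = std::numeric_limits<uint64_t>::max();

// ASSUMPTION: the 7-bit callsite factor is modeled as a percentage in
// [0, 100]. The commit message only says the field is 7 bits wide.
constexpr uint32_t kFullCallsiteFactor = 100;

// Map a fraction in [0.0, 1.0] to the 64-bit fixed-point representation.
uint64_t encodeBlockFactor(double Fraction) {
  if (Fraction <= 0.0)
    return 0;
  if (Fraction >= 1.0)
    return kFullBlockFactor;
  return static_cast<uint64_t>(Fraction *
                               static_cast<double>(kFullBlockFactor));
}

// Recover the fraction from a 64-bit block factor.
double decodeBlockFactor(uint64_t Factor) {
  return static_cast<double>(Factor) / static_cast<double>(kFullBlockFactor);
}

// Map a fraction in [0.0, 1.0] to the coarse callsite factor, rounding to
// the nearest percent. The result always fits in 7 bits.
uint32_t encodeCallsiteFactor(double Fraction) {
  if (Fraction <= 0.0)
    return 0;
  if (Fraction >= 1.0)
    return kFullCallsiteFactor;
  return static_cast<uint32_t>(Fraction * kFullCallsiteFactor + 0.5);
}
```

The callsite factor is much coarser than the block factor, which is the 
trade-off the log attributes to the 32-bit discriminator.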

Changes are also needed in the sample profile inliner to deal with prorated 
callsite counts. When call sites duplicated by pre-LTO passes are later inlined 
at LTO time, the callee's probes should be prorated based on the 
prelink-computed distribution factors. The distribution factors should also be 
taken into account when computing hotness for inline candidates. In addition, 
indirect call promotion results in multiple callsites, and the original 
samples should be distributed across them. This is done by adjusting the 
callsites' distribution factors.
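
The proration described above can be sketched as follows. This is a simplified 
model in C++ and not the inliner code from this change: factors are plain 
`double` fractions, and the helper names `prorate` and `splitFactor` are 
hypothetical. It assumes that the copies of a promoted call site share the 
original factor in proportion to their per-target counts.

```cpp
#include <cstdint>
#include <vector>

// Scale a raw sample count by a distribution factor in [0.0, 1.0],
// rounding to the nearest integer.
uint64_t prorate(uint64_t Samples, double Factor) {
  return static_cast<uint64_t>(static_cast<double>(Samples) * Factor + 0.5);
}

// Split the factor of one original call site across the call sites that
// result from promoting it, proportionally to the per-target counts.
// Returns all zeros when there are no counts to go by.
std::vector<double> splitFactor(double OriginalFactor,
                                const std::vector<uint64_t> &TargetCounts) {
  uint64_t Total = 0;
  for (uint64_t Count : TargetCounts)
    Total += Count;

  std::vector<double> Factors;
  Factors.reserve(TargetCounts.size());
  for (uint64_t Count : TargetCounts) {
    double Share = Total == 0 ? 0.0
                              : static_cast<double>(Count) /
                                    static_cast<double>(Total);
    Factors.push_back(OriginalFactor * Share);
  }
  return Factors;
}
```

For example, a call site with 1000 samples whose targets were seen 75 and 25 
times would hand factors 0.75 and 0.25 to the two resulting call sites, so the 
prorated counts add up to the original total instead of being counted twice.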

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D93264

Added: 
llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
llvm/test/Transforms/SampleProfile/pseudo-probe-update.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-verify.ll

Modified: 
clang/test/CodeGen/pseudo-probe-emit.c
llvm/include/llvm/IR/IntrinsicInst.h
llvm/include/llvm/IR/Intrinsics.td
llvm/include/llvm/IR/PseudoProbe.h
llvm/include/llvm/Passes/StandardInstrumentations.h
llvm/include/llvm/ProfileData/SampleProf.h
llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
llvm/lib/IR/PseudoProbe.cpp
llvm/lib/Passes/PassBuilder.cpp
llvm/lib/Passes/PassRegistry.def
llvm/lib/Passes/StandardInstrumentations.cpp
llvm/lib/Transforms/IPO/SampleProfile.cpp
llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
llvm/test/Transforms/SampleProfile/pseudo-probe-emit-inline.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-inline.ll
llvm/test/Transforms/SampleProfile/pseudo-probe-profile.ll

Removed: 




diff  --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index 059673b6992e..fccc8f04844d 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -6,12 +6,12 @@ void bar();
 void go();
 
 void foo(int x) {
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0)
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
 bar();
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
 go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0)
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
 }

diff  --git a/llvm/include/llvm/IR/IntrinsicInst.h 
b/llvm/include/llvm/IR/IntrinsicInst.h
index 9d68f3fdde6c..df3a1d568756 100644
--- a/llvm/include/llvm/IR/IntrinsicInst.h
+++ b/llvm/include/llvm/IR/IntrinsicInst.h
@@ -981,12 +981,16 @@ class PseudoProbeInst : public IntrinsicInst {
     return cast<ConstantInt>(const_cast<Value *>(getArgOperand(0)));
   }
 
+  ConstantInt *getIndex() const {
+    return cast<ConstantInt>(const_cast<Value *>(getArgOperand(1)));
+  }
+
   ConstantInt *getAttributes() const {
     return cast<ConstantInt>(const_cast<Value *>(getArgOperand(2)));
   }
 
-  

[clang] d3e2e37 - [CSSPGO] Passing the clang driver switch -fpseudo-probe-for-profiling to the linker.

2021-02-02 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-02-02T09:43:57-08:00
New Revision: d3e2e3740d0730cb6788c771bb01a8f3e935bf2e

URL: 
https://github.com/llvm/llvm-project/commit/d3e2e3740d0730cb6788c771bb01a8f3e935bf2e
DIFF: 
https://github.com/llvm/llvm-project/commit/d3e2e3740d0730cb6788c771bb01a8f3e935bf2e.diff

LOG: [CSSPGO] Passing the clang driver switch -fpseudo-probe-for-profiling to 
the linker.

As titled.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95271

Added: 
clang/test/Driver/pseudo-probe-lto.c

Modified: 
clang/include/clang/Driver/Options.td
clang/lib/Driver/ToolChains/CommonArgs.cpp

Removed: 




diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index aee312ea8e8a..edaa42741062 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1155,7 +1155,7 @@ def fprofile_update_EQ : Joined<["-"], 
"fprofile-update=">,
 defm pseudo_probe_for_profiling : BoolFOption<"pseudo-probe-for-profiling",
   CodeGenOpts<"PseudoProbeForProfiling">, DefaultFalse,
   PosFlag, NegFlag,
-  BothFlags<[NoXarchOption, CC1Option], " pseudo probes for sample profiler">>;
+  BothFlags<[NoXarchOption, CC1Option], " pseudo probes for sample 
profiling">>;
 def forder_file_instrumentation : Flag<["-"], "forder-file-instrumentation">,
 Group, Flags<[CC1Option, CoreOption]>,
 HelpText<"Generate instrumented code to collect order file into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;

diff  --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 6a95aa5ec628..bcaea71dca94 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -605,6 +605,11 @@ void tools::addLTOOptions(const ToolChain , 
const ArgList ,
   CmdArgs.push_back("-plugin-opt=new-pass-manager");
   }
 
+  // Pass an option to enable pseudo probe emission.
+  if (Args.hasFlag(options::OPT_fpseudo_probe_for_profiling,
+   options::OPT_fno_pseudo_probe_for_profiling, false))
+CmdArgs.push_back("-plugin-opt=pseudo-probe-for-profiling");
+
   // Setup statistics file output.
   SmallString<128> StatsFile = getStatsFileName(Args, Output, Input, D);
   if (!StatsFile.empty())

diff  --git a/clang/test/Driver/pseudo-probe-lto.c 
b/clang/test/Driver/pseudo-probe-lto.c
new file mode 100644
index ..e319b8c0098b
--- /dev/null
+++ b/clang/test/Driver/pseudo-probe-lto.c
@@ -0,0 +1,10 @@
+// RUN: touch %t.o
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto -fpseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=PROBE
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto=thin -fpseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=PROBE
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto -fno-pseudo-probe-for-profiling -fpseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=PROBE
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto 2>&1 | FileCheck %s --check-prefix=NOPROBE
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto -fno-pseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=NOPROBE
+// RUN: %clang -### %t.o -target x86_64-unknown-linux -flto -fpseudo-probe-for-profiling -fno-pseudo-probe-for-profiling 2>&1 | FileCheck %s --check-prefix=NOPROBE
+
+// PROBE: -plugin-opt=pseudo-probe-for-profiling
+// NOPROBE-NOT: -plugin-opt=pseudo-probe-for-profiling





[clang] 0e23fd6 - [Driver] Add DWARF64 flag: -gdwarf64

2021-01-08 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-01-08T12:58:38-08:00
New Revision: 0e23fd676c3984a2b867c167950464262c8e0dc6

URL: 
https://github.com/llvm/llvm-project/commit/0e23fd676c3984a2b867c167950464262c8e0dc6
DIFF: 
https://github.com/llvm/llvm-project/commit/0e23fd676c3984a2b867c167950464262c8e0dc6.diff

LOG: [Driver] Add DWARF64 flag: -gdwarf64

@ikudrin enabled support for dwarf64 in D87011.  Adding a clang flag so it can 
be used through that compilation pass.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D90507

Added: 


Modified: 
clang/include/clang/Basic/CodeGenOptions.def
clang/include/clang/Driver/Options.td
clang/lib/CodeGen/BackendUtil.cpp
clang/lib/Driver/ToolChains/Clang.cpp
clang/lib/Frontend/CompilerInvocation.cpp
clang/test/Driver/debug-options.c

Removed: 




diff  --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 50a778e2a328..d3851df23122 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -32,6 +32,8 @@ ENUM_CODEGENOPT(CompressDebugSections, 
llvm::DebugCompressionType, 2,
 llvm::DebugCompressionType::None)
 CODEGENOPT(RelaxELFRelocations, 1, 0) ///< -Wa,--mrelax-relocations
 CODEGENOPT(AsmVerbose, 1, 0) ///< -dA, -fverbose-asm.
+CODEGENOPT(Dwarf64   , 1, 0) ///< -gdwarf64.
+CODEGENOPT(Dwarf32   , 1, 1) ///< -gdwarf32.
 CODEGENOPT(PreserveAsmComments, 1, 1) ///< -dA, -fno-preserve-as-comments.
 CODEGENOPT(AssumeSaneOperatorNew , 1, 1) ///< implicit __attribute__((malloc)) 
operator new
 CODEGENOPT(Autolink  , 1, 1) ///< -fno-autolink

diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 6585fd1ceb01..9e1059cd14f0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2574,6 +2574,10 @@ def gdwarf_4 : Flag<["-"], "gdwarf-4">, Group,
   HelpText<"Generate source-level debug information with dwarf version 4">;
 def gdwarf_5 : Flag<["-"], "gdwarf-5">, Group,
   HelpText<"Generate source-level debug information with dwarf version 5">;
+def gdwarf64 : Flag<["-"], "gdwarf64">, Group, Flags<[CC1Option]>,
+  HelpText<"Enables DWARF64 format for ELF binaries, if debug information 
emission is enabled.">;
+def gdwarf32 : Flag<["-"], "gdwarf32">, Group, Flags<[CC1Option]>,
+  HelpText<"Enables DWARF32 format for ELF binaries, if debug information 
emission is enabled.">;
 
 def gcodeview : Flag<["-"], "gcodeview">,
   HelpText<"Generate CodeView debug information">,

diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index 296b111feb2d..90cf5fc5df9a 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -571,6 +571,7 @@ static bool initTargetOptions(DiagnosticsEngine ,
   Options.MCOptions.MCFatalWarnings = CodeGenOpts.FatalWarnings;
   Options.MCOptions.MCNoWarn = CodeGenOpts.NoWarn;
   Options.MCOptions.AsmVerbose = CodeGenOpts.AsmVerbose;
+  Options.MCOptions.Dwarf64 = CodeGenOpts.Dwarf64;
   Options.MCOptions.PreserveAsmComments = CodeGenOpts.PreserveAsmComments;
   Options.MCOptions.ABIName = TargetOpts.ABI;
   for (const auto  : HSOpts.UserEntries)

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index a462758bf1c2..d6453777200f 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4017,6 +4017,25 @@ static void RenderDebugOptions(const ToolChain , 
const Driver ,
   if (DebuggerTuning == llvm::DebuggerKind::SCE)
 CmdArgs.push_back("-dwarf-explicit-import");
 
+  auto *DwarfFormatArg =
+  Args.getLastArg(options::OPT_gdwarf64, options::OPT_gdwarf32);
+  if (DwarfFormatArg &&
+  DwarfFormatArg->getOption().matches(options::OPT_gdwarf64)) {
+const llvm::Triple  = TC.getTriple();
+if (EffectiveDWARFVersion < 3)
+  D.Diag(diag::err_drv_argument_only_allowed_with)
+  << DwarfFormatArg->getAsString(Args) << "DWARFv3 or greater";
+else if (!RawTriple.isArch64Bit())
+  D.Diag(diag::err_drv_argument_only_allowed_with)
+  << DwarfFormatArg->getAsString(Args) << "64 bit architecture";
+else if (!RawTriple.isOSBinFormatELF())
+  D.Diag(diag::err_drv_argument_only_allowed_with)
+  << DwarfFormatArg->getAsString(Args) << "ELF platforms";
+  }
+
+  if (DwarfFormatArg)
+DwarfFormatArg->render(Args, CmdArgs);
+
   RenderDebugInfoCompressionArgs(Args, CmdArgs, D, TC);
 }
 

diff  --git a/clang/lib/Frontend/CompilerInvocation.cpp 
b/clang/lib/Frontend/CompilerInvocation.cpp
index 6795151d08d5..dd66bf5b4efc 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -986,6 +986,7 @@ static bool ParseCodeGenArgs(CodeGenOptions , ArgList 
, InputKind IK,
 

[clang] 4034f92 - Switching Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm.

2021-01-04 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2021-01-04T12:04:46-08:00
New Revision: 4034f9273edacbb1c37acf19139594a226c8bdac

URL: 
https://github.com/llvm/llvm-project/commit/4034f9273edacbb1c37acf19139594a226c8bdac
DIFF: 
https://github.com/llvm/llvm-project/commit/4034f9273edacbb1c37acf19139594a226c8bdac.diff

LOG: Switching Clang UniqueInternalLinkageNamesPass scheduling to using the 
LLVM one with newpm.

As a follow-up to D93656, I'm switching the Clang 
UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm.

Test Plan:

Reviewed By: aeubanks, tmsriram

Differential Revision: https://reviews.llvm.org/D94019

Added: 


Modified: 
clang/lib/CodeGen/BackendUtil.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index b326c643738f..296b111feb2d 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1145,6 +1145,7 @@ void EmitAssemblyHelper::EmitAssemblyWithNewPassManager(
   // non-integrated assemblers don't recognize .cgprofile section.
   PTO.CallGraphProfile = !CodeGenOpts.DisableIntegratedAS;
   PTO.Coroutines = LangOpts.Coroutines;
+  PTO.UniqueLinkageNames = CodeGenOpts.UniqueInternalLinkageNames;
 
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(CodeGenOpts.DebugPassManager);
@@ -1326,11 +1327,6 @@ void EmitAssemblyHelper::EmitAssemblyWithNewPassManager(
   MPM = PB.buildPerModuleDefaultPipeline(Level);
 }
 
-// Add UniqueInternalLinkageNames Pass which renames internal linkage
-// symbols with unique names.
-if (CodeGenOpts.UniqueInternalLinkageNames)
-  MPM.addPass(UniqueInternalLinkageNamesPass());
-
 if (!CodeGenOpts.MemoryProfileOutput.empty()) {
   MPM.addPass(createModuleToFunctionPassAdaptor(MemProfilerPass()));
   MPM.addPass(ModuleMemProfilerPass());





[clang] 24d4291 - [CSSPGO] Pseudo probes for function calls.

2020-12-02 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2020-12-02T13:45:20-08:00
New Revision: 24d4291ca704fa5ee2419b4163aa324eca693fd6

URL: 
https://github.com/llvm/llvm-project/commit/24d4291ca704fa5ee2419b4163aa324eca693fd6
DIFF: 
https://github.com/llvm/llvm-project/commit/24d4291ca704fa5ee2419b4163aa324eca693fd6.diff

LOG: [CSSPGO] Pseudo probes for function calls.

An indirect call site needs to be probed for its potential call targets. With 
CSSPGO a direct call also needs a probe so that a calling context can be 
represented by a stack of callsite probes. Unlike pseudo probes for basic 
blocks that are in form of standalone intrinsic call instructions, pseudo 
probes for callsites have to be attached to the call instruction, thus a 
separate instruction would not work.

One possible way of attaching a probe to a call instruction is to use a special 
metadata that carries information about the probe. The special metadata will 
have to make its way through the optimization pipeline down to object emission. 
This requires additional efforts to maintain the metadata in various places. 
Given that the `!dbg` metadata is a first-class metadata and has all essential 
support in place, leveraging the `!dbg` metadata as a channel to encode pseudo 
probe information is probably the easiest solution.

With the requirement of not inflating `!dbg` metadata that is allocated for 
almost every instruction, we found that the 32-bit DWARF discriminator field 
which mainly serves AutoFDO can be reused for pseudo probes. DWARF 
discriminators distinguish identical source locations between instructions and 
with pseudo probes such support is not required. In this change we are using 
the discriminator field to encode the ID and type of a callsite probe and the 
encoded value will be unpacked and consumed right before object emission. When 
a callsite is inlined, the callsite discriminator field will go with the 
inlined instructions. The `!dbg` metadata of an inlined instruction is in form 
of a scope stack. The top of the stack is the instruction's original `!dbg` 
metadata and the bottom of the stack is for the original callsite of the 
top-level inliner. Except for the top of the stack, all other elements of the 
stack actually refer to the nested inlined callsites whose discriminator field 
(which actually represents a callsite probe) can be used together to represent 
the inline context of an inlined PseudoProbeInst or CallInst.

To avoid collision with the baseline AutoFDO in the various places that handle 
DWARF discriminators where a check against the `-pseudo-probe-for-profiling` 
switch is not available, a special encoding scheme is used to tell a pseudo 
probe discriminator apart from a regular discriminator. For a regular 
discriminator, if the lowest 3 bits are all set, the discriminator is 
basically empty and the higher 29 bits can be reserved for pseudo probe use.
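
The tagging idea above can be sketched as a small C++ example. This is an 
illustration and not the encoding from this change: it assumes the tag is the 
low three bits all set (`0x7`), and the positions and widths chosen for the 
probe index and type inside the remaining 29 bits are hypothetical, as are the 
names `packProbe`, `probeIndex`, and `probeType`.

```cpp
#include <cstdint>

// Low 3 bits all set mark the value as a pseudo probe discriminator.
constexpr uint32_t kProbeTag = 0x7;
constexpr uint32_t kTagMask = 0x7;

// ASSUMPTION: illustrative field layout inside the upper 29 bits.
constexpr uint32_t kIndexShift = 3;
constexpr uint32_t kIndexMask = (1u << 16) - 1; // 16-bit probe index
constexpr uint32_t kTypeShift = 26;
constexpr uint32_t kTypeMask = (1u << 3) - 1; // 3-bit probe type

// True if the 32-bit discriminator carries pseudo probe data.
constexpr bool isPseudoProbeDiscriminator(uint32_t Discriminator) {
  return (Discriminator & kTagMask) == kProbeTag;
}

// Pack a probe index and type into a tagged 32-bit discriminator.
constexpr uint32_t packProbe(uint32_t Index, uint32_t Type) {
  return kProbeTag | ((Index & kIndexMask) << kIndexShift) |
         ((Type & kTypeMask) << kTypeShift);
}

// Unpack the probe index from a tagged discriminator.
constexpr uint32_t probeIndex(uint32_t Discriminator) {
  return (Discriminator >> kIndexShift) & kIndexMask;
}

// Unpack the probe type from a tagged discriminator.
constexpr uint32_t probeType(uint32_t Discriminator) {
  return (Discriminator >> kTypeShift) & kTypeMask;
}
```

Code that sees an arbitrary discriminator can call 
`isPseudoProbeDiscriminator` first and fall back to the regular AutoFDO 
interpretation otherwise, without consulting any command-line switch.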

Callsite pseudo probes are inserted in `SampleProfileProbePass` and a 
target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe 
ID/type from `!dbg`.

Note that with this work the switch -debug-info-for-profiling will not work 
with -pseudo-probe-for-profiling anymore. They cannot be used at the same time.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91756

Added: 
llvm/include/llvm/IR/PseudoProbe.h
llvm/lib/CodeGen/PseudoProbeInserter.cpp

Modified: 
clang/lib/CodeGen/BackendUtil.cpp
llvm/include/llvm/CodeGen/CommandFlags.h
llvm/include/llvm/CodeGen/Passes.h
llvm/include/llvm/IR/DebugInfoMetadata.h
llvm/include/llvm/InitializePasses.h
llvm/include/llvm/Passes/PassBuilder.h
llvm/include/llvm/Target/TargetOptions.h
llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
llvm/lib/CodeGen/CMakeLists.txt
llvm/lib/CodeGen/CommandFlags.cpp
llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
llvm/lib/CodeGen/TargetPassConfig.cpp
llvm/lib/Target/X86/X86TargetMachine.cpp
llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll

Removed: 




diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index b62a66a51d26..724e2ec16fc3 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -555,6 +555,7 @@ static bool initTargetOptions(DiagnosticsEngine ,
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = 
CodeGenOpts.EnableAIXExtendedAltivecABI;
+  Options.PseudoProbeForProfiling = CodeGenOpts.PseudoProbeForProfiling;
   Options.ValueTrackingVariableLocations =
   CodeGenOpts.ValueTrackingVariableLocations;
   Options.XRayOmitFunctionIndex = CodeGenOpts.XRayOmitFunctionIndex;

diff  --git a/llvm/include/llvm/CodeGen/CommandFlags.h 

[clang] c083fed - [CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation.

2020-11-30 Thread Hongtao Yu via cfe-commits

Author: Hongtao Yu
Date: 2020-11-30T10:16:54-08:00
New Revision: c083fededfa63df6e1a560334bdb78797da9ee57

URL: 
https://github.com/llvm/llvm-project/commit/c083fededfa63df6e1a560334bdb78797da9ee57
DIFF: 
https://github.com/llvm/llvm-project/commit/c083fededfa63df6e1a560334bdb78797da9ee57.diff

LOG: [CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe 
instrumentation.

This change introduces a new clang switch `-fpseudo-probe-for-profiling` to 
enable AutoFDO with pseudo instrumentation. Please refer to 
https://reviews.llvm.org/D86193 for the whole story.

One implication from pseudo-probe instrumentation is that the profile is now 
sensitive to CFG changes. We perform the pseudo instrumentation very early in 
the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG 
instrumented and annotated is stable and optimization-resilient.

The early instrumentation also allows the inliner to duplicate probes for 
inlined instances. When a probe, along with the other instructions of a callee 
function, is inlined into its caller function, the GUID of the callee function 
goes with the probe. This allows samples collected on inlined probes to be 
reported for the original callee function.
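
To make the last point concrete, here is a toy model in C++ of probes keeping 
their function GUID across inlining. It is only a sketch under simplifying 
assumptions: a function body is reduced to a list of probes, and the `Probe` 
struct and the helpers `inlineCallee` and `attributeSamples` are hypothetical 
names that do not appear in this change.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// A probe remembers which function it was created in and its index there.
struct Probe {
  uint64_t FunctionGUID;
  uint32_t Index;
};

// Model inlining: the callee's probes are copied into the caller's body
// unchanged, so each copy still carries the callee's GUID.
std::vector<Probe> inlineCallee(std::vector<Probe> CallerBody,
                                const std::vector<Probe> &CalleeBody) {
  CallerBody.insert(CallerBody.end(), CalleeBody.begin(), CalleeBody.end());
  return CallerBody;
}

// Sum per-probe sample counts by originating function GUID. Counts[I] is
// the sample count observed for Body[I].
std::map<uint64_t, uint64_t>
attributeSamples(const std::vector<Probe> &Body,
                 const std::vector<uint64_t> &Counts) {
  std::map<uint64_t, uint64_t> Totals;
  for (std::size_t I = 0; I < Body.size() && I < Counts.size(); ++I)
    Totals[Body[I].FunctionGUID] += Counts[I];
  return Totals;
}
```

Even though all samples are collected while executing the caller's code, the 
counts for the inlined probes end up under the callee's GUID.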

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86502

Added: 
clang/test/CodeGen/pseudo-probe-emit.c

Modified: 
clang/include/clang/Basic/CodeGenOptions.def
clang/include/clang/Driver/Options.td
clang/lib/CodeGen/BackendUtil.cpp
clang/lib/Driver/ToolChains/Clang.cpp
clang/lib/Frontend/CompilerInvocation.cpp
llvm/include/llvm/Passes/PassBuilder.h
llvm/lib/Passes/PassBuilder.cpp

Removed: 




diff  --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index d90e403915ed..8c4a70ba4125 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -380,6 +380,9 @@ CODEGENOPT(StrictReturn, 1, 1)
 /// Whether emit extra debug info for sample pgo profile collection.
 CODEGENOPT(DebugInfoForProfiling, 1, 0)
 
+/// Whether emit pseudo probes for sample pgo profile collection.
+CODEGENOPT(PseudoProbeForProfiling, 1, 0)
+
 /// Whether 3-component vector type is preserved.
 CODEGENOPT(PreserveVec3Type, 1, 0)
 

diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 0014ced5dca7..ac0761ec773f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -903,6 +903,12 @@ def fprofile_exclude_files_EQ : Joined<["-"], 
"fprofile-exclude-files=">,
 def fprofile_update_EQ : Joined<["-"], "fprofile-update=">,
 Group, Flags<[CC1Option, CoreOption]>, Values<"atomic,prefer-atomic,single">,
 MetaVarName<"">, HelpText<"Set update method of profile counters (atomic,prefer-atomic,single)">;
+def fpseudo_probe_for_profiling : Flag<["-"], "fpseudo-probe-for-profiling">,
+Group, Flags<[NoXarchOption, CC1Option]>,
+HelpText<"Emit pseudo probes for sample profiler">;
+def fno_pseudo_probe_for_profiling : Flag<["-"], "fno-pseudo-probe-for-profiling">,
+Group, Flags<[NoXarchOption, CC1Option]>,
+HelpText<"Do not emit pseudo probes for sample profiler.">;
 def forder_file_instrumentation : Flag<["-"], "forder-file-instrumentation">,
 Group, Flags<[CC1Option, CoreOption]>,
 HelpText<"Generate instrumented code to collect order file into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">;

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index dbc18cc40241..b62a66a51d26 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1094,10 +1094,15 @@ void EmitAssemblyHelper::EmitAssemblyWithNewPassManager(
 CSAction, CodeGenOpts.DebugInfoForProfiling);
   } else if (!CodeGenOpts.SampleProfileFile.empty())
 // -fprofile-sample-use
+PGOOpt = PGOOptions(
+CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile,
+PGOOptions::SampleUse, PGOOptions::NoCSAction,
+CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling);
+  else if (CodeGenOpts.PseudoProbeForProfiling)
+// -fpseudo-probe-for-profiling
 PGOOpt =
-PGOOptions(CodeGenOpts.SampleProfileFile, "",
-   CodeGenOpts.ProfileRemappingFile, PGOOptions::SampleUse,
-   PGOOptions::NoCSAction, CodeGenOpts.DebugInfoForProfiling);
+PGOOptions("", "", "", PGOOptions::NoAction, PGOOptions::NoCSAction,
+   CodeGenOpts.DebugInfoForProfiling, true);
   else if (CodeGenOpts.DebugInfoForProfiling)
 // -fdebug-info-for-profiling
 PGOOpt = PGOOptions("", "", "", PGOOptions::NoAction,

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index