[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-15 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu added a comment.

In D144844#4197067 , @dblaikie wrote:

> I don't think of this as a performance regression for users though - this 
> functionality's never really "shipped" so we get to choose what the baseline 
> is.
>
> And I think a reasonable baseline to compare to isn't this implementation we 
> don't think is ideal (because of the build invalidation issues, if nothing 
> else, caused by having thick PCMs) - I think the baseline is what a users 
> non-modular code is. And in non-modular code these non-inline functions would 
> be in the implementation files, not able to cross-TU inline without LTO.
>
> I think not providing definitions of non-inline functions for cross-TU 
> optimizations is not a regression, but exactly in-line with existing 
> non-modular behavior, which is totally fine.

Yeah, I refer "regression"  to compare the performance of modular codes with 
the non modular codes. Since no users will be happy if they found the 
performance goes down after they convert their existing library to modules.

Then the problem is that the implicitly inline functions in headers may be 
non-inline functions in modules.   And we made some simple experiments for the 
patch, we observed performance regression indeed.

In D144844#4197646 , @iains wrote:

> In D144844#4195633 , @ChuanqiXu 
> wrote:
>
>>> However, "performance" also includes compilation speed in the 'no 
>>> optimisation, debug' case - that is also considered very important. So, 
>>> perhaps, the short-term approach should be (as @dblaikie suggested) to 
>>> include the bodies for -O >= 3?
>>
>> I don't think so. I think "performance" refers to the runtime performance 
>> generally. I don't believe the normal users will be happy if modules will 
>> decrease the performance of their program in any means. So I think we should 
>> include the bodies by default.
>
> I think I must be misunderstanding something here.
>
> The default for clang is to compile without optimisation - this benefits the 
> compile-edit-debug cycle, by providing output that is closest to the original 
> source, and quickest to compile.
>
> The user should not be expecting any optimisations to be applied unless they 
> supply `-ON` (n fact, they might well complain if we optimise something that 
> makes debugging harder).
>
> So, we should try to ensure that adding modules supports that model - and 
> provides the quickest and closest to the original sources for the default 
> options.  If the user wants better optimisation (at the expense of longer 
> compile times), then they provide `-ON`, right?

Oh, we are talking the case that the users will use `-ON` as default. Since 
every C++ programs will use `-ON` in practice.




And I am still wondering how thin the thin BMI will be by removing the 
non-inline function bodies. For example, we'll still need to contain the 
function bodies for templates, right? So I guess if we want a thin BMI, we need 
more refactoring to the current pcm format, which requires more working.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-15 Thread Iain Sandoe via Phabricator via cfe-commits
iains added a comment.

In D144844#4195633 , @ChuanqiXu wrote:

>> However, "performance" also includes compilation speed in the 'no 
>> optimisation, debug' case - that is also considered very important. So, 
>> perhaps, the short-term approach should be (as @dblaikie suggested) to 
>> include the bodies for -O >= 3?
>
> I don't think so. I think "performance" refers to the runtime performance 
> generally. I don't believe the normal users will be happy if modules will 
> decrease the performance of their program in any means. So I think we should 
> include the bodies by default.

I think I must be misunderstanding something here.

The default for clang is to compile without optimisation - this benefits the 
compile-edit-debug cycle, by providing output that is closest to the original 
source, and quickest to compile.

The user should not be expecting any optimisations to be applied unless they 
supply `-ON` (n fact, they might we complain if we optimise something that 
makes debugging harder).

So, we should try to ensure that adding modules supports that model - and 
provides the quickest and closest to the original sources for the default 
options.  If the user wants better optimisation (at the expense of longer 
compile times), then they provide `-ON`, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-15 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

I don't think of this as a performance regression for users though - this 
functionality's never really "shipped" so we get to choose what the baseline is.

And I think a reasonable baseline to compare to isn't this implementation we 
don't think is ideal (because of the build invalidation issues, if nothing 
else, caused by having thick PCMs) - I think the baseline is what a users 
non-modular code is. And in non-modular code these non-inline functions would 
be in the implementation files, not able to cross-TU inline without LTO.

I think not providing definitions of non-inline functions for cross-TU 
optimizations is not a regression, but exactly in-line with existing 
non-modular behavior, which is totally fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-15 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu added a comment.

> However, "performance" also includes compilation speed in the 'no 
> optimisation, debug' case - that is also considered very important. So, 
> perhaps, the short-term approach should be (as @dblaikie suggested) to 
> include the bodies for -O >= 3?

I don't think so. I think "performance" refers to the runtime performance 
generally. I don't believe the normal users will be happy if modules will 
decrease the performance of their program in any means. So I think we should 
include the bodies by default.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-15 Thread Iain Sandoe via Phabricator via cfe-commits
iains added a comment.

In D144844#4195316 , @ChuanqiXu wrote:

> Got your points. Let's postpone this one.
>
> But I want to emphasize that this patch (and the thin PCM) will decrease the 
> performance. While LTO can save the regression, LTO is not widely used. 
> (ThinLTO can only mitigate this.) I mean the default behavior shouldn't cause 
> performance regression. We can offer a new alternative  for the users but we 
> can't enable that by default simply now.

Agree that we (ideally, at least) should not decrease performance with a new 
feature.
However, "performance" also includes compilation speed in the 'no optimisation, 
debug' case - that is also considered very important.  So, perhaps, the 
short-term approach should be (as @dblaikie suggested) to include the bodies 
for `-O` >= 3?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-14 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu planned changes to this revision.
ChuanqiXu added a comment.

It should be "Plan Changed" instead of "Abandoned".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-14 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu abandoned this revision.
ChuanqiXu added a comment.

Got your points. Let's postpone this one.

But I want to emphasize that this patch (and the thin PCM) will decrease the 
performance. While LTO can save the regression, LTO is not widely used. 
(ThinLTO can only mitigate this.) I mean the default behavior shouldn't cause 
performance regression. We can offer a new alternative  for the users but we 
can't enable that by default simply now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-14 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

In D144844#4193727 , @iains wrote:

> In D144844#4193568 , @dblaikie 
> wrote:
>
>> Seem to recall @iains and others were concerned about the number of modules 
>> flags - this one I'd be inclined to suggest we shouldn't add if possible. If 
>> one way or the other is generally better - we should, at least initially, 
>> dictate that behavior entirely until someone can demonstrate divergent use 
>> cases that seem reasonable to support but must have different behavior here.
>
> Agreed, [IMO, at least] we have a very large set of modules options, and 
> expanding that should only be done if no sensible alternative can be found 
> (we are already seeing users getting confused about which ones apply in each 
> case).
>
> A second point here - that (once we have the ability to generate object and 
> PCM in the same compilation) that we will move to elide the function bodies 
> for non-inline and non-dependent cases from the PCM, so that this problem 
> will magically "go away" (to restore the current behaviour, we'd then be 
> using some optimisation control to avoid the elision, I suppose).

Yeah, this seems like the simplest concrete point to suggest we just change the 
defaults here - that's the plan moving forward, and I don't think it's worth 
trying to maintain two codepaths/supports here - people can write the functions 
as explicitly `inline` when they want cross-module inlining & we can let 
non-inline functions be the same as they were before modules.

If, years from now, someone wants to prototype some stronger optimization 
opportunities in modules - let's do that later/then?

>> The performance of cross-module inlining could be achieved with something 
>> like ThinLTO if we were to lean in favor of not allowing cross-module 
>> inlining by default, for instance.
>
> +1 this seems exactly what LTO is intended for (also there are folks who seem 
> to have an expectation that the BMI somehow magically contains an optimised 
> representation of the source - which again is the province of LTO).

There's some advantages (build system complexity, time, etc) to doing it at 
compile-time, rather than link-time, but I don't think we have enough data to 
be worth the complexity for now - I'd vote for letting the feature go/removing 
it for now, and folks can try to bring it back with data/measurements later.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-14 Thread Iain Sandoe via Phabricator via cfe-commits
iains added a comment.

In D144844#4193568 , @dblaikie wrote:

> Seem to recall @iains and others were concerned about the number of modules 
> flags - this one I'd be inclined to suggest we shouldn't add if possible. If 
> one way or the other is generally better - we should, at least initially, 
> dictate that behavior entirely until someone can demonstrate divergent use 
> cases that seem reasonable to support but must have different behavior here.

Agreed, [IMO, at least] we have a very large set of modules options, and 
expanding that should only be done if no sensible alternative can be found (we 
are already seeing users getting confused about which ones apply in each case).

A second point here - that (once we have the ability to generate object and PCM 
in the same compilation) that we will move to elide the function bodies for 
non-inline and non-dependent cases from the PCM, so that this problem will 
magically "go away" (to restore the current behaviour, we'd then be using some 
optimisation control to avoid the elision, I suppose).

> The performance of cross-module inlining could be achieved with something 
> like ThinLTO if we were to lean in favor of not allowing cross-module 
> inlining by default, for instance.

+1 this seems exactly what LTO is intended for (also there are folks who seem 
to have an expectation that the BMI somehow magically contains an optimised 
representation of the source - which again is the province of LTO).

> But if there are particular idioms where cross-module inlining is 
> disadvantageous, perhaps we can implement better ways for clang to detect 
> them (or if they're undetectable, offer users a way to annotate their code, 
> maybe).

I'd be interested to see an example where cross-module function body imports 
somehow beats the equivalent LTO.

> Could we key off -Os or -Oz or some other existing optimization/compile time 
> tradeoff flags instead of introducing a new flag?

That's an interesting idea .. I'd suggest we should default the behaviour to 
"off" (so that compilation speed is prioritised for default options, as usual) 
and then maybe enable import of function bodie at O3 
 or so? [maybe even an optimisation 
representative of LTO .. so that when we slim the BMIs down we do not get 
complaints about regression in code performance].


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-14 Thread David Blaikie via Phabricator via cfe-commits
dblaikie added a comment.

Seem to recall @iains and others were concerned about the number of modules 
flags - this one I'd be inclined to suggest we shouldn't add if possible. If 
one way or the other is generally better - we should, at least initially, 
dictate that behavior entirely until someone can demonstrate divergent use 
cases that seem reasonable to support but must have different behavior here.

The performance of cross-module inlining could be achieved with something like 
ThinLTO if we were to lean in favor of not allowing cross-module inlining by 
default, for instance.

But if there are particular idioms where cross-module inlining is 
disadvantageous, perhaps we can implement better ways for clang to detect them 
(or if they're undetectable, offer users a way to annotate their code, maybe).

Could we key off -Os or -Oz or some other existing optimization/compile time 
tradeoff flags instead of introducing a new flag?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-03-13 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu added a comment.

@iains @dblaikie gentle ping~


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144844/new/

https://reviews.llvm.org/D144844

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144844: [C++20] [Modules] Offer -fno-import-inter-module-function-defs to avoid duplicated compilation in modules

2023-02-27 Thread Chuanqi Xu via Phabricator via cfe-commits
ChuanqiXu created this revision.
ChuanqiXu added reviewers: iains, Bigcheese, dblaikie.
ChuanqiXu added a project: clang-modules.
Herald added a project: All.
ChuanqiXu requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

Close https://github.com/llvm/llvm-project/issues/61015
Close https://github.com/llvm/llvm-project/issues/60996

Clang will import the function definitions from other module unit within
optimizations by default to get the best performance. But there are
cases users prefer faster compilation speed than the best performance.

Although we may not agree such ideas, we should offer an option for the
users to give them the right to choose.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144844

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/module-no-import-from-other-tu.cppm
  clang/test/Modules/inter-module-imports-function-defs.cppm

Index: clang/test/Modules/inter-module-imports-function-defs.cppm
===
--- /dev/null
+++ clang/test/Modules/inter-module-imports-function-defs.cppm
@@ -0,0 +1,57 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+//
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple %t/a.cppm \
+// RUN: -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple %t/b.cppm \
+// RUN: -emit-module-interface -fprebuilt-module-path=%t -o %t/b.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple %t/c.cppm \
+// RUN: -emit-module-interface -fprebuilt-module-path=%t -o %t/c.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple %t/c.pcm -S \
+// RUN: -emit-llvm -disable-llvm-passes -o - | FileCheck %t/c.cppm --check-prefix=NO-IMPORT
+//
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 %t/a.cppm \
+// RUN: -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 %t/b.cppm \
+// RUN: -emit-module-interface -fprebuilt-module-path=%t -o %t/b.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 %t/c.cppm \
+// RUN: -emit-module-interface -fprebuilt-module-path=%t -o %t/c.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 %t/c.pcm \
+// RUN: -S -emit-llvm -disable-llvm-passes -o - | FileCheck %t/c.cppm --check-prefix=IMPORT
+//
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 \
+// RUN: %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 \
+// RUN: %t/b.cppm -emit-module-interface -fprebuilt-module-path=%t -o %t/b.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 \
+// RUN: %t/c.cppm -emit-module-interface -fprebuilt-module-path=%t -o %t/c.pcm
+// RUN: %clang_cc1 -std=c++20 -triple %itanium_abi_triple -O3 \
+// RUN: -fno-import-inter-module-function-defs %t/c.pcm \
+// RUN: -S -emit-llvm -disable-llvm-passes -o - | FileCheck %t/c.cppm --check-prefix=NO-IMPORT
+
+//--- a.cppm
+export module a;
+export int a() {
+return 43;
+}
+
+//--- b.cppm
+export module b;
+export import a;
+export int b() {
+return 43 + a();
+}
+
+//--- c.cppm
+export module c;
+export import b;
+export int c() {
+return 43 + b() + a();
+}
+
+// NO-IMPORT: declare{{.*}}@_ZW1b1bv
+// NO-IMPORT: declare{{.*}}@_ZW1a1av
+
+// IMPORT: define available_externally{{.*}}@_ZW1b1bv
+// IMPORT: define available_externally{{.*}}@_ZW1a1av
Index: clang/test/Driver/module-no-import-from-other-tu.cppm
===
--- /dev/null
+++ clang/test/Driver/module-no-import-from-other-tu.cppm
@@ -0,0 +1,6 @@
+// RUN: %clang -O3 -fno-import-inter-module-function-defs -std=c++20 \
+// RUN: %s -### 2>&1 | FileCheck %s
+
+export module x;
+
+// CHECK: -fno-import-inter-module-function-defs
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -3850,6 +3850,10 @@
   Args.ClaimAllArgs(options::OPT_fmodule_output);
   Args.ClaimAllArgs(options::OPT_fmodule_output_EQ);
 
+  if (Args.hasFlag(options::OPT_fno_import_inter_module_function_defs,
+   options::OPT_fimport_inter_module_function_defs, false))
+CmdArgs.push_back("-fno-import-inter-module-function-defs");
+
   return HaveModules;
 }
 
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -3541,7 +3541,8 @@
   if (getFunctionLinkage(GD) != llvm::Function::AvailableExternallyLinkage)
 return true;
   const auto *F = cast(GD.getDecl());
-  if