https://github.com/ChuanqiXu9 created https://github.com/llvm/llvm-project/pull/178368
See the doc for details.

From 7fc2f7584bf307c906f55cc9367bd631643a2abe Mon Sep 17 00:00:00 2001
From: Chuanqi Xu
Date: Wed, 28 Jan 2026 13:54:25 +0800
Subject: [PATCH] [docs] [C++20] [Modules] Offer a method to use clang module map with named modules

---
 clang/docs/StandardCPlusPlusModules.rst | 275 ++++++++++++++++++++++++
 1 file changed, 275 insertions(+)

diff --git a/clang/docs/StandardCPlusPlusModules.rst b/clang/docs/StandardCPlusPlusModules.rst
index f6ab17ede46fa..95f86e3fc2753 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -1320,6 +1320,281 @@ indirectly imported internal partition units are not reachable.
 The suggested approach for using an internal partition unit in Clang is to only
 import them in the implementation unit.
 
+Using Clang Module Map to Avoid mixing #include and import problems
+-------------------------------------------------------------------
+
+.. note::
+  The approach discussed in this section is experimental.
+
+Problems Background
+~~~~~~~~~~~~~~~~~~~
+
+As discussed before, redeclarations across different TUs are one of the major problems
+of using modules from the perspective of the compiler. The redeclaration pattern
+is a major trigger of compiler bugs. Even when the compiler accepts the redeclaration
+pattern as expected, compilation performance still suffers.
+
+For example:
+
+.. code-block:: c++
+
+  // a.h
+  #pragma once
+  class A { ... };
+
+  // a.cppm
+  module;
+  #include "a.h"
+  export module a;
+  export using ::A;
+
+  // a.cc
+  import a;
+  #include "a.h"
+  A a;
+
+Here in ``a.cc``, ``A`` is declared twice: once via ``a.cppm`` and once in ``a.cc``
+itself.
+
+To avoid the redeclaration pattern, the previous section suggested that users comment
+out third-party headers manually.
+
+This section introduces another approach to avoiding the redeclaration pattern: using a
+Clang module map.
+
+Clang Module Map Background
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A Clang module map is a feature of Clang Header Modules. See `Clang Module <Modules.html>`_
+for a full introduction to Clang Header Modules. Here we introduce only as much of Clang
+Header Modules as is needed to keep this document self-contained.
+
+Clang Implicit Header Modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In Clang Implicit Header Module mode, Clang reads the module map, compiles the
+headers listed in it into module files, and uses those module files automatically.
+This sounds attractive, but due to its complexity it works less well in practice.
+To stay correct, Clang conservatively compiles the same header into different
+module files when it appears in different preprocessor contexts. This can itself trigger
+the problem of redeclarations in different TUs, so users of implicit header modules
+have to design their module system carefully, bottom-up. Clang's implicit header module mode also
+`has many issues with soundness and performance due to tradeoffs made for module
+reuse and filesystem contention
+<https://discourse.llvm.org/t/clang-modules-build-daemon-build-system-agnostic-support-for-explicitly-built-modules>`_.
+
+Clang Explicit Header Modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Clang explicit header modules offload the job of creating and managing module files
+to the build system. Because C++20 modules and Clang header modules share the
+same underlying implementation, it is possible to reuse the Clang module map
+interface for C++20 named modules.
+
+Technically, Clang Explicit Header Modules may be able to solve the redeclaration problem.
+Consider the earlier example.
+
+Repeated here for reference:
+
+.. code-block:: c++
+
+  // a.h
+  #pragma once
+  class A { ... };
+
+  // a.cppm
+  module;
+  #include "a.h"
+  export module a;
+  export using ::A;
+
+  // a.cc
+  import a;
+  #include "a.h"
+  A a;
+
+The build system can build the header into a module file and use it in both ``a.cppm`` and ``a.cc``.
+Then there is no redeclaration in the example: all declarations of ``class A`` come from the
+synthesized TU ``a.h``.
+
+But there are problems: (1) the build system needs to support Clang explicit modules;
+(2) the interaction between Clang named modules and Clang header modules is theoretically fine but
+not verified in practice. Also, this document is about standard C++ modules, so we won't
+expand on this here.
+
+Examples
+~~~~~~~~
+
+To use a Clang module map with C++20 named modules in practice, end users have to wait for
+build system support. Here we set build systems aside and invoke Clang directly to illustrate the
+mechanism.
+
+Here is an example of using a Clang module map to replace the inclusion of a header with an import of a module.
+
+.. code-block:: c++
+
+  // a.h
+  #pragma once
+  static_assert(false, "don't include a.h");
+
+  // main.cpp
+  #include "a.h"
+  int main() {
+    return 0;
+  }
+
+  // a.cppm
+  module;
+  #include <iostream>
+  export module a;
+  struct Init {
+    Init() {
+      std::cout << "Module 'a' got imported" << std::endl;
+    }
+  };
+  Init a;
+
+  // a.cppm.modulemap
+  module a {
+    header "a.h"
+  }
+
+Then invoke Clang with:
+
+.. code-block:: console
+
+  $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -o a.o
+  $ clang++ -std=c++20 main.cpp -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm a.o -o main
+  $ ./main
+  Module 'a' got imported
+
+We can see that the header ``a.h`` is not actually included (otherwise compilation would fail due to the static assertion).
+Instead, module ``a`` is imported, and the variable in module ``a`` is initialized.
+
+The key is the flag ``-fmodule-map-file=a.cppm.modulemap``. The content of ``a.cppm.modulemap`` says:
+map the #include of ``a.h`` to an import of module ``a``. So when the compiler sees ``#include "a.h"``, it
+does not actually include ``a.h`` but tries to import module ``a`` instead. From the command-line flag ``-fmodule-file=a=a.pcm``,
+the compiler obtains the module file for module ``a``; that module file is imported and the inclusion of ``a.h``
+is skipped.
+
+We can use the same mechanism to avoid the redeclaration pattern for modules that wrap headers.
+
+.. code-block:: c++
+
+  // a.h
+  #pragma once
+  class A { ... };
+
+  // a.cppm
+  module;
+  #include "a.h"
+  export module a;
+  export using ::A;
+
+  // a.cc
+  import a;
+  #include "a.h"
+  A a;
+
+  // a.cppm.modulemap
+  module a {
+    header "a.h"
+  }
+
+Similarly, if we add the flag ``-fmodule-map-file=a.cppm.modulemap`` when compiling ``a.cc``, the compiler
+maps the inclusion of ``a.h`` to an import of module ``a``, which is already imported.
+This avoids the redeclaration of class ``A`` in ``a.cc``.
+
+One foreseeable problem with this approach is hidden (transitive) inclusion. For example:
+
+.. code-block:: c++
+
+  // b.h
+  #pragma once
+  struct B {};
+
+  // a.h
+  #pragma once
+  #include "b.h"
+  struct A { B b; };
+
+  // b.cppm
+  export module b;
+  export extern "C++" struct B { };
+
+  // a.cppm
+  export module a;
+  import b;
+  export extern "C++" struct A { B b; };
+
+  // test.cc
+  import a;
+  #include "a.h"
+  A a;
+  B b;
+
+  // a.cppm.modulemap
+  module a {
+    header "a.h"
+  }
+
+  // b.cppm.modulemap
+  module b {
+    header "b.h"
+  }
+
+The example is valid without the module map:
+
+.. code-block:: console
+
+  $ clang++ -std=c++20 b.cppm -c -fmodule-output=b.pcm -o b.o
+  $ clang++ -std=c++20 a.cppm -c -fmodule-output=a.pcm -fmodule-file=b=b.pcm -o a.o
+  $ clang++ -std=c++20 test.cc -fmodule-file=a=a.pcm -fmodule-file=b=b.pcm -fsyntax-only
+
+But with the module map enabled, the example is invalid, because ``test.cc`` no longer sees ``b.h`` through ``a.h`` and never imports ``b``:
+
+.. code-block:: console
+
+  $ clang++ -std=c++20 test.cc -fmodule-map-file=a.cppm.modulemap -fmodule-file=a=a.pcm -fmodule-map-file=b.cppm.modulemap -fmodule-file=b=b.pcm -fsyntax-only
+  test.cc:4:1: error: declaration of 'B' must be imported from module 'b' before it is required
+      4 | B b;
+        | ^
+  b.cppm:2:28: note: declaration here is not visible
+      2 | export extern "C++" struct B { };
+        |                            ^
+  1 error generated.
+
+A suggested convention for end users and build systems
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As noted, the build system plays a vital role in this strategy.
+However, it is not easy for build systems to support Clang explicit header modules, or
+module maps with C++20 named modules, in general. The complexity for build systems
+would be no less than that of supporting C++20 named modules.
+
+So here we suggest a convention between end users and build systems that eases the implementation
+burden on build systems and helps end users avoid the redeclaration problem caused by mixing #include
+and import.
+
+For end users who are authors of a header-based library that offers named module wrappers, the header's interface
+should be a subset of the module interface, excluding user-facing macros.
+
+* Extract all user-facing headers into a single header file. Since C++20 named modules
+* For each named module interface, provide a module map file to map the interface headers to the named module.
+  The name of the module map should be the name of the module interface unit plus ``.modulemap``.
+
+The number of module maps should be small, since this is still a
+header-based library.
+
+For build systems:
+
+* For each translation unit, if the unit doesn't import any named modules, stop; such a unit is not what this convention targets.
+* If the TU imports named modules, then for each imported named module unit, look for a module map file in the same directory as the module unit, named after the module unit plus ``.modulemap``. For example, if the module unit is ``a.cppm``, look for ``a.cppm.modulemap``.
+* For each module map found, pass ``-fmodule-map-file=<module_map_file_path>`` to the Clang compiler.
+
+The point of this approach is that the build system can reuse the results of C++20 named modules to manage dependencies, so
+the implementation burden on build systems is largely reduced.
+
 Known Issues
 ------------
