[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)

Ilya Biryukov via cfe-commits Mon, 02 Jun 2025 08:04:50 -0700

ilya-biryukov wrote:

> I'm sorry that my replies came across as hostile and snarky, I didn't mean it 
> that way. Though I'll admit I'm not thrilled about your current approach, 
> that's what I was communicating. But that's what you have right now and there 
> is no easy way to change it fast.


Thanks a lot, I am sorry for misreading or misunderstanding your comments too. 
I hope we can put this behind ourselves.

> and the lack of the commitment to any work on your side made it sound as a 
> demand.

Sorry that it came out this way. I meant to say that changing this would likely 
be a very lengthy and labor-intenstive commitment on our side. Simply because 
it's so closely interacting with the build system and the build system is used 
by all of our code, I expect that we'll hit every corner case imaginable no 
matter what change we do.

It does not mean that we should not do it if that's for the better of Clang, 
just want to make sure we have the tools at our disposal to land a change 
internally.

> I think I understand the logic behind your approach. 

Right, I believe you're right on spot with the conclusions.

> Access to headers from .cc files is restricted through the used module maps 
> and maybe through some Bazel sand-boxing. Doesn't particularly matter in this 
> case.

One clarification here, for .cc files we simply pass the same module maps and 
set `-fmodule-name=<cc file's library>`, making Clang treat `.cc` file as 
internal to the module, the rest of the access checking rules applied are the 
same.


> What about the approach to provide all_textual.pcm to use.cpp and for clang 
> to find foo.h in all_textual.pcm? It doesn't work right now but I'm not sure 
> that it shouldn't (there can be good reasons why it shouldn't work, tbh).

We don't actually produce `.pcm` files for the modules that only have textual 
headers. The rational is that doing  so would be detrimental to build 
performance and resource usage on our distributed build infrastructure. 
Normally `.pcm` files are a net win because they allow to save work required to 
generate the AST in their dependecies. However, if all the headers in a module 
are textual, we merely carry around an empty PCM and add unnecessary 
dependencies in the build action graph (`.cc` files that depend only on textual 
headers will require the PCM even though it contains no useful information).

We also have content-addressed caches for the inputs to build actions that 
benefit from the fine granularity. When producing a `.pcm`, embedding the 
contents turns our to be a net win downstream because we have to carry it 
around anyway and any change in its headers will cause the PCM itself to 
change. But shipping all textual headers like this would make this cache to be 
evicted much more frequently, increase the size of inputs to actions depending 
on those libraries and decrease the cache hit rates.

If we were to make such a change in Clang, we'll have to make a few more 
changes:

-  `-fmodules-embed-all-files` currently only writes files that were actually 
read during the compilation. In order to be able to ship `all_textual.pcm` we 
would need to add additional logic to write all files mentioned in module maps.
-   In addition, after landing the current PR, we would end up in a situation 
where the modules the compilation may succeed or fail when the textual headers 
are missing depending on the presence of `-fmodules-embed-all-files` flag. This 
feels weird, although I feel that the flag itself is rather specialized, so 
it's probably not the end of the world for those using it (I only know of our 
use-case, though, so may be missing some important problems that others have).

> I'm relieved that you aren't a heavy user of textual headers to make macros 
> work (this is one of the use cases for textual headers).

That's one of the use cases too, but indirectly. If the headers define macros, 
users are expected to mark them as such. Sometimes this is necessary, e.g. 
certain C++ libraries do rely on macros and our code can have those libraries 
as dependencies.



## Looping back to #141792. 

(I am not sure if this was already addressed, sorry if I missed the comments).
How does the idea of always preferring to load a header from PCM sound to folks?
Given that access checking of headers and module resolution are independent in 
Clang right now and given that same header in multiple modules is rare, it 
seems like a behavior change we could potentially agree on.

That being said, I also wouldn't be surprised if folks don't like this, it's 
certainly not the cleanest design one could imagine.
I would want to hear opinions on that approach.

https://github.com/llvm/llvm-project/pull/138227
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Modules] Don't fail when an unused textual header is missing. (PR #138227)

Reply via email to