ilya-biryukov wrote: > So far we are still aiming for a more principled solution. For example, if > support of a header in multiple modules is a supported feature, then some > kind of a spec with a more comprehensive testing would be appropriate. It is > up to you but personally I feel like that's a lot of work and it is more > pragmatic to skirt the issue and make sure your use case keeps working with > smaller investments.
I agree we should have better testing and specification, but to your point it is a lot of work and for small issues like this we rather need a quick solution. It's not to say that we should never do it, just want to stay pragmatic about it. If this gets **really** gets in the way, we will absolutely have to invest here. > Personally, at the moment I'm interested in how -fmodules-embed-all-files > works and if its behavior is consistent. Though it is possible I like putting > stuff there because we don't rely on this flag and it feels less risky. > Anyway, the first impression of using -fmodules-embed-all-files makes me > suspicious it works inconsistently. The mental model that I have after staring at the implementation is that any file that `SourceManager` reads gets written into a PCM and should later be used whenever someone looks up the same file (at least with the same path, symlinks make this much-much more complicated and I don't think there's any good semantics there). However, this mental model breaks down when looking at your example. What's worse, it really depends on whether any `FileID` used by that file was ever deserialized, e.g. here's a slightly modified example: ```cpp // RUN: rm -rf %t // RUN: split-file %s %t // // Build modules 'a' and 'b' embedding all files. // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-name=a -emit-module -xc++ -fmodules-embed-all-files -o %t/a.pcm %t/modules1.map // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules2.map -fmodule-name=b -emit-module -xc++ -fmodules-embed-all-files -o %t/b.pcm %t/modules2.map // // Try to access foo.h from a module. // RUN: rm %t/foo.h // RUN: %clang_cc1 -fmodules -fmodule-map-file=%t/modules1.map -fmodule-file=%t/a.pcm -fsyntax-only %t/use.cpp //--- modules1.map module a { header "foo.h" header "bar.h" export * } //--- modules2.map module b { private header "foo.h" private header "bar.h" export * } //--- foo.h [[deprecated("to show location")]] void foo(void) ; //--- bar.h #include "foo.h" //--- use.cpp #include "bar.h" void test() { foo(); // When this call is present, no error below. Because the deprecation warning gets shown and FileID for `foo.h` gets deserialized. // If you comment out the previous line, the #include below gives an error 🤷 } #include "foo.h" ``` I would actually consider this a bug, so my take on it: your example should compile just fine and my follow-up example should not depend on whether there is an error mentioning `foo.h` (or something else that causes a deserialization) before `#include`. My idea to fix this would be to store overridden buffers separately and make sure we always create virtual files for all of them by path, the contents should actually be loaded lazily to avoid wasting memory. I am happy to take a stab at this, but wanted to confirm this direction seems right in advance. As for the original issue, I'll wait for some more thoughts from your side on what the principled solution would be. Happy to pursue other directions and help implement them too. PS I won't be able to respond for the next 10 days, back on June 16. Sorry about the inconvenience. If you urgently need to land the original patch, please do, I have asked @alexfh to locally revert it until I'm back and can handle it. It seems a small enough change and we understand the root cause and impact pretty well now, so I am sure we can deal with it until there's a final solution in place. https://github.com/llvm/llvm-project/pull/138227 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits