sammccall added a comment.

In D153114#4602414 <https://reviews.llvm.org/D153114#4602414>, @ChuanqiXu wrote:

>> Don't attempt any cross-file or cross-version coordination: i.e. don't try 
>> to reuse BMIs between different files, don't try to reuse BMIs between 
>> (preamble) reparses of the same file, don't try to persist the module graph. 
>> Instead, when building a preamble, synchronously scan for the module graph, 
>> build the required PCMs on the single preamble thread with filenames private 
>> to that preamble, and then proceed to build the preamble.
>
> Do I understand right? If I understand correctly, I fully agree with the 
> direction. We can go slowly, as long as we keep moving forward.
>
> Then I'd like to leave the patch as-is for referencing and create new patches 
> following the suggestion.

Yes, that's the suggestion, and that plan makes sense to me, thanks!

I did some more thinking about this (having a concrete implementation helps a 
lot!) and had a couple more thoughts.
At some point we should write down a design somewhere, need to strike a balance 
between doing it early enough to be useful but late enough that we've 
understood!

Dep scanning - roles
--------------------

IIUC we do this for two reasons:

- to identify what module names we must have PCMs for in order to build a given 
TU (either an open file, or a module we're building as PCM)
- to build a database mapping module name => filename, which we compose with 
the CDB to know how to build a PCM for a given module name

I think it would be good to clearly separate these. The latter is simpler, more 
performance-critical, async, and is probably not used at all if the build 
system can tell us this mapping.
The latter is more complex, and will always be needed synchronously for the 
mainfile regardless of the build system.

Dep scanning - implementation
-----------------------------

The dep scanner seems to work by getting the compile command and running the 
preprocessor. This is fairly heavyweight, and I can't see anywhere it's going 
into single-file mode - is it really reading all `#included` headers? This is 
certainly not workable for reparses of the mainfile (when no headers have 
changed).

It seems unneccesary: the standard seems to go to some lengths to ensure that 
we (almost) only need to lex the top-level file:

- module and import decls can't be produced by macros (seems to be the effect 
of the `pp-module` directive etc)
- module and import decls can't be `#include`d (definition of `module-file` and 
`[cpp.import]` rules)

The wrinkle I see is that some PP usage is possible: the module name can be 
produced by a macro, and imports can be `#ifdef`d. I think the former is very 
unlikely (like `#include MACRO_NAME`) and we can not support it, and the latter 
will just result in us overestimating the deps, which seems OK.
You have more context here though, and maybe I'm misreading the restrictions 
placed by the standard. Today clang doesn't seem to enforce most of these sort 
of restrictions, which is probably worth fixing if they're real.

(This doesn't apply to header modules: it's perfectly possible to include a 
textual header which includes a modular header, and it's impossible to know 
without actually preprocessing. This divergence is nasty, but I don't think we 
should pessimize standard modules for it).

Interaction with preamble
-------------------------

At a high level, `import` decls should be processed with the preamble: they 
should change infrequently, rebuilding modules is expensive, coarse-grained 
work, we want to make the same policy decisions on whether to use stale PCMs or 
block on fresh ones etc.
However they don't appear in a prefix of the file, and this is pretty important 
to how the preamble action works, so exactly in what sense are they part of the 
preamble?

I believe the best answer is:

- "preamble" is really a set of required artifacts + conditions to check for 
validity
- `import foo` in a file means `foo.pcm` is a required artifact, not that 
`preamble.pcm` contains an `ImportDecl`

So given this code:

  module;
  #include <x>
  module foo;
  import dep1;
  module :private;
  import dep2;

The "preamble region" should end after `#include <x>` and preamble.pcm should 
contain the AST & PP state up to that point.
Meanwhile `dep1.pcm` and `dep2.pcm` are separate PCM files that will be loaded 
on each parse.
For a preamble to be usable at all, we need to have built `preamble.pcm`, 
`dep1.pcm`, `dep2.pcm`.
For a preamble to be up-to-date, the preamble region + set of imported modules 
must be unchanged, the preamble.pcm must be up to date with respect to its 
sources, and the module PCMs must be up-to-date with respect to their sources. 
(unclear exactly how to implement the latter, may need to add extra tracking)

(When building this module as a PCM we may not want to treat dep2 as a 
dependency or parse the private fragment... but this is not relevant to 
preambles as we won't be using a preamble for this anyway)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153114/new/

https://reviews.llvm.org/D153114

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to