Re: [PATCH] Allow multiple modules with the same name to coexist in the module cache

Ben Langmuir Mon, 31 Mar 2014 13:08:35 -0700

On Mar 31, 2014, at 11:40 AM, Ben Langmuir <[email protected]> wrote:


> 
> On Mar 28, 2014, at 4:42 PM, Richard Smith <[email protected]> wrote:
> 
>> On Fri, Mar 28, 2014 at 3:27 PM, Ben Langmuir <[email protected]> wrote:
>> Hi Richard,
>> 
>> 
>> On Mar 28, 2014, at 2:45 PM, Richard Smith <[email protected]> wrote:
>> 
>>> On Fri, Mar 28, 2014 at 1:12 PM, Ben Langmuir <[email protected]> wrote:
>>> This patch allows multiple modules that have the same name to coexist in 
>>> the module cache.  To differentiate between two modules with the same name, 
>>> we will consider the path the module map file that they are defined by* 
>>> part of the ‘key’ for looking up the precompiled module (pcm file).  
>>> Specifically, this patch renames the precompiled module (pcm) files from
>>> 
>>> cache-path/<module hash>/Foo.pcm
>>> 
>>> to
>>> 
>>> cache-path/<module hash>/Foo-<hash of module map path>.pcm
>>> 
>>> From a high level, I don't really see why we need a second hash here. 
>>> Shouldn't the -I options be included in the <module hash>? If I build the 
>>> same module with different -I flags, that should resolve to different .pcm 
>>> files, regardless of whether it makes the module name resolve to a 
>>> different module.map file.
>> 
>> If we include the -I options in the module hash, we will explode the number 
>> of module compilations needed.  The following should all be able to share a 
>> module ‘A’.
>> 
>> I really don't think they should, if that second -I path is used in any way 
>> when building module 'A’.
>> 
>> clang -fmodules -I /path/to/A -I /path/to/B some_file.c
>> clang -fmodules -I /path/to/A -I /path/to/C some_file2.c
>> clang -fmodules -I /path/to/A -I /path/to/D some_file3.c
>> 
>> I would think a better solution here would be to not have that second -I 
>> path in header search when building module A (and thus not include it in the 
>> hash for a principled reason).
>> 
>> Ultimately, each module should have its own header search path -- the model 
>> of one set of include paths for the entire TU (with that include path 
>> potentially causing some libraries to find the wrong files) is a broken, 
>> antiquated legacy of the non-modules world. 
> 
> I agree with this in principle - having a predictable context for building 
> the modules would solve a lot of problems.  However, that seems to only work 
> for semantic import (@import, not #include), which is the less common case 
> for us at the moment.

Oh, BTW we’re planning to add a flag that will make including non-modular 
content from inside a module an error, even with auto import.  That would 
prevent a lot of the problems of having include paths accidentally change the 
content of a module.

> 
>> In the short term, we should probably drop all header search paths that are 
>> before the path in which the module map was found.
> 
> To be clear, this has the same correctness problem as my approach - the 
> header search paths that come before the module map may still change the 
> content of the module, especially if the module depends on other modules.
> 
> In general, adding the -I options is brittle when precompiled headers are 
> involved. Since you likely don’t have exactly the same -I options when 
> building your PCH as when you use it (you likely have fewer -I paths when 
> building the PCH if it is being widely used) you could potentially have 
> different module hashes in the PCH and in the main file, which leads to 
> trying to load multiple copies of a module from different paths.  I ran into 
> this when I tried to naively add all of the -I options to the module hash.  
> The heuristic you suggested for dropping the earlier paths might help here, 
> but it would be required for correctness, not merely as an optimization.
> 
> I am also not sure what effect this would have on the global module index, 
> since the set of modules to load would span multiple hash-directories and we 
> would need to avoid looking at incompatible modules.  I’m not sure if this is 
> a problem or not.
> 
> Any thoughts?
> 
> Ben
> 
>> 
>>> Are you trying to cope with the case where the -I path finds multiple 
>>> module.map files defining the same module (where it's basically chance 
>>> which one will get built and used)? I don't really feel like this is the 
>>> right solution to that problem either -- we should remove the 'luck' aspect 
>>> and use some sane mechanism to determine which module.map files are loaded, 
>>> and in what order.
>> 
>> Nope - that is not being addressed.
>> 
>>> 
>>> Is this addressing some other case?
>>> 
>> 
>> See above.
>> 
>>>  
>>> In addition, I’ve taught the ASTReader to re-resolve the names of imported 
>>> modules during module loading so that if the header search context changes 
>>> between when a module was originally built and when it is loaded we can 
>>> rebuild it if necessary.  For example, if module A imports module B
>>> 
>>> first time:
>>> clang -I /path/to/A -I /path/to/B …
>>> 
>>> second time:
>>> clang -I /path/to/A -I /different/path/to/B …
>>> 
>>> will now rebuild A as expected.
>>> 
>>> 
>>> * in the case of inferred modules, we use the module map file that 
>>> *allowed* the inference, not the __inferred_module.map file, since the 
>>> inferred file path is the same for every inferred module.
>>> 
>>> 
>>> Review comments on the patch itself:
>>> 
>>>  +  /// the inferrence (e.g. contained 'module *') rather than the virtual
>>> 
>>> Typo 'inference', 'Module *'.
>>> 
>>> +  /// For an explanaition of \p ModuleMap, see Module::ModuleMap.
>>> 
>>> Typo 'explanation'.
>>> 
>>> +  // safe becuase the FileManager is shared between the compiler instances.
>>> 
>>> Typo ‘because'
>> 
>> Thanks for catching the embarrassing quantity of typos :)
>> 
>>> 
>>> +  // the inferred module. If this->ModuleMap is nullptr, then we are using
>>> +  // -emit-module directly, and we cannot have an inferred module.
>>> 
>>> I don't understand what this comment is trying to say. If we're using 
>>> -emit-module, then we were given a module map on the command-line; should 
>>> that not be referred to by this->ModuleMap? (Also, why 'this->'?) How can a 
>>> top-level module be inferred? Is that a framework-specific thing?
>> 
>> Hmm, I don’t recall why I didn’t just pass in the InputFile as the module 
>> map.  I’ll do that.
>> 
>> Yes, AFAIK only framework modules can be inferred at the top-level.
>> 
>>> 
>>> +    StringRef ModuleMap = this->ModuleMap ? this->ModuleMap->getName() : 
>>> InFile;
>>> 
>>> Please pick a different variable name rather than shadowing a member of 
>>> '*this' here.
>> 
>> Will do.
>> 
>>> 
>>> +    // Construct the name <ModuleName>-<hash of ModuleMapPath>.pcm which 
>>> should
>>> +    // be globally unique to this particular module.
>>> +    llvm::APInt Code(64, llvm::hash_value(ModuleMapPath));
>>> +    SmallString<128> HashStr;
>>> +    Code.toStringUnsigned(HashStr);
>>> 
>>> Use base 36, like the module hash.
>> 
>> I could have sworn I did… must have got lost along the way.  Will do.
>> 
>> 
> 
> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] Allow multiple modules with the same name to coexist in the module cache

Reply via email to