Re: [PATCH] Allow multiple modules with the same name to coexist in the module cache

Ben Langmuir Tue, 08 Apr 2014 16:41:33 -0700

Gentle ping. Richard, did you want to continue to review this patch in detail, 
or can I commit now that the high-level issues are resolved (at least for this 
patch)? I’m happy to evolve this patch post-commit or here in this review 
thread.


Ben

On Apr 4, 2014, at 4:13 PM, Ben Langmuir <[email protected]> wrote:

> 
> On Apr 4, 2014, at 2:35 PM, Richard Smith <[email protected]> wrote:
> 
>> On Fri, Apr 4, 2014 at 10:44 AM, Ben Langmuir <[email protected]> wrote:
>> 
>> On Apr 4, 2014, at 10:35 AM, Richard Smith <[email protected]> wrote:
>> 
>>> 
>>> On 4 Apr 2014 09:09, "Ben Langmuir" <[email protected]> wrote:
>>> >
>>> >
>>> > On Apr 3, 2014, at 7:49 PM, Richard Smith <[email protected]> wrote:
>>> >
>>> >> On Thu, Apr 3, 2014 at 3:40 PM, Ben Langmuir <[email protected]> wrote:
>>> >>>
>>> >>>
>>> >>> On Mar 28, 2014, at 2:45 PM, Richard Smith <[email protected]> 
>>> >>> wrote:
>>> >>>
>>> >>>> On Fri, Mar 28, 2014 at 1:12 PM, Ben Langmuir <[email protected]> 
>>> >>>> wrote:
>>> >>>>>
>>> >>>>> This patch allows multiple modules that have the same name to coexist 
>>> >>>>> in the module cache.  To differentiate between two modules with the 
>>> >>>>> same name, we will consider the path the module map file that they 
>>> >>>>> are defined by* part of the ‘key’ for looking up the precompiled 
>>> >>>>> module (pcm file).  Specifically, this patch renames the precompiled 
>>> >>>>> module (pcm) files from
>>> >>>>>
>>> >>>>> cache-path/<module hash>/Foo.pcm
>>> >>>>>
>>> >>>>> to
>>> >>>>>
>>> >>>>> cache-path/<module hash>/Foo-<hash of module map path>.pcm
>>> >>>>
>>> >>>>
>>> >>>> From a high level, I don't really see why we need a second hash here. 
>>> >>>> Shouldn't the -I options be included in the <module hash>? If I build 
>>> >>>> the same module with different -I flags, that should resolve to 
>>> >>>> different .pcm files, regardless of whether it makes the module name 
>>> >>>> resolve to a different module.map file.
>>> >>>>
>>> >>>> Are you trying to cope with the case where the -I path finds multiple 
>>> >>>> module.map files defining the same module (where it's basically chance 
>>> >>>> which one will get built and used)? I don't really feel like this is 
>>> >>>> the right solution to that problem either -- we should remove the 
>>> >>>> 'luck' aspect and use some sane mechanism to determine which 
>>> >>>> module.map files are loaded, and in what order.
>>> >>>>
>>> >>>> Is this addressing some other case?
>>> >>>>
>>> >>>>  
>>> >>>>>
>>> >>>>> In addition, I’ve taught the ASTReader to re-resolve the names of 
>>> >>>>> imported modules during module loading so that if the header search 
>>> >>>>> context changes between when a module was originally built and when 
>>> >>>>> it is loaded we can rebuild it if necessary.  For example, if module 
>>> >>>>> A imports module B
>>> >>>>>
>>> >>>>> first time:
>>> >>>>> clang -I /path/to/A -I /path/to/B …
>>> >>>>>
>>> >>>>> second time:
>>> >>>>> clang -I /path/to/A -I /different/path/to/B …
>>> >>>>>
>>> >>>>> will now rebuild A as expected.
>>> >>>>>
>>> >>>>>
>>> >>>>> * in the case of inferred modules, we use the module map file that 
>>> >>>>> *allowed* the inference, not the __inferred_module.map file, since 
>>> >>>>> the inferred file path is the same for every inferred module.
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> Review comments on the patch itself:
>>> >>>>
>>> >>>>  +  /// the inferrence (e.g. contained 'module *') rather than the 
>>> >>>> virtual
>>> >>>>
>>> >>>> Typo 'inference', 'Module *'.
>>> >>>>
>>> >>>> +  /// For an explanaition of \p ModuleMap, see Module::ModuleMap.
>>> >>>>
>>> >>>> Typo 'explanation'.
>>> >>>>
>>> >>>> +  // safe becuase the FileManager is shared between the compiler 
>>> >>>> instances.
>>> >>>>
>>> >>>> Typo 'because'
>>> >>>>
>>> >>>> +  // the inferred module. If this->ModuleMap is nullptr, then we are 
>>> >>>> using
>>> >>>> +  // -emit-module directly, and we cannot have an inferred module.
>>> >>>>
>>> >>>> I don't understand what this comment is trying to say. If we're using 
>>> >>>> -emit-module, then we were given a module map on the command-line; 
>>> >>>> should that not be referred to by this->ModuleMap? (Also, why 
>>> >>>> 'this->'?) How can a top-level module be inferred? Is that a 
>>> >>>> framework-specific thing?
>>> >>>>
>>> >>>> +    StringRef ModuleMap = this->ModuleMap ? 
>>> >>>> this->ModuleMap->getName() : InFile;
>>> >>>>
>>> >>>> Please pick a different variable name rather than shadowing a member 
>>> >>>> of '*this' here.
>>> >>>>
>>> >>>> +    // Construct the name <ModuleName>-<hash of ModuleMapPath>.pcm 
>>> >>>> which should
>>> >>>> +    // be globally unique to this particular module.
>>> >>>> +    llvm::APInt Code(64, llvm::hash_value(ModuleMapPath));
>>> >>>> +    SmallString<128> HashStr;
>>> >>>> +    Code.toStringUnsigned(HashStr);
>>> >>>>
>>> >>>> Use base 36, like the module hash.
>>> >>>
>>> >>>
>>> >>> I’ve attached an updated patch.  Changes since the previous one:
>>> >>> 1. Fixed the typos and other issues Richard pointed out
>>> >>> 2. I’ve added code to canonicalize the module map path (using 
>>> >>> realpath); I was getting spurious failures on case-intensitive 
>>> >>> filesystems.
>>> >>
>>> >>
>>> >> This part is probably not OK, because it'll do the wrong thing on some 
>>> >> build farms (where the canonical path is not predictable, but the path 
>>> >> that make_absolute returns is, by careful control of $PWD). I'll look 
>>> >> into this more, but will be traveling for the next couple of days.
>>> >
>>> >
>>> > Okay, I have a better idea: I already store the actual module map path in 
>>> > MODULE_MAP_FILE, so when that is read, we can just get a FileEntry for it 
>>> > and compare to what header search finds for the current module (since 
>>> > FileEntries are uniqued by inode).  That also means I can give a better 
>>> > diagnostic with the module map paths rather than the pcm filenames.
>>> 
>>> Sounds great, thanks!
>>> 
>>> 
>> 
>> Nuts, it turns out I still need a canonical path - when we hash the path 
>> into the pcm name we don’t want to get a different value depending on the 
>> case-sensitivity of the the file system.
>> 
>> Shall I just make_absolute and then realpath()? That should incorporate $PWD 
>> correctly, I think.
>> 
>> That'll resolve symlinks in $PWD -- the problem is with build farms that 
>> don't (or can't) make the file system look the same on each target, but try 
>> to fake it up with symlinks; if you resolve the symlinks, you defeat this 
>> attempt.
> 
> I don’t know how that could break anything with the pcm filenames, but sure.
> 
>> How about folding the path to the module map to lowercase before hashing it, 
>> then when we come to load the module map, check that the FileEntry found by 
>> header search for the module matches the FileEntry for the actual module map 
>> path as you were planning? We'd get (and detect and recover from) cache slot 
>> collisions if two module map files differ only by case, but that sounds 
>> pretty unlikely to me.
> 
> I can live with that.  The attached patch uses StringRef::lower() and drops 
> the modifications to FileManager.
> 
> <modulemappath-v4.patch>
> 
>> 
>> Ben
>> 
>>> >>
>>> >>> 3. I’ve moved the initialization of the MainFileID (in source manager) 
>>> >>> from Execute() to BeginSourceFile(), since we are now potentially 
>>> >>> creating file ids for module map files during pch loading and need to 
>>> >>> be able to find the main file reliably to construct a correct include 
>>> >>> stack.
>>> >
>>> >
>>>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] Allow multiple modules with the same name to coexist in the module cache

Reply via email to