On Mon, Jun 22, 2009 at 4:59 PM, Antoine Pitrou <solip...@pitrou.net> wrote:
> - the **MD5** hash of the file, encoded in hex. Notice that `pyc` and > `pyo` > generated files will not have a hash. > > Why the exception for pyc and pyo files? As in PEP 262, since they are produced automatically from a py file, checking the py file is enough to decide if the file has changed. > > - `zlib` and `zlib-2.5.2.egg-info` are located in `site-packages` so the > file > paths are relative to it. > > Is it a general rule? That is, the paths in RECORD are always relative to > its > grandparent directory? no, they can be located anywhere on the system. But when the paths are located in the same directory where the .egg-info directory is located, a relative path is used. (see the section before this example) I'll add an example that contains files located elswhere in the system (like a script and a data file) > > > The RECORD format > ----------------- > > The `RECORD` file is a CSV file, composed of records, one line per > installed file. The ``csv`` module is used to read the file, with > the `excel` dialect, which uses these options to read the file: > > - field delimiter : `,` > - quoting char : `"`. > - line terminator : `\r\n` > > Wouldn't it be better to use the native line terminator on the current > platform? (someone might want to edit or at least view the file) Good idea, I'll change that, > > > What is the character encoding for non-ASCII filenames? UTF-8? > Yes, there's a constant in Distutils, called PKG_INFO_ENCODING that will be used for the file generation. > > Are the RECORD file's contents somehow part of the DistributionMetadata? The DistributionMetadata correspond to the fields defined in PEP 345, e.g. written in the PKG-INFO file, which is mentioned in the RECORD file. We are reworking PEP 345 as well, to add some fields. What did you have in mind ? > > > - ``DistributionDirectories``: manages ``EggInfoDirectory`` instances. > > What is an EggInfoDirectory ? Typo (old name), fixing this.. > > > A plural class name looks strange (I think it's the first time I see one in > the CPython codebase). How about another name? (DistributionPool, > DistributionMap, WorkingSet etc.). Sure, WorkingSet is nice, it's the name used in setuptools, > > > - ``get_egginfo_file(path, binary=False)`` -> file object > > Returns a file located under the `.egg-info` directory. > > Returns a ``file`` instance for the file pointed by ``path``. > > Is it always a file instance or just a file-like object? (zipped > distributions > come to mind). Is it opened read-only? It's in read-only mode, either "r" either "rb" and in case of a zip file, it returns a file-like object using zipfile.ZipFile.open. > > - ``owner(path)`` -> ``Distribution`` instance or None > > If ``path`` is used by only one ``Distribution`` instance, returns it. > Otherwise returns None. > > This is a bit confusing. If it returns None, it doesn't distinguish between > the > case where several Distributions refer to the path, and the case where no > distributions refer to it, does it? > The idea of this API is to find out of a distribution "owns" a file, e.g. is the only distribution that uses it, so it can be safely removed. > Is there any reason to have this method while file_users(path) already > exists? > Its just a helper for uninstallers, but file_users() could probably be enough, I can remove "owns" if people find it confusing, > A new class called ``DistributionDirectories`` is created. It's a > collection of > ``DistributionDirectory`` and ``ZippedDistributionDirectory`` instances. > The constructor takes one optional argument ``use_cache`` set to ``True`` > by > default. > > You forgot to describe the constructor's signature and what it does > exactly. > I'll add that, > > ``EggInfoDirectories`` also provides the following methods besides the > ones > from ``dict``:: > > What is EggInfoDirectories? This is a DistributionDirectories, one more instance I forgot to rename, I'll fix that > > > - ``append(path)`` > > Creates an ``DistributionDirectory`` (or > ``ZippedDistributionDirectory``) > instance for ``path`` and adds it in the mapping. > > - ``load(paths)`` > > Creates and adds ``DistributionDirectory`` (or > ``ZippedDistributionDirectory``) instances corresponding to ``paths``. > > Why are these methods named completely differently although they do almost > the > same thing? Besides, append() makes it look like ordering matters. Does it? > (for a naming suggestion, e.g. load(path) and load_all(paths). Or, > even simpler, load(*paths) or load(paths)) > Right, I'll fix that, > > - ``get_file_users(path)`` -> Iterator of ``Distribution`` (or > ``ZippedDistribution``) instances. > > This method is named file_users in another class. Perhaps the naming should > be > consistent? Right, it used to be get_*, that's a typo. I'll fix it, > > > All these functions use the same global instance of > ``DistributionDirectories`` > to use the cache. > > Is the global instance available to users? No I didn't made it available to avoid concurrency problems, > > > >>> for path, hash, size in dist.get_installed_files():: > ... print '%s %s %d %s' % (path, hash, size) > > There's one too many "%s" here. > Fixing it too, > > > Thanks for your work! > Thanks for the feedback, I'll commit the fixes asap. Tarek
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com