2009/7/5 Tarek Ziadé <ziade.ta...@gmail.com>:
> Agreed, the zip case was added afterwards, but in practice, the APIs are still
> dealing with files that are *filesystem files* located in a container (e.g. a
> directory or a zip file) located somewhere on the filesystem.
>
> "local" in that case is a flag that means "translate a file path expressed in
> the local filesystem", which makes no sense anymore with zip files. But the
> goal, really, is to be able to point out that two distributions are using the
> very same file.
>
> Right now PEP 376 and the prototype code handle these two real-world use
> cases:
>
> - browsing regular site-packages-like directories
> - browsing site-packages-like directories that are zipped.
>
> For example:
>
> - I have a "packages.zip" file in /var/, which is also in my sys.path.
>   It contains a distribution "foo-1.0" that has the "roman.py" file in
>   its root. So the RECORD file located in "foo-1.0.egg-info" has a line
>   starting with "roman.py,..."
>
> - Then if I install docutils 0.5 as a regular filesystem distribution,
>   "roman.py" will be added in Python's site-packages,
>   and docutils-0.5.egg-info/RECORD will contain "roman.py,..." with
>   the same hash.
>
> The local flag will return these paths:
>
> - /var/packages.zip/roman.py  <--- not a "real" path
> - /usr/local/lib/python2.6/site-packages/roman.py
>
> So removing the docutils distribution will be doable, because these
> paths are different.
>
>> Concrete proposal:
>>
>> get_metadata_files() - returns slash-separated names, relative to the
>> egginfo dir
>> get_metadata_file(path) - path must be slash-separated, relative to
>> the egginfo dir
>>
>> get_installed_files - returns the contents of RECORD unaltered
>> uses(path) - checks if path is in RECORD
>>
>> The latter 2 are not very useful in practice - you can't say anything
>> about entries in different RECORD files, which is likely the real use
>> case you want. Maybe RECORD could have an extra "Location" entry,
>> which determines where it exists globally (this would be the directory
>> to which the filenames were relative, in the case of filesystem-based
>> distributions) and RECORD entries are comparable if the Location
>> values in the 2 RECORD files match. That's a lot more complex - but
>> depending on what use people expect to make of these 2 APIs, it may be
>> justified.
>
> Yes,
> In practice, if you look at my previous example, even if
> "/var/packages.zip/roman.py" isn't a real path, it's enough to compare
> RECORD entries globally.
>
> The "Location" entry you are proposing in that case would be
> "/var/packages.zip".
>
> But do we really need to store it in the RECORD? Or can't we define an
> API that returns two elements:
>
> - the path to the location (in the example: /var/packages.zip or
>   /usr/local/lib/python2.6/site-packages)
> - the path within the location itself (in the example: roman.py)
>
> A concrete proposal would be to take back your proposal, but return
> tuples with the location as the first member,
> e.g. "(location, relative path[s])"

That sounds reasonable. So we can forget the "local" parameter, and
return a tuple:

- absolute location of the container (directory, zipfile or whatever
  containing the egginfo file) as a filesystem path in canonical native
  form (where it's filesystem based) or as an opaque token for the odd
  cases (frozen modules, for example) where a filesystem location
  isn't available.
- entry from the RECORD file, as a slash-separated filename relative to
  the root of the container.

> The code that is comparing paths to see if they are the same can join
> location+relative path[s], while we can provide in a dedicated function
> something to read the content of the file (that would be get_data I guess,
> if I refer to PEP 302)

Unfortunately, get_data loads data files located within a *package*,
using a name relative to the package directory. You can't get at the
metadata of a *distribution* like that.

But if you're using get_installed_files(), why would you then want to
read the files? What exactly would you *use* get_installed_files for
which would then leave you needing to read the files? If it's to check
they haven't changed (by comparing md5 values) you're doing that to
uninstall, so that's the responsibility of the uninstall function.

Again, it's a question of what is a public API, and what is the use
case it's designed for.

I'm currently writing a SQLite importer, which will allow me to store
"files" in any sort of database tables I want, so I can build in some
nice pathological behaviour. That should tease out some awkward corner
cases :-)

Paul
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
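
For illustration only, here is a rough sketch of how the "(location,
relative path)" tuples discussed in this message might be produced and
compared. It is not the PEP 376 prototype: the names installed_files()
and shared_files(), the on_filesystem flag, and the RECORD rows (hash
and size values) are all assumptions made up for the example.

```python
import csv
import os
import posixpath
from collections import defaultdict


def installed_files(container, record_lines, on_filesystem=True):
    """Return (location, relative_path) tuples for one distribution.

    Hypothetical helper: `container` is the directory or zip file that
    holds the egg-info data, `record_lines` are the raw CSV lines of its
    RECORD file. For odd cases such as frozen modules, pass
    on_filesystem=False and `container` is kept as an opaque token.
    """
    if on_filesystem:
        # Canonical native form, so equal containers compare equal.
        location = os.path.normcase(os.path.realpath(container))
    else:
        location = container
    entries = []
    for row in csv.reader(record_lines):
        if row:  # skip blank lines
            # First column is the slash-separated path inside the container.
            entries.append((location, posixpath.normpath(row[0])))
    return entries


def shared_files(distributions):
    """Map each (location, relative_path) claimed by more than one
    distribution to the set of distribution names claiming it."""
    owners = defaultdict(set)
    for name, entries in distributions.items():
        for entry in entries:
            owners[entry].add(name)
    return {entry: names for entry, names in owners.items() if len(names) > 1}


# The example from the thread: same relative name, different containers.
foo = installed_files("/var/packages.zip", ["roman.py,md5=abc,10"])
docutils = installed_files(
    "/usr/local/lib/python2.6/site-packages", ["roman.py,md5=abc,10"]
)
print(shared_files({"foo-1.0": foo, "docutils-0.5": docutils}))  # → {}
```

Because the two locations differ, nothing is reported as shared, which
matches the point made earlier that removing docutils stays safe. Two
distributions installed into the same container and listing the same
relative path would show up in the result.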