On Sun, Jan 31, 2010 at 11:04, Raymond Hettinger <raymond.hettin...@gmail.com> wrote:
>
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>> ========
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
>
> +1
>
>
>> It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).
>
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
>
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted. And it may allow faster
> start-up times when the zipfile is in the disk cache.

It also eliminates stat calls. I have not seen anyone mention this, but on filesystems where stat calls are expensive (e.g. NFS), the PEP as written is going to increase import cost (and thus startup time, which some people are already incredibly paranoid about). You are now going to shift from a single stat call to check for a bytecode file to two, just in the search phase, *per file check* (remember you need to search for module.py and module/__init__.py). And then you get to repeat all of this during the load process (potentially, depending on how aggressive the loader is with caching).

As others have said, an uncompressed zip file could work here. Or even a file format where the first 4 bytes are the timestamp, followed by chunks of length-of-bytecode|magic|bytecode. That allows for opening the file in append mode to add more bytecode, instead of a zipfile's requirement of rewriting the TOC at the end of the file every time you mutate the file (if I remember the zip file format correctly). The biggest cost in this simple approach would be reading the file in (unless you mmap the thing when possible), since once it is read the data is a bytes object, which means constant-time indexing from chunk to chunk until you find the right magic number. And adding support to differentiate -O bytecode is simply a matter of adding a marker per chunk of bytecode.

And I disagree with the PEP's suggestion that this would be difficult, given the proper file format. For zip files, zipimport already has the read code in C; it would just require the code to write to a zip file. And as for the format I mentioned above, that's dead-simple to implement.

-Brett
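
For illustration, here is a minimal sketch of the append-only format described above: a 4-byte timestamp, then chunks of length-of-bytecode|magic|bytecode, with a per-chunk marker for -O bytecode. The field widths, byte order, and the function names (create_cache, append_bytecode, find_bytecode) are assumptions made up for this sketch; none of them come from the PEP or from any existing implementation.

```python
import struct

# Assumed layout (widths and byte order are illustrative only):
#   file header: 4-byte little-endian source timestamp
#   each chunk:  4-byte bytecode length | 4-byte magic | 1-byte -O marker | bytecode
HEADER = struct.Struct("<I")
CHUNK = struct.Struct("<I4sB")


def create_cache(path, source_mtime):
    """Start a new cache file that holds only the source timestamp."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(source_mtime))


def append_bytecode(path, magic, bytecode, optimized=False):
    """Append one chunk; nothing already in the file is rewritten."""
    with open(path, "ab") as f:
        f.write(CHUNK.pack(len(bytecode), magic, int(optimized)))
        f.write(bytecode)


def find_bytecode(path, source_mtime, magic, optimized=False):
    """Return the matching bytecode, or None if the file is stale or has no match."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < HEADER.size:
        return None
    (stored_mtime,) = HEADER.unpack_from(data, 0)
    if stored_mtime != source_mtime:
        return None  # source changed since the cache was written
    offset = HEADER.size
    # Hop from chunk header to chunk header using the stored lengths.
    while offset + CHUNK.size <= len(data):
        length, chunk_magic, opt = CHUNK.unpack_from(data, offset)
        start = offset + CHUNK.size
        if chunk_magic == magic and bool(opt) == optimized:
            return data[start:start + length]
        offset = start + length
    return None
```

Each interpreter version would call append_bytecode() with its own magic number after compiling, and find_bytecode() with the same magic at import time. A single open and read replaces the per-version stat calls, at the cost of reading the whole file (or mmapping it, which this sketch does not do).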