Re: [Python-Dev] PEP 3147: PYC Repository Directories
Henning von Bargen wrote:
> I like the idea of the PEP. On the other hand, I dislike using directories for it. Others have explained enough reasons for why creating many directories is a bad idea; and there may be other reasons (file-system limits for number of directories, problems when the directories are located on the network).

Actually, this is the first post I've seen noting objective problems with the use of a subdirectory. The others were just a subjective difference in perspective that saw subdirectory clutter as somehow being worse than file clutter. Specific examples of filesystems with different limits on file and subdirectory counts and network filesystems where opening a subdirectory can result in a significant speed impact would be very helpful.

> The solution is so obvious: Why not use a .pyr file that is internally a zip file?

Agreed this should be discussed in the PEP, but one obvious problem is the speed impact. Picking up a file from a subdirectory is going to introduce less overhead than unpacking it from a zipfile. That said, using a non-compressed zipfile would make a lot more sense than inventing our own archive format if a subdirectory is eventually deemed unsuitable.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
>> Linux distributions such as Ubuntu [2]_ and Debian [3]_ provide more than one Python version at the same time to their users. For example, Ubuntu 9.10 Karmic Koala can install Python 2.5, 2.6, and 3.1, with Python 2.6 being the default. In order to ease the burden on operating system packagers for these distributions, the distribution packages do not contain Python version numbers [4]_; they are shared across all Python versions installed on the system. Putting Python version numbers in the packages would be a maintenance nightmare, since all the packages - *and their dependencies* - would have to be updated every time a new Python release was added or removed from the distribution. Because of the sheer number of packages available, this amount of work is infeasible.
>
> As a non-Debian user (I'm a Gentoo user), the above doesn't enlighten me, even after skimming the referenced document. Perhaps an example would be helpful?

I think the basic question is: how do you get stuff into /usr/lib/python2.6/site-packages/Pyrex? One option would be to have a Debian package python26-pyrex. Then you would also need a python25-pyrex package and a python27-pyrex package, all essentially containing the very same files (but installed into different directories).

What they want is a single python-pyrex package that automatically works for all Python versions - even those that aren't yet installed (i.e. install python-pyrex first, and Python 2.7 later, and python-pyrex should be available). Having a single directory in sys.path for all Python versions currently doesn't work, as the pyc files for each version would conflict.

The current solution consists (for package installation) of
a) installing the files in a single place
b) creating a directory hierarchy in each Python's site-package
c) symlinking all .py files into this directory hierarchy
d) byte-compiling all .py files in the hierarchy

For installation of new Python versions, they need to
a) walk over the list of installed Python packages
b) for each one, repeat steps b..d from above

With the PEP in place, for pure-Python packages, they could
a) have a system wide directory for pure-Python packages, and
b) arrange that directory to appear on sys.path for all Python versions

On package installation, they then could
a) install the files in that system-wide directory
b) for each Python version, run byte-code compilation of the new package

On Python installation, they would
a) byte-compile the entire directory.

Alternatively, to support packages that don't work with all Python versions, they could continue to use symlinking, but restrict that onto the top directories of each package (i.e. not create a directory hierarchy in site-packages).

> (FYI, Gentoo just installs the pyc files into each of the installed Python's site-packages that is supported by the package in question...disk space is relatively cheap.)

I suppose Gentoo also installs .py files into each site-packages? How does it deal with a Python installation that happens after the package installation?

>> * Would a moratorium on byte code changes, similar to the language moratorium described in PEP 3003 [16]_ be a better approach to pursue, and would that solve the problem for vendors? At the time of this writing, PEP 3003 is silent on the issue.
>
> Unless the bytecode change moratorium was permanent (unlikely), how would this solve the vendor issues?

A vendor strategy might be to not store .pyc files on disk for some Python versions (i.e. those that differ from the rest). Assume that 3.2, 3.3, 3.4 use the same pyc magic, and 3.5, 3.6, 3.7 also do. Then, at any point in time, one of the Python versions is the system python in Debian. This is the one that decides the official .pyc magic. The other Python installations on the same system can either reuse the existing .pyc files (if the magic matches), or not, in which case they have to recompile (to memory) the Python source on every startup.

The longer the moratorium, the less of a problem this could cause for users.

Regards,
Martin
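The reuse-or-recompile decision described above comes down to comparing the magic number stored at the start of a .pyc file with the one the running interpreter expects. A minimal sketch, assuming a current CPython where `importlib.util.MAGIC_NUMBER` is available (at the time of this thread the equivalent was `imp.get_magic()`); the helper name `pyc_is_reusable` is hypothetical:

```python
import importlib.util


def pyc_is_reusable(pyc_path):
    """Return True if the file at pyc_path starts with this interpreter's magic."""
    magic = importlib.util.MAGIC_NUMBER
    with open(pyc_path, "rb") as f:
        # A .pyc file begins with the magic number of the writing interpreter
        return f.read(len(magic)) == magic
```

An interpreter for which this check fails would fall back to compiling the source in memory, as in the scenario above.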
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Nick Coghlan ncogh...@gmail.com writes:
> Actually, this is the first post I've seen noting objective problems with the use of a subdirectory. The others were just a subjective difference in perspective that saw subdirectory clutter as somehow being worse than file clutter.

Here's another one, then: The directory where the source code files reside is often a working area for the developer. The directory structure is an essential tool of organising the project; the presence of an unwanted directory is clutter to this purpose, in a way that the presence of an unwanted file is not.

--
 \     “Alternative explanations are always welcome in science, if |
  `\   they are better and explain more. Alternative explanations that |
_o__)  explain nothing are not welcome.” —Victor J. Stenger, 2001-11-05 |
Ben Finney
Re: [Python-Dev] PEP 3147: PYC Repository Directories
> Agreed this should be discussed in the PEP, but one obvious problem is the speed impact. Picking up a file from a subdirectory is going to introduce less overhead than unpacking it from a zipfile.

There is also the issue of race conditions with multiple simultaneous accesses. The original format for the PEP had race conditions for multiple simultaneous writers; ZIP will also have race conditions for concurrent readers/writers (as any new writer will have to overwrite the central directory, making the zip file temporarily unavailable - unless they copy it, in which case we are back to writer/writer races).

Regards,
Martin
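For comparison, a single per-file cache entry can be replaced without readers ever seeing a partial file, by writing to a temporary name and renaming it into place. The sketch below illustrates that general technique, assuming a filesystem where a rename within one directory is atomic; `write_atomic` is a hypothetical helper, not something proposed in the PEP.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write data to path so that readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays
    # on the same filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Remove the temporary file if anything went wrong
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A zip archive cannot be updated per entry in this way, because every writer has to touch the shared central directory, which is the problem described above.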
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Barry Warsaw barry at python.org writes:
> Putting Python version numbers in the packages would be a maintenance nightmare, since all the packages - *and their dependencies* - would have to be updated every time a new Python release was added or removed from the distribution. Because of the sheer number of packages available, this amount of work is infeasible.

How is this infeasible exactly? Wouldn't it be an easy target for scripting?

> As an example of the problem, a common (though fragile) Python idiom for locating data files is to do something like this::

I don't think this is fragile. It is the most robust I can think of, but perhaps I'm missing another solution :) (well, apart from pkg_resources, that is)

> The implementation of this PEP would have to ensure that the same directory level is returned from `__file__` as it does without the `pyr` directory, so that the common idiom above continues to work::
>
>     import foo
>     foo.__file__
>     'foo.pyr'

Would things like exec() work on the given directory?

> An earlier version of this PEP described fat Python byte code files. These files would contain the equivalent of multiple `pyc` files in a single `pyf` file, with a lookup table keyed off the appropriate magic number. This was an extensible file format so that the first 5 parallel Python implementations could be supported fairly efficiently, but with extension lookup tables available to scale `pyf` byte code objects as large as necessary.

As Martin said, this creates concurrent access problems, when several interpreters modify the file simultaneously.

> * What about `py` source files that are compatible with most but not all installed Python versions. We might need a way to say this py file should be hidden from Python versions X.Y or earlier.

-1. This is the distributor's job, not Python's. If you want you can create dummy pyc's in your pyr that will raise an ImportError or a NotImplementedError with some versions of Python. But I don't think Python should have a stake in this.

> * Would a moratorium on byte code changes, similar to the language moratorium described in PEP 3003 [16]_ be a better approach to pursue, and would that solve the problem for vendors? At the time of this writing, PEP 3003 is silent on the issue.

-1. Bytecode is an internal detail; besides, it is vital to be able to evolve it.

Regards

Antoine.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 31.01.2010 07:29, Nick Coghlan wrote:
> Vitor Bosshard wrote:
>>> There is no one-to-one correspondence between Python version and pyc magic numbers. Different runtime options may change the magic number and different versions may reuse a magic number
>> Good point. Runtime options would need to change the version (e.g. foo.25U.py), and versions that reuse magic numbers would be redundantly written to disk. However, the underlying issue as I see it is that the magic value is an implementation detail that should not be exposed.
>
> I think this is actually a good point - while there needs to be a shared namespace to allow different Python implementations to avoid stepping on each other's toes, CPython's bytecode compatibility magic number may not be the best choice as the distinguishing identifier. It may be better to give the magic numbers a meaningful corresponding string, such that the filenames would be more like:
>
>   foo.py
>   foo.pyr/
>     cpython-25.pyc
>     cpython-25U.pyc
>     cpython-27.pyc
>     cpython-27U.pyc
>     cpython-32.pyc
>     unladen-011.pyc
>     wpython-11.pyc

+1. It should be quite easy to assign a new name every time the magic number is updated.

> If we don't change the bytecode for a given Python version, then the name of the bytecode format used wouldn't change either.

That would be the only remaining complaint for casual users. (Why doesn't Python compile my file for 2.8?)

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
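The naming idea can be sketched as a small function that builds a cache file name from an implementation name and a version tag rather than from the raw magic number. The function and its parameters are hypothetical; the example names follow the listing above. (On current CPython, `sys.implementation.cache_tag` exposes a tag of this kind.)

```python
def cache_file_name(implementation, version_tag, ext=".pyc"):
    """Build a name such as 'cpython-27.pyc' from its parts (illustrative only)."""
    return "{}-{}{}".format(implementation, version_tag, ext)


# A runtime option such as the "U" variant becomes part of the tag
names = [cache_file_name("cpython", tag) for tag in ("25", "25U", "27")]
```

Keeping the tag readable means a user can tell at a glance which interpreter a given file belongs to, without knowing any magic values.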
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 31.01.2010 05:18, Ben Finney wrote:
> Nick Coghlan ncogh...@gmail.com writes:
>> It won't be cluttered with subfolders - you will have at most one .pyr per source .py file.
>
> If that doesn't meet your threshold of “cluttered with subfolders”, I'm at a loss for words to think where that threshold might be. It meets, and exceeds by a long shot, my threshold for subfolder clutter. Even adding a *single* subfolder in arbitrary directories is an obnoxious act for a program to do automatically, and is not to be undertaken lightly. It might be justified in this case, but that doesn't mean we should open the gates to even more clutter.

Then why did Subversion choose to follow the CVS way and create a subdirectory in each versioned directory? IMO, this is much more annoying given the alternative of a single .hg/.bzr/whatever directory. For .pyc vs .pyr, you didn't have the alternative of putting all that stuff in one directory now.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 31.01.2010 10:21, Ben Finney wrote:
> Nick Coghlan ncogh...@gmail.com writes:
>> Actually, this is the first post I've seen noting objective problems with the use of a subdirectory. The others were just a subjective difference in perspective that saw subdirectory clutter as somehow being worse than file clutter.
>
> Here's another one, then: The directory where the source code files reside is often a working area for the developer. The directory structure is an essential tool of organising the project; the presence of an unwanted directory is clutter to this purpose, in a way that the presence of an unwanted file is not.

At least to me, this does not explain why an unwanted (why unwanted? If it's unwanted, set PYTHONDONTWRITEBYTECODE=1) directory is worse than an unwanted file.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 1:54 PM, Hanno Schlichting ha...@hannosch.eu wrote:
> I'd be a big +1 to using a single .pyr directory per source directory.

I don't know whether I am in favour of using a single pyr folder or not, but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross hodgestar+python...@gmail.com wrote:
> I don't know whether I am in favour of using a single pyr folder or not, but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr.

Do you have any specific reason for that? Using the leading dot notation is an established pattern to hide non-essential information from directory views. What makes this non-applicable in this situation and a custom Python notation better?

Hanno
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 2:13 PM, Hanno Schlichting ha...@hannosch.eu wrote:
> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross hodgestar+python...@gmail.com wrote:
>> I don't know whether I am in favour of using a single pyr folder or not, but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr.
>
> Do you have any specific reason for that?

I'd rather not have the confusion caused by stray .pyc files multiplied by having said stray files buried in a hidden folder.

> Using the leading dot notation is an established pattern to hide non-essential information from directory views. What makes this non-applicable in this situation and a custom Python notation better?

Something being an established pattern doesn't mean it's a good idea. If we're going with a by-convention argument anyway, surely Python conventions should take precedence -- this is *Python* after all. :)

On the whole I'm against hiding folders because what information is non-essential varies from situation to situation. People (including me) regularly screw up dealing with .svn folders by including them in source tarballs, copying parts of one working copy into another, etc.

Schiavo
Simon
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Georg Brandl wrote:
> On 31.01.2010 07:18, Nick Coghlan wrote:
>> Ben Finney wrote:
>>> Could we instead have a single subdirectory for each tree of module packages, keeping them tidily out of the way of the source files, while making them located just as deterministically::
>>
>> Not easily. With the scheme currently proposed in the PEP, setting a value for __file__ which is both reasonably accurate and backwards compatible with existing file manipulation techniques is straightforward: just use the name of the cache directory.
>
> Not really -- much of the code I've seen that tries to guess the source file name from a __file__ value just does something like this:
>
>     if fname.lower().endswith(('.pyc', '.pyo')):
>         fname = fname[:-1]
>
> That's not compatible with using .pyr, either.

That's not the backwards compatibility I'm talking about - I'm talking about the more common one mentioned in the PEP where __file__ is used with os.path.split to locate adjacent resource files.

Agreed that even the .pyr idea causes backwards compatibility problems with code like the above (fortunately we can fix the stdlib instances ourselves).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
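To make the compatibility concern concrete, here is a small sketch of the suffix-stripping idiom quoted above, wrapped in a hypothetical helper named `guess_source`. It recovers the source name from a `.pyc` or `.pyo` path but leaves a `.pyr` name untouched, which is why code of this kind would need updating.

```python
def guess_source(fname):
    """Guess the source file from a __file__ value using the quoted idiom."""
    if fname.lower().endswith(('.pyc', '.pyo')):
        # Drop the trailing 'c' or 'o' to get back to '.py'
        fname = fname[:-1]
    return fname
```

With this helper, `guess_source('foo.pyc')` gives `'foo.py'`, while `guess_source('foo.pyr')` comes back unchanged as `'foo.pyr'`.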
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Georg Brandl wrote:
> Then why did Subversion choose to follow the CVS way and create a subdirectory in each versioned directory? IMO, this is much more annoying given the alternative of a single .hg/.bzr/whatever directory. For .pyc vs .pyr, you didn't have the alternative of putting all that stuff in one directory now.

I actually like the svn/cvs way, since each directory in the working copy is self-contained. The DVCS way means that you can't tell just by looking at a directory whether it is part of a working copy or not - there is a non-local element affecting you at a higher point in the filesystem hierarchy.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
Re: [Python-Dev] PEP 3147: PYC Repository Directories
2010/1/31 Georg Brandl g.bra...@gmx.net:
>>   foo.py
>>   foo.pyr/
>>     cpython-25.pyc
>>     cpython-25U.pyc
>>     cpython-27.pyc
>>     cpython-27U.pyc
>>     cpython-32.pyc
>>     unladen-011.pyc
>>     wpython-11.pyc
>
> +1. It should be quite easy to assign a new name every time the magic number is updated.
>
>> If we don't change the bytecode for a given Python version, then the name of the bytecode format used wouldn't change either.
>
> That would be the only remaining complaint for casual users. (Why doesn't Python compile my file for 2.8?)

I think it's preferable to have a redundant copy of the compiled file floating around rather than creating confusion as to which one each python interpreter uses.

Optimizing disk space (and marginal compile time) is not worth the mental overhead this would introduce. Better keep it as clear and simple as possible, i.e. create different .pyc files even if the bytecode doesn't change between releases.

Vitor
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Georg Brandl wrote:
> +1. Having a single (visible) __pyr__ directory is much less clutter than multiple .pyc files anyway. Also, don't forget Windows users, for whom the dot convention doesn't mean anything.

I must admit I quite like the __pyr__ directory approach as well. Since the interpreter knows the suffix it is looking for, names shouldn't conflict. Using a single directory allows the name to be less cryptic, too (e.g. __pycache__).

That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled).

With that approach, a structure given just a run under 2.7 and one under 2.7 with -O might look like:

  package/
    __init__.py
    foo.py
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      foo.cpython-27.pyc
      foo.cpython-27.pyo
    subpackage/
      __init__.py
      bar.py
      __pycache__/
        __init__.cpython-27.pyc
        __init__.cpython-27.pyo
        bar.cpython-27.pyc
        bar.cpython-27.pyo

__file__ would always point to the source files
__file_cached__ would always point to the relevant compiled file (either pre-existing or newly created)

To use the final step of importing package.foo as an example (ignoring the extra backwards compatibility steps for the existing scheme):

1. Check package dir listing for __pyr__
2. If it exists, check it for a foo.cookie.ext (where the interpreter will always know exactly which cookie and extension it wants)
3. As a CPython implementation detail, use the cookie inside the file to double-check correctness
4. If all good, run with that cached file
5. Otherwise, check package dir for foo.py
6. If the source file exists, create the cached bytecode file inside the __pyr__ (if this fails, just run from RAM with __file_cached__ = None)
7. Run with the newly compiled source file
8. Otherwise report ImportError.

This doesn't seem to have any significant disadvantages relative to the subdirectory-per-source-file approach (and the major advantage of creating just a single subdirectory).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
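The lookup order in steps 1 to 8 above can be sketched as follows. This is a simplified illustration, not the import machinery: the function name, the `tag` and `ext` parameters and the default directory name `__pycache__` are assumptions, and the cookie check in step 3 and the compilation in step 6 are left out.

```python
import os


def locate_module(package_dir, name, tag, ext=".pyc", cache_dir="__pycache__"):
    """Return (source_path, cached_path) for a module, following the steps above.

    cached_path is None when no cached file exists yet. ImportError is raised
    when neither a cached file nor a source file is found.
    """
    source = os.path.join(package_dir, name + ".py")
    cached = os.path.join(package_dir, cache_dir,
                          "{}.{}{}".format(name, tag, ext))
    if os.path.isfile(cached):      # steps 1-2: the cache directory holds the file
        return source, cached       # step 4: run with the cached file
    if os.path.isfile(source):      # step 5: fall back to the source file
        return source, None         # steps 6-7: the caller compiles and runs it
    raise ImportError("No module named " + name)  # step 8
```

The first element of the result plays the role of `__file__` and the second that of the proposed `__file_cached__`.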
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Vitor Bosshard wrote:
> Optimizing disk space (and marginal compile time) is not worth the mental overhead this would introduce. Better keep it as clear and simple as possible, i.e. create different .pyc files even if the bytecode doesn't change between releases.

Yeah, makes sense. Given the level of fiddling with it these days, it may be a very long while before the issue actually comes up anyway :)

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
Re: [Python-Dev] PEP 3147: PYC Repository Directories
2010/1/31 Nick Coghlan ncogh...@gmail.com:
> Georg Brandl wrote:
>> Then why did Subversion choose to follow the CVS way and create a subdirectory in each versioned directory? IMO, this is much more annoying given the alternative of a single .hg/.bzr/whatever directory. For .pyc vs .pyr, you didn't have the alternative of putting all that stuff in one directory now.
>
> I actually like the svn/cvs way, since each directory in the working copy is self-contained. The DVCS way means that you can't tell just by looking at a directory whether it is part of a working copy or not - there is a non-local element affecting you at a higher point in the filesystem hierarchy.

Exactly. How would you define where the pyr folder goes? At the root of a package? What if I delete the __init__.py file there? Will the existing pyr folder be orphaned and a new one created in each subfolder? Unlike VCS working copies, the package / module / script hierarchy is not formally defined in python.

Having one single pyr (or __pycache__ or whatever it's called) subfolder per folder is an easy to understand, solid solution.

I'm also in favor of making this folder non-hidden. Unlike a .git or .hg folder, it impacts the code execution itself. Think of the newbies! .pyc files already lead to heisenbugs for inexperienced developers (e.g. importing a lingering pyc instead of an intended module with the same name further down sys.path), and they're in plain sight.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 31.01.2010 14:02, Nick Coghlan wrote:
> Georg Brandl wrote:
>> Then why did Subversion choose to follow the CVS way and create a subdirectory in each versioned directory? IMO, this is much more annoying given the alternative of a single .hg/.bzr/whatever directory. For .pyc vs .pyr, you didn't have the alternative of putting all that stuff in one directory now.
>
> I actually like the svn/cvs way, since each directory in the working copy is self-contained. The DVCS way means that you can't tell just by looking at a directory whether it is part of a working copy or not - there is a non-local element affecting you at a higher point in the filesystem hierarchy.

Yes, but is it really so common that you need to know if the directory is part of a working copy? Usually either you already know, or it's unnecessary to know. Apart from that it's trivial to find out using hg root etc.

In contrast, those .svn directories are a pain to work with when copying or moving stuff, either out of a working copy (they are overlooked) or between working copies (svn says strange things when you try to do svn add).

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, 31 Jan 2010 13:13:24 +0100, Georg Brandl g.bra...@gmx.net wrote:
> On 31.01.2010 13:03, Simon Cross wrote:
>> On Sun, Jan 31, 2010 at 1:54 PM, Hanno Schlichting ha...@hannosch.eu wrote:
>>> I'd be a big +1 to using a single .pyr directory per source directory.
>>
>> I don't know whether I am in favour of using a single pyr folder or not, but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr.
>
> And to come complete with standard library functions to find the corresponding .py for a .pyc and .pyc for a .py.

+1

--
R. David Murray www.bitdance.com
Business Process Automation - Network/Server Management - Routers/Firewalls
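For readers on current Python versions, helper functions of the kind requested here are available in `importlib.util`. A short sketch of the round trip, assuming Python 3.4 or later; the path is an arbitrary example and does not need to exist:

```python
import importlib.util
import os

source = os.path.abspath(os.path.join("package", "foo.py"))

# .py -> path of the cached bytecode file for the running interpreter
cached = importlib.util.cache_from_source(source)

# cached bytecode path -> the .py it was compiled from
recovered = importlib.util.source_from_cache(cached)
```

Both functions work on path strings only, so they can be used by tools that need to map between the two names without importing anything.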
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, 31 Jan 2010 09:50:16 +0100, Martin v. Löwis mar...@v.loewis.de wrote:
>>> Linux distributions such as Ubuntu [2]_ and Debian [3]_ provide more than one Python version at the same time to their users. For example, Ubuntu 9.10 Karmic Koala can install Python 2.5, 2.6, and 3.1, with Python 2.6 being the default. In order to ease the burden on operating system packagers for these distributions, the distribution packages do not contain Python version numbers [4]_; they are shared across all Python versions installed on the system. Putting Python version numbers in the packages would be a maintenance nightmare, since all the packages - *and their dependencies* - would have to be updated every time a new Python release was added or removed from the distribution. Because of the sheer number of packages available, this amount of work is infeasible.
>>
>> As a non-Debian user (I'm a Gentoo user), the above doesn't enlighten me, even after skimming the referenced document. Perhaps an example would be helpful?
>
> I think the basic question is: how do you get stuff into /usr/lib/python2.6/site-packages/Pyrex? One option would be to have a Debian package python26-pyrex. Then you would also need a python25-pyrex package and a python27-pyrex package, all essentially containing the very same files (but installed into different directories).
>
> What they want is a single python-pyrex package that automatically works for all Python versions - even those that aren't yet installed (i.e. install python-pyrex first, and Python 2.7 later, and python-pyrex should be available). Having a single directory in sys.path for all Python versions currently doesn't work, as the pyc files for each version would conflict.
>
> The current solution consists (for package installation) of
> a) installing the files in a single place
> b) creating a directory hierarchy in each Python's site-package
> c) symlinking all .py files into this directory hierarchy
> d) byte-compiling all .py files in the hierarchy
>
> For installation of new Python versions, they need to
> a) walk over the list of installed Python packages
> b) for each one, repeat steps b..d from above
>
> With the PEP in place, for pure-Python packages, they could
> a) have a system wide directory for pure-Python packages, and
> b) arrange that directory to appear on sys.path for all Python versions
>
> On package installation, they then could
> a) install the files in that system-wide directory
> b) for each Python version, run byte-code compilation of the new package
>
> On Python installation, they would
> a) byte-compile the entire directory.
>
> Alternatively, to support packages that don't work with all Python versions, they could continue to use symlinking, but restrict that onto the top directories of each package (i.e. not create a directory hierarchy in site-packages).

Excellent, thank you. IMO this explanation should go in the PEP.

By the way, the part that caused me the most confusion in the language in the PEP was the emphasized *and their dependencies*, as if a package having dependencies somehow turned the problem into a factorial explosion. But there seems to be nothing special, according to your explanation, about dependencies in this scheme.

>> (FYI, Gentoo just installs the pyc files into each of the installed Python's site-packages that is supported by the package in question...disk space is relatively cheap.)
>
> I suppose Gentoo also installs .py files into each site-packages?

Arg. My fingers added the 'c' without my mind getting involved, apparently. I meant that the .py is installed directly in each site-packages.

> How does it deal with a Python installation that happens after the package installation?

There's a tool that runs through all installed python packages and does the install and byte compile (basically, reinstalls the package for the new Python version).

Gentoo doesn't have the multiple-os-packages-per-Python-version problem, since it installs from source. It seems like it would be simple enough to enhance the os packaging systems to allow the install path to be specified at install time, if that really is the only difference between the package versions. And a script that runs through all the installed python packages and installs them for a new Python version when a new version is installed should be as easy for other distributions as it is for Gentoo. That would also mean that dependencies on the Python version would be handled by the packaging system: it would refuse to install a given package if that package didn't support the specified Python version.

Or is something missing from my understanding? If not, I think the motivation section should address why the PEP is a better idea than improving the os packaging systems as I've suggested. (The os vendors are going to have to change details of their packaging systems if the PEP is accepted, so it's not as if the
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, 31 Jan 2010 18:46:54 +1000, Nick Coghlan wrote: Actually, this is the first post I've seen noting objective problems with the use of a subdirectory. The others were just a subjective difference in perspective that saw subdirectory clutter as somehow being worse than file clutter. Specific examples of filesystems with different limits on file and subdirectory counts and network filesystems where opening a subdirectory can result in a significant speed impact would be very helpful. I disagree. Since the proposed scheme is optional and disabled by default, this is only a problem for Debian and Ubuntu to tackle. It is none of our (CPython) business. I expect nobody outside of a couple of Linux distros will use this feature anyway (which is why, regardless of the slight ugliness of the proposal, I am not against it). Regards Antoine.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin wrote: I have a couple bikesheddy or "why didn't you do this" comments. I'll be perfectly satisfied with an answer or a line in the pep. 1. Why the -R flag? It seems like this is a uniform improvement, so it should be the default. Have faith in your design! ;-) -1 for making it a default. It is definitely ugly and useless for most cases. It is fine as long as it is optional and merely used by the Debian/Ubuntu installers. cheers Antoine.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 5:00 AM, Martin v. Löwis mar...@v.loewis.de wrote: Agreed this should be discussed in the PEP, but one obvious problem is the speed impact. Picking up a file from a subdirectory is going to introduce less overhead than unpacking it from a zipfile. There is also the issue of race conditions with multiple simultaneous accesses. The original format for the PEP had race conditions for multiple simultaneous writers; ZIP will also have race conditions for concurrent readers/writers (as any new writer will have to overwrite the central directory, making the zip file temporarily unavailable - unless they copy it, in which case we are back to writer/writer races). Since a pyc file is just a marshaled code object prefixed with a magic number and mtime, why not change the structure of the pyc (maybe calling it a pyr file) to store multiple code objects in a marshaled dictionary? For example: { 'MAGIC': (mtime, code_obj) } This would eliminate the read-time race condition but still potentially allow for a write-time race condition if locking isn't used. The benefit of this approach is that it is no less clear than pyc is today and doesn't result in n * versions_of_python pyc files. There is still the overhead of unmarshaling the file to check for a code object that matches your version. If unmarshaling the entire file each time is problematic, an on-disk format with a short TOC at the beginning followed by the marshaled data would likely solve the issue. -- Michael E. Crute http://mike.crute.org It is a mistake to think you can solve any major problem just with potatoes. --Douglas Adams
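For concreteness, the marshaled-dictionary idea above might look roughly like the sketch below. This is only an illustration, not anything the PEP specifies: the function names are made up, and it assumes each code object is stored as already-marshaled bytes, so that an interpreter only unmarshals the payload filed under its own magic number.

```python
import importlib.util
import marshal

# This interpreter's bytecode magic number (a 4-byte string).
MAGIC = importlib.util.MAGIC_NUMBER


def add_entry(table, source, filename, mtime):
    """Compile 'source' and file it under this interpreter's magic number."""
    code = compile(source, filename, "exec")
    # Assumption: the payload is kept as opaque bytes, so other interpreter
    # versions never have to unmarshal a code object they cannot read.
    table[MAGIC] = (mtime, marshal.dumps(code))
    return table


def dump_table(table):
    """Serialize the whole {magic: (mtime, payload)} table."""
    return marshal.dumps(table)


def load_entry(blob, magic=MAGIC):
    """Return (mtime, code) for 'magic', or None if there is no entry."""
    table = marshal.loads(blob)  # the entire table is read on every lookup
    entry = table.get(magic)
    if entry is None:
        return None
    mtime, payload = entry
    return mtime, marshal.loads(payload)
```

Note that load_entry() has to unmarshal the whole table before it can look anything up, which is the overhead mentioned above, and that nothing here protects two writers from overwriting each other's table.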
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Vitor Bosshard wrote: 2010/1/31 Georg Brandl g.bra...@gmx.net:

    foo.py
    foo.pyr/
        cpython-25.pyc
        cpython-25U.pyc
        cpython-27.pyc
        cpython-27U.pyc
        cpython-32.pyc
        unladen-011.pyc
        wpython-11.pyc

+1. It should be quite easy to assign a new name every time the magic number is updated. If we don't change the bytecode for a given Python version, then the name of the bytecode format used wouldn't change either. That would be the only remaining complaint for casual users. (Why doesn't Python compile my file for 2.8?) I think it's preferable to have a redundant copy of the compiled file floating around rather than creating confusion as to which one each python interpreter uses. Optimizing disk space (and marginal compile time) is not worth the mental overhead this would introduce. Better keep it as clear and simple as possible, i.e. create different .pyc files even if the bytecode doesn't change between releases. Additionally, it has been mentioned that the magic number might change with different builds of alpha releases, but that shouldn't matter because the existing .pyc files don't show the magic number anyway!
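As a rough sketch of how an interpreter could derive its file name in the layout above: the helper names below are hypothetical, and the tag is assumed to be the implementation name plus major and minor version (the "U" variants in the listing are not covered).

```python
import os
import sys


def default_tag():
    """Build a tag such as 'cpython-27' from the running interpreter."""
    major, minor = sys.version_info[:2]
    return "%s-%d%d" % (sys.implementation.name, major, minor)


def pyr_path(source_path, tag=None):
    """Map pkg/foo.py to pkg/foo.pyr/<tag>.pyc (hypothetical layout)."""
    if tag is None:
        tag = default_tag()
    base, _ext = os.path.splitext(source_path)
    return os.path.join(base + ".pyr", tag + ".pyc")
```

With a tag per release, every interpreter writes its own file even when the bytecode format did not change, which matches the "keep it clear and simple" preference above.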
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 8:34 AM, Nick Coghlan ncogh...@gmail.com wrote: That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled). +1 for this, it seems to be what most people want anyway, given the code that munges the .pyc back to the .py. I bet this change would break very little code. Reid
Re: [Python-Dev] PEP 3147: PYC Repository Directories
By the way, the part that caused me the most confusion in the language in the PEP was the emphasized *and their dependencies*, as if a package having dependencies somehow turned the problem into a factorial explosion. But there seems to be nothing special, according to your explanation, about dependencies in this scheme. For regular (forward) dependencies, there is indeed nothing special to consider - they would have to exist in all versions. In practice, this can be (and was) problematic: python-zope.sendmail depends on python-pkg-resources, python-transaction, python-zope, and 10 other things. Before you could start providing python27-zope.sendmail, all of these dependencies would have to become available in a 2.7 version first, meaning that ten other Debian developers need to act before you can. With the failure rate of Debian developers (who go as often on holidays as any other volunteer), upgrading to a new Python release could often take many months. Now they have that new grand scheme involving tons of symbolic links; actually, they have two of them (python-central and python-support). While this is perfect in theory, it's not very robust (Barry can probably better report all the failure modes). I guess (without knowing) that this is really what triggered the PEP. It seems like it would be simple enough to enhance the os packaging systems to allow the install path to be specified at install time, if that really is the only difference between the package versions. And a script that runs through all the installed python packages and installs them for a new Python version when a new version is installed should be as easy for other distributions as it is for Gentoo. However, it's also unacceptable. I can't cite the exact piece of Debian policy, but I'm fairly sure that build activities are not allowed at installation time. So actually running setup.py files is out of the question.
Users who want such a thing would have to switch to Gentoo; Debian users just want it to work :-) (The os vendors are going to have to change details of their packaging systems if the PEP is accepted, so it's not as if the PEP saves the vendors work.) Again, I'm a little bit unclear on the motivation, also. I think it mostly is "after years of experimentation, we have run out of ideas how to solve all related problems simultaneously without changing Python, so let's look for options that do involve changing Python". If you *really* want a list of all the simultaneous problems that need to be solved, and an explanation of why each individual solution has flaws, prepare for this conversation to take a few more weeks. Regards, Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 6:04 AM, Jeffrey Yasskin jyass...@gmail.com wrote: 1. Why the -R flag? It seems like this is a uniform improvement, so it should be the default. Have faith in your design! ;-) +1 for a single strategy that is used in all cases. The current solution could be phased out across multiple releases, but in the end there should be a single approach and no flag. Otherwise some code and tools will only support one of the approaches, especially if this is seen as something only a minority of Linux distributions uses. Hanno
Re: [Python-Dev] PEP 3147: PYC Repository Directories
This would eliminate the read-time race condition but still potentially allow for a write-time race condition if locking isn't used. The benefit of this approach is that it is no less clear than pyc is today and doesn't result in n * versions_of_python pyc files. There is still the overhead of unmarshaling the file to check for a code object that matches your version. If unmarshaling the entire file each time is problematic an on-disk format with a short TOC at the beginning followed by the marshaled data would likely solve the issue. This was actually what the first draft proposed. Try specifying it in full detail, and you'll find out that a) it is *really* complicated, and b) locking is really tricky to achieve. Regards, Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Reid Kleckner wrote: On Sun, Jan 31, 2010 at 8:34 AM, Nick Coghlan ncogh...@gmail.com wrote: That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled). +1 for this, it seems to be what most people want anyway, given the code that munges the .pyc back to the .py. I bet this change would break very little code. Isn't .pyc just an optimisation for performance reasons? If so, then the user is more interested in the .py file.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote: Abstract This PEP describes an extension to Python's import mechanism which improves sharing of Python source code files among multiple installed different versions of the Python interpreter. +1 It does this by allowing many different byte compilation files (.pyc files) to be co-located with the Python source file (.py file). It would be nice if all the compilation files could be tucked into one single zipfile per directory to reduce directory clutter. It has several benefits besides tidiness. It hides the implementation details of when magic numbers get shifted. And it may allow faster start-up times when the zipfile is in the disk cache. Raymond
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Since this is something new, I believe it is helpful to look at all the possibilities so it doesn't become something we regret later. This is something that, once it gets put in place, may be really hard to get rid of. So here are a few questions that I think haven't been asked yet.

What command line options will be available to alter how python uses .pyo, .pyc, and pyr located files?
What's the easiest way to remove all pyr dirs and files?
What's the easiest way to remove pyr dirs and files for one project?
Would having python command line argument(s) to remove bytecode make sense?
Would it be possible to have one single python system __pycache__ directory and put all bytecode in it?

Instead of subdirectories in the pyr or __cache__ directory, would it be possible to just use unique names and not have multiple subdirectories?

    # some place on the users disk
    someplace/
        foobar/
            __init__.py
            foo.py
            foobar2/
                bar.py

    # someplace on the system disk
    __pycache__/
        v027.xyz.foobar.__init__.py
        v027.xyz.foobar.foo.pyc
        v030.xyz.foobar.__init__.py
        v030.xyz.foobar.foobar2.bar.pyc

    # where 'xyz' identifies a unique location to differentiate when
    # there is more than one copy of a program. (They may not be exactly
    # the same, but may have the same file names.)
    # I like the version id up front because it doesn't intermix
    # different version files together.

With a single cache directory, we could have an option to force writing bytecode to a desired location. That might be useful on its own for creating runtime bytecode only installations for installers. Ron
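To make the flat naming scheme above a little more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function names are invented, and the 'xyz' location id is taken to be a short hash of the directory path, which the message above does not specify.

```python
import hashlib
import os


def location_id(root_dir):
    """Hypothetical stand-in for 'xyz': a short hash of the install location."""
    full = os.path.abspath(root_dir).encode("utf-8")
    return hashlib.sha1(full).hexdigest()[:8]


def flat_cache_name(version_tag, loc_id, module_name):
    """Build '<version>.<location>.<dotted module name>.pyc'."""
    return "%s.%s.%s.pyc" % (version_tag, loc_id, module_name)
```

For example, flat_cache_name("v027", "xyz", "foobar.foo") gives "v027.xyz.foobar.foo.pyc", matching the listing. Two copies of the same program in different directories get different location ids, so their cache entries do not collide.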
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Martin v. Löwis wrote: There is also the issue of race conditions with multiple simultaneous accesses. The original format for the PEP had race conditions for multiple simultaneous writers; ZIP will also have race conditions for concurrent readers/writers (as any new writer will have to overwrite the central directory, making the zip file temporarily unavailable - unless they copy it, in which case we are back to writer/writer races). Regards, Martin Good point. OTOH the probability for this to happen actually is very small. Henning
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 1/31/2010 8:58 AM, Vitor Bosshard wrote: Having one single pyr (or __pycache__ or whatever it's called) subfolder per folder is an easy to understand, solid solution. As a user who browses directories to see what is there and to find files to open and look at, I would like this. The near-duplicate .pyc listings are just noise that take up screen space. 'pycache' would be pretty clear. A slew of directories containing, in general, one file each would be a waste of space and make it impossible to see at once what has been used and therefore compiled. This is the main reason I can think of for humans to see the .pyc names. Terry Jan Reedy
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Whoa. This thread already exploded. I'm picking this message to respond to because it reflects my own view after reading the PEP. On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting ha...@hannosch.eu wrote: On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross hodgestar+python...@gmail.com wrote: I don't know whether I'm in favour of using a single pyr folder or not but if a single folder is used I'd definitely prefer the folder to be called __pyr__ rather than .pyr. Exactly what I would prefer. I worry that having many small directories is a fairly poor use of the filesystem. (A quick scan of /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but only 57 directories.) Do you have any specific reason for that? Using the leading dot notation is an established pattern to hide non-essential information from directory views. What makes this non-applicable in this situation and a custom Python notation better? Because we don't want to completely hide the pyc files. Also the dot naming convention is somewhat platform-specific. FWIW in Python 3, the __file__ variable always points to the .py source filename. I agreed with Georg that there ought to be an API for finding the pyc file for a module. This could be a small addition to the PEP. -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 1/31/2010 8:34 AM, Nick Coghlan wrote: Georg Brandl wrote: +1. Having a single (visible) __pyr__ directory is much less clutter than multiple .pyc files anyway. Also, don't forget Windows users, for whom the dot convention doesn't mean anything. I must admit I quite like the __pyr__ directory approach as well. Since the interpreter knows the suffix it is looking for, names shouldn't conflict. Using a single directory allows the name to be less cryptic, too (e.g. __pycache__). Please spell it out. Possible future doc. When CPython executes or imports a .py file with Python source code, it caches internal compiled versions in a '__pycache__' subdirectory for possible future use. tjr
[Python-Dev] PEP 3147: PYC Repository Directories
That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled). And what if there isn't a source file, because I want to deploy the byte-code only? This is possible now, but would be impossible if there was this kind of distinction. That said, and understanding Martin von Löwis' objections against ZIP files, I'm +1 for something like module.some-kind-of-version.pyc instead of subdirectories. Henning
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Silke von Bargen wrote: That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled). And what if there isn't a source file, because I want to deploy the byte-code only? This is possible now, but would be impossible if there was this kind of distinction. You could argue that you put the .pyc file in the __pyr__ directory if the .py file exists (the .pyc file is there for performance reasons). If there's no .py file then the .pyc file could be put in the main directory. That means that you can 'clean' your directory hierarchy by deleting the __pyr__ directories (they'll be regenerated on demand) and leave any other .pyc files (no source .py file) intact. That said, and understanding Martin von Löwis' objections against ZIP files, I'm +1 for something like module.some-kind-of-version.pyc instead of subdirectories.
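The placement rule suggested above (cached bytecode in __pyr__ when the source exists, a plain .pyc in the main directory for bytecode-only deployment) could be sketched as follows. The function name and the "module.tag.pyc" naming inside __pyr__ are assumptions made for the example only.

```python
import os


def bytecode_path(directory, module, tag):
    """Pick where the bytecode for 'module' lives under the suggested rule."""
    source = os.path.join(directory, module + ".py")
    if os.path.exists(source):
        # Source is present: the bytecode is only a cache, so it goes in
        # __pyr__ and can be deleted and regenerated on demand.
        return os.path.join(directory, "__pyr__", "%s.%s.pyc" % (module, tag))
    # No source: bytecode-only deployment, keep the .pyc in the main directory.
    return os.path.join(directory, module + ".pyc")
```

Under this rule, deleting every __pyr__ directory never removes a module that was shipped without its source.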
Re: [Python-Dev] PEP 3147: PYC Repository Directories
    foo.pyr/
        cpython-25.pyc
        cpython-25U.pyc
        cpython-27.pyc
        cpython-27U.pyc
        cpython-32.pyc
        unladen-011.pyc
        wpython-11.pyc

+1. It should be quite easy to assign a new name every time the magic number is updated. It would actually be possible to drop the magic numbers entirely. Regards, Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Exactly. How would you define where the pyr folder goes? At the root of a package? What if I delete the __init__.py file there? Will the existing pyr folder be orphaned and a new one created in each subfolder? Unlike VCS working copies, the package / module / script hierarchy is not formally defined in python. The module name could guide the location. If you are importing xml.dom.minidom, it could put the pyc file into a sibling of the pyc folder for xml (under the name xml.dom.minidom.label). If you then remove __init__, you are no longer able to import xml.dom, but you might import dom.minidom (assuming you put the xml folder into sys.path). Then, a new pyc file would be created in the pyc folder for the dom package. Regards, Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Not really -- much of the code I've seen that tries to guess the source file name from a __file__ value just does something like this:

    if fname.lower().endswith(('.pyc', '.pyo')):
        fname = fname[:-1]

That's not compatible with using .pyr, either. If a single pyc folder is used, I think an additional __source__ attribute would be needed to indicate what source file time stamp had been checked (if any) to determine that the byte code file is current. Regards, Martin
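To illustrate the incompatibility, here is the same idiom wrapped in a function (the wrapper name is made up) and applied to a classic .pyc path and to a path in a foo.pyr/ style directory as discussed elsewhere in this thread:

```python
def guess_source(fname):
    """The common idiom: drop the trailing 'c' or 'o' from .pyc/.pyo."""
    if fname.lower().endswith(('.pyc', '.pyo')):
        fname = fname[:-1]
    return fname


print(guess_source("pkg/foo.pyc"))                 # pkg/foo.py
print(guess_source("pkg/foo.pyr/cpython-27.pyc"))  # pkg/foo.pyr/cpython-27.py
```

The second result names a file that does not exist; the real source is pkg/foo.py. Code relying on this idiom would silently get the wrong path if __file__ pointed into such a directory.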
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 1 February 2010 00:34, Nick Coghlan ncogh...@gmail.com wrote: __file__ would always point to the source files __file_cached__ would always point to the relevant compiled file (either pre-existing or newly created) I like this solution combined with having a single cache directory and a few other things I've added below. The pyc/pyo files are just an optimisation detail, and are essentially temporary. Given that, if they were to live in a single directory, to me it seems obvious that the default location for that should be in the system temporary directory. I can immediately think of the following advantages:

1. No one really complains too much about putting things in /tmp unless it starts taking up too much space. In which case they delete it and if it gets reused, it gets recreated.
2. /tmp is often on volatile memory. If it is (e.g. my Windows system temp dir is on a RAMdisk) then it seems wise to respect the obvious desire to throw away temporary files on shutdown.
3. It removes the need for people in general to even think about the existence of pyc/pyo files. They could then be relegated to even more of an implementation detail (probably while explaining the command-line options).
4. No need (in fact undesirable) to make it a hidden directory.

If you wanted to package up the pyc/pyo files, I've got an idea that combines well with executing a zip file containing __main__.py (see other thread):

1. Delete /tmp/__pycache__.
2. Compile all your source files with the versions you want to support (so long as they supported this mechanism).
3. Add a __main__.py which sets the cache directory to the directory (zip file) that __main__.py is in. __main__.py (as the initial script) doesn't use the cache.
4. Zip up the contents of /tmp/__pycache__.
Note that for this to work properly it would either require an __init__.py to be automatically created in the __pycache__ module subdirectory, or have the subdirectory be named as a .pyr to indicate it's a cached module (and thus should be importable).

    /tmp/__pycache__
        __main__.py
        foo.pyr/
            foo.py32.pyc
            foo.py33.pyc

Tim Delaney
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Tim Delaney: I like this solution combined with having a single cache directory and a few other things I've added below. ... 2. /tmp is often on volatile memory. If it is (e.g. my Windows system temp dir is on a RAMdisk) then it seems wise to respect the obvious desire to throw away temporary files on shutdown. This may create security vulnerabilities. I could, for example, insert a manipulated .pyc that logs passwords when other users run it. I can also see advantages to allowing out of tree compiled cache directories. For example, you could have a locked down .py tree with .pycs going into per-user trees. This prevents another user from spoofing a .pyc I use as well as allowing users to install arbitrary versions of Python without getting an admin to compile the .py tree with the new compiler. Neil
Re: [Python-Dev] PEP 3147: PYC Repository Directories
I can also see advantages to allowing out of tree compiled cache directories. For example, you could have a locked down .py tree with .pycs going into per-user trees. This prevents another user from spoofing a .pyc I use as well as allowing users to install arbitrary versions of Python without getting an admin to compile the .py tree with the new compiler. This is PEP 304, which has been withdrawn by its author. While there is some relationship with PEP 3147, the two address orthogonal issues. Regards, Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 1/31/2010 2:04 PM, Raymond Hettinger wrote: On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote: It does this by allowing many different byte compilation files (.pyc files) to be co-located with the Python source file (.py file). It would be nice if all the compilation files could be tucked into one single zipfile per directory to reduce directory clutter. It has several benefits besides tidiness. It hides the implementation details of when magic numbers get shifted. And it may allow faster start-up times when the zipfile is in the disk cache. On a whim, I implemented a PEP 302 loader that cached any import that it could find in sys.path into a zip file. I used running bzr as a startup benchmark, and I did my best to ensure an empty cache by running "sync; echo 3 > /proc/sys/vm/drop_caches; time bzr". On my particular machine, the real time was at minimum 3.5 seconds without using my ZipFileCacheLoader. With the loader, I found the same was true. The average performance was all over the place (due to everything else in the operating system trying to fetch from the disk), and I lack enough data points to reach statistical significance. However, if the .pyr zip file is going to contain many versions of the same module, then the performance impact could be more real, since you would be forced to pull from disk *all* of the versions of a given module. -- Scott Dial sc...@scottdial.com scod...@cs.indiana.edu
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 11:16 AM, Terry Reedy tjre...@udel.edu wrote: 'pycache' would be pretty clear. Heh -- without the underscores, I read this as "pyc ache". Seems appropriate. -- Curt Hagenlocher c...@hagenlocher.org
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On 1/31/2010 4:26 PM, Tim Delaney wrote: The pyc/pyo files are just an optimisation detail, and are essentially temporary. The .pycs for /Lib and similar are *not* temporary in the sense you are using. They are effectively permanent for as long as the version is installed. They should *not* be routinely trashed as they are not obsolete and nearly always will be reused. Terry Jan Reedy
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, 31 Jan 2010 19:48:19 +0100, "Martin v. Löwis" mar...@v.loewis.de wrote: By the way, the part that caused me the most confusion in the language in the PEP was the emphasized *and their dependencies*, as if a package having dependencies somehow turned the problem into a factorial explosion. But there seems to be nothing special, according to your explanation, about dependencies in this scheme. For regular (forward) dependencies, there is indeed nothing special to consider - they would have to exist in all versions. In practice, this can be (and was) problematic: python-zope.sendmail depends on python-pkg-resources, python-transaction, python-zope, and 10 other things. Before you could start providing python27-zope.sendmail, all of these dependencies would have to become available in a 2.7 version first, meaning that ten other Debian developers need to act before you can. With the failure rate of Debian developers (who go as often on holidays as any other volunteer), upgrading to a new Python release could often take many months. OK, that makes it clearer. It's an internal (and probably unavoidable) Debian social problem, not a technical one, and I see why it is an important issue. It seems like it would be simple enough to enhance the os packaging systems to allow the install path to be specified at install time, if that really is the only difference between the package versions. And a script that runs through all the installed python packages and installs them for a new Python version when a new version is installed should be as easy for other distributions as it is for Gentoo. However, it's also unacceptable. I can't cite the exact piece of Debian policy, but I'm fairly sure that build activities are not allowed at installation time. So actually running setup.py files is out of the question.
Users who want such a thing would have to switch to Gentoo; Debian users just want it to work :-) I'm less sympathetic to problems created by rigid policies, but that doesn't mean I'm not sympathetic :) But I don't understand how this answers the question. If the python26-zope.sendmail package doesn't run setup.py, then a python-zope.sendmail package where you specify at install time which directory to install the files to isn't going to run setup.py, either. If the only difference between a packaged python27-zope.sendmail and a packaged python26-zope.sendmail is the directory to which the files get written, why can't that be controlled at install time? Writing files to a directory must be an install activity, not a build activity. If the issue is that *deciding* what directory to install to is a build time activity...well, maybe I would be less sympathetic to a policy that is *that* rigid. (The os vendors are going to have to change details of their packaging systems if the PEP is accepted, so it's not as if the PEP saves the vendors work.) Again, I'm a little bit unclear on the motivation, also. I think it mostly is after years of experimentation, we have run out of ideas how to solve all related problems simultaneously without changing Python, so let's look for options that do involve changing Python. If you *really* want a list of all the simultaneous problems that need to be solved, and an explanation of why each individual solution has flaws, prepare for this conversation to take a few more weeks. Well, I certainly don't want the conversation to take a few more months. I'm not against the PEP, I'm making my comments and asking my questions in the spirit of making it a high quality PEP. If the motivation is the Debian devs have concluded, after years of experimentation..., then I suppose that's what should go in the motivation section. -- R. 
David Murray www.bitdance.com Business Process Automation - Network/Server Management - Routers/Firewalls ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Antoine Pitrou wrote:

On Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin wrote:

I have a couple of bikesheddy or "why didn't you do this" comments. I'll be perfectly satisfied with an answer or a line in the PEP.

1. Why the -R flag? It seems like this is a uniform improvement, so it should be the default. Have faith in your design! ;-)

-1 for making it a default. It is definitely ugly and useless for most cases. It is fine as long as it is optional and merely used by the Debian/Ubuntu installers.

Would you still be a -1 on making the new scheme the default if it used a single cache directory instead? That would actually be cleaner than the current solution rather than messier.

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
[Python-Dev] subprocess docs patch
Hello mighty Python developers,

I was wondering if someone could take a gander at, and hopefully act upon, a patch I submitted a while ago for the subprocess module's docs. It's been languishing in the bug tracker: http://bugs.python.org/issue6760

Any help you could provide would be appreciated.

Cheers,
Chris

--
If life seems jolly rotten. There's something you've forgotten. And that's to laugh and smile and dance and sing.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Martin v. Löwis wrote:

Exactly. How would you define where the pyr folder goes? At the root of a package? What if I delete the __init__.py file there? Will the existing pyr folder be orphaned and a new one created in each subfolder? Unlike VCS working copies, the package / module / script hierarchy is not formally defined in Python.

The module name could guide the location. If you are importing xml.dom.minidom, it could put the pyc file into a sibling of the pyc folder for xml (under the name xml.dom.minidom.label). If you then remove __init__, you are no longer able to import xml.dom, but you might import dom.minidom (assuming you put the xml folder into sys.path). Then, a new pyc file would be created in the pyc folder for the dom package.

I see three possible logical locations for the Python cache directories:

1. In each directory containing Python source files.
   Major Pro: easy to keep source files associated with their cached versions
   Major Con: proliferation of cache directories

2. In each top level directory on sys.path, flat file structure
   Major Pro: trivial to separate out all cached files
   Major Con: for path locations like the top of the standard lib, the cache directory would get a *lot* of entries

3. In each top level directory on sys.path, shadow file hierarchy
   Major Pro: trivial to separate out all cached files
   Major Con: ??? (I got nuthin')

I didn't list a single global cache directory as a viable option as it would create some nasty naming conflicts due to runs with different sys.path entries and would make it impossible to create zipfiles with precached bytecode files.

Note that with option two, creating a bytecode only zipfile would be trivial: just add the __pycache__ directory as the top-level directory in the zipfile and leave out everything else (assuming there were no data files in the package that were still needed). Packages would still be identifiable by the existence of the cached pyc file for their __init__ modules.
Going back to my previous example (with one extra source file to show how a top-level module would be handled), scheme 2 would give:

  module.py
  package/
    __init__.py
    foo.py
    subpackage/
      __init__.py
      bar.py
  __pycache__/
    module.cpython-27.pyc
    module.cpython-27.pyo
    package.__init__.cpython-27.pyc
    package.__init__.cpython-27.pyo
    package.foo.cpython-27.pyc
    package.foo.cpython-27.pyo
    package.subpackage.__init__.cpython-27.pyc
    package.subpackage.__init__.cpython-27.pyo
    package.subpackage.bar.cpython-27.pyc
    package.subpackage.bar.cpython-27.pyo

While scheme 3 would look like:

  module.py
  package/
    __init__.py
    foo.py
    subpackage/
      __init__.py
      bar.py
  __pycache__/
    module.cpython-27.pyc
    module.cpython-27.pyo
    package/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      foo.cpython-27.pyc
      foo.cpython-27.pyo
      subpackage/
        __init__.cpython-27.pyc
        __init__.cpython-27.pyo
        bar.cpython-27.pyc
        bar.cpython-27.pyo

For comparison, here is what it would look like under scheme 1:

  module.py
  package/
    __init__.py
    foo.py
    subpackage/
      __init__.py
      bar.py
      __pycache__/
        __init__.cpython-27.pyc
        __init__.cpython-27.pyo
        bar.cpython-27.pyc
        bar.cpython-27.pyo
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      foo.cpython-27.pyc
      foo.cpython-27.pyo
  __pycache__/
    module.cpython-27.pyc
    module.cpython-27.pyo

And the initial version proposed in the PEP:

  module.py
  module.pyr/
    cpython-27.pyc
    cpython-27.pyo
  package/
    __init__.py
    __init__.pyr/
      cpython-27.pyc
      cpython-27.pyo
    foo.py
    foo.pyr/
      cpython-27.pyc
      cpython-27.pyo
    subpackage/
      __init__.py
      __init__.pyr/
        cpython-27.pyc
        cpython-27.pyo
      bar.py
      bar.pyr/
        cpython-27.pyc
        cpython-27.pyo

My major concern with scheme 2 is the possibility of directory size limits affecting the caching of files, but scheme 3 looks pretty good to me (with the higher level cache linked to the directory that is actually on sys.path, the cache locations aren't as arbitrary as I originally feared).

Cheers,
Nick.
--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
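The three candidate layouts in the message above can be summarised as a path-mapping rule. What follows is only a rough sketch of that rule, not code from the PEP or from any implementation: the function name cache_path, its parameters, and the fixed "cpython-27" tag are assumptions made for illustration, and paths are joined POSIX-style so the results read the same on every platform.

```python
import posixpath

TAG = "cpython-27"  # interpreter tag taken from the listings in the thread


def cache_path(scheme, path_entry, module_name, is_package=False, ext="pyc"):
    """Return where the bytecode for module_name would be cached.

    Hypothetical helper for illustration only.

    scheme 1: a __pycache__ directory beside every source file
    scheme 2: one flat __pycache__ per sys.path entry
    scheme 3: one __pycache__ per sys.path entry, mirroring the package tree
    """
    parts = module_name.split(".")
    if is_package:
        parts.append("__init__")  # a package is cached via its __init__ module
    leaf = "%s.%s.%s" % (parts[-1], TAG, ext)
    if scheme == 1:
        pieces = [path_entry] + parts[:-1] + ["__pycache__", leaf]
    elif scheme == 2:
        flat = "%s.%s.%s" % (".".join(parts), TAG, ext)
        pieces = [path_entry, "__pycache__", flat]
    elif scheme == 3:
        pieces = [path_entry, "__pycache__"] + parts[:-1] + [leaf]
    else:
        raise ValueError("unknown scheme: %r" % (scheme,))
    return posixpath.join(*pieces)


# Show where package.subpackage.bar lands under each scheme.
for scheme in (1, 2, 3):
    print(scheme, cache_path(scheme, "lib", "package.subpackage.bar"))
```

Written this way, schemes 2 and 3 differ only in whether the dotted name is kept flat or split into subdirectories, which is where the directory-size concern about scheme 2 comes from.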
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Silke von Bargen wrote:

That still leaves the question of what to do with __file__ (for which even the solution in the PEP isn't particularly clean). Perhaps the thing to do there is to have __file__ always point to the source file and introduce a __file_cached__ that points to the bytecompiled file on disk (set to None if it doesn't exist, as may be the case for __main__ or due to writing of bytecode files being disabled).

And what if there isn't a source file, because I want to deploy the byte-code only? This is possible now, but would be impossible if there was this kind of distinction.

For a bytecode only deployment, __file__ would point to where the source file *would* be if it was there while __file_cached__ would point to the precompiled byte code.

Yes, this would be backwards incompatible for some uses of execfile in conjunction with __file__ but those should be much rarer than uses of __file__ to locate source code (which break with bytecode only deployment anyway) and to find colocated resource files (which only care about the path to the file and not the filename itself).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
---
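The __file__ / __file_cached__ split proposed above can be modelled in a few lines. This is a sketch under stated assumptions, not interpreter behaviour: module_paths and resource_path are hypothetical helpers, and __file_cached__ is simply the attribute name floated in this message.

```python
import os


def module_paths(source_path, cached_path):
    """Model the proposed attributes for one module (hypothetical helper).

    __file__ always names the source location, even when the source file
    is absent (bytecode-only deployment). __file_cached__ names the
    bytecode file, or is None when no bytecode file exists on disk.
    """
    has_cache = cached_path is not None and os.path.exists(cached_path)
    return {
        "__file__": source_path,
        "__file_cached__": cached_path if has_cache else None,
    }


def resource_path(module_file, resource_name):
    """Find a data file colocated with a module, using only the directory."""
    return os.path.join(os.path.dirname(module_file), resource_name)
```

Because resource_path only looks at the directory part of __file__, it keeps working in a bytecode-only deployment, which is the colocated-resource case described in the message.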
Re: [Python-Dev] PEP 3147: PYC Repository Directories
But I don't understand how this answers the question. If the python26-zope.sendmail package doesn't run setup.py, then a python-zope.sendmail package where you specify at install time which directory to install the files to isn't going to run setup.py, either. If the only difference between a packaged python27-zope.sendmail and a packaged python26-zope.sendmail is the directory to which the files get written, why can't that be controlled at install time?

It certainly would be possible to copy the files into each Python's site-packages. They have a system in place that does that, except that it doesn't copy the files but symlinks them.

Well, I certainly don't want the conversation to take a few more months. I'm not against the PEP, I'm making my comments and asking my questions in the spirit of making it a high quality PEP. If the motivation is "the Debian devs have concluded, after years of experimentation...", then I suppose that's what should go in the motivation section.

I guess Barry will have to explain what the problem with the current scheme is.

Regards,
Martin
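The symlinking approach mentioned above can be pictured with a small sketch. It is not the Debian tooling, whose details are not given in this thread: the function link_shared_sources, the "shared" directory, and the version directory names are all hypothetical, and the demo runs in a throwaway temporary directory on a platform that permits symlinks.

```python
import os
import tempfile


def link_shared_sources(shared_dir, site_packages_dirs):
    """Symlink each .py file in shared_dir into every site-packages directory.

    Hypothetical helper: one shared copy of the sources, one link per
    installed Python version.
    """
    links = []
    for site_dir in site_packages_dirs:
        os.makedirs(site_dir, exist_ok=True)
        for name in sorted(os.listdir(shared_dir)):
            if not name.endswith(".py"):
                continue
            link = os.path.join(site_dir, name)
            if not os.path.lexists(link):  # leave existing links alone
                os.symlink(os.path.join(shared_dir, name), link)
            links.append(link)
    return links


# Demo in a temporary directory; the version names are placeholders.
root = tempfile.mkdtemp()
shared = os.path.join(root, "shared")
os.makedirs(shared)
with open(os.path.join(shared, "pyrex_stub.py"), "w") as f:
    f.write("VALUE = 1\n")
targets = [os.path.join(root, v, "site-packages") for v in ("python2.5", "python2.6")]
for link in link_shared_sources(shared, targets):
    print(link, "->", os.readlink(link))
```

Calling the function a second time is harmless, since links that already exist are left in place.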
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Nick Coghlan ncogh...@gmail.com writes:

Would you still be a -1 on making the new scheme the default if it used a single cache directory instead? That would actually be cleaner than the current solution rather than messier.

+0 on a default of “store compiled bytecode files in a single cache directory”. It is indeed cleaner than the current default.

I'm only +0 because I don't know whether that actually addresses the use case that raised the issue to begin with, so I'm postponing judgement until those who want this change in the first place chime in.

--
 \      “Once consumers can no longer get free music, they will have to |
  `\        buy the music in the formats we choose to put out.” —Steve |
_o__)                                   Heckler, VP of Sony Music, 2001 |
Ben Finney