Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Henning von Bargen wrote:
 I like the idea of the PEP.
 On the other hand, I dislike using directories for it.
 Others have explained enough reasons for why creating many
 directories is a bad idea; and there may be other reasons
 (file-system limits for number of directories, problems when
 the directories are located on the network).

Actually, this is the first post I've seen noting objective problems
with the use of a subdirectory. The others were just a subjective
difference in perspective that saw subdirectory clutter as somehow being
worse than file clutter.

Specific examples of filesystems with different limits on file and
subdirectory counts and network filesystems where opening a subdirectory
can result in a significant speed impact would be very helpful.

 The solution is so obvious:
 
 Why not use a .pyr file that is internally a zip file?

Agreed this should be discussed in the PEP, but one obvious problem is
the speed impact. Picking up a file from a subdirectory is going to
introduce less overhead than unpacking it from a zipfile.

That said, using a non-compressed zipfile would make a lot more sense
than inventing our own archive format if a subdirectory is eventually
deemed unsuitable.
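
A minimal sketch of the non-compressed zipfile idea, using the standard
zipfile module with ZIP_STORED. The member names and payloads are
made-up placeholders, and nothing here is a format proposed in the PEP.

```python
import io
import zipfile


def build_stored_archive(members):
    """Pack {name: bytes} into an in-memory zip archive without compression."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def read_member(archive_bytes, name):
    """Return the raw bytes stored under `name`."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)


# Placeholder payloads standing in for real byte code.
archive_bytes = build_stored_archive({
    "cpython-27.pyc": b"fake bytecode A",
    "cpython-31.pyc": b"fake bytecode B",
})
```

Even without compression, a reader still has to parse the central
directory before it can seek to a member, which is the extra overhead
compared with opening a file in a subdirectory.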

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 Linux distributions such as Ubuntu [2]_ and Debian [3]_ provide more
 than one Python version at the same time to their users.  For example,
 Ubuntu 9.10 Karmic Koala can install Python 2.5, 2.6, and 3.1, with
 Python 2.6 being the default.

 In order to ease the burden on operating system packagers for these
 distributions, the distribution packages do not contain Python version
 numbers [4]_; they are shared across all Python versions installed on
 the system.  Putting Python version numbers in the packages would be a
 maintenance nightmare, since all the packages - *and their
 dependencies* - would have to be updated every time a new Python
 release was added or removed from the distribution.  Because of the
 sheer number of packages available, this amount of work is infeasible.
 
 As a non-Debian user (I'm a Gentoo user), the above doesn't enlighten me,
 even after skimming the referenced document.  Perhaps an example would
 be helpful?

I think the basic question is: how do you get stuff into
/usr/lib/python2.6/site-packages/Pyrex?

One option would be to have a Debian package python26-pyrex. Then you
would also need a python25-pyrex package and a python27-pyrex package,
all essentially containing the very same files (but installed into
different directories).

What they want is a single python-pyrex package that automatically works
for all Python versions - even those that aren't yet installed (i.e.
install python-pyrex first, and Python 2.7 later, and python-pyrex
should be available).

Having a single directory in sys.path for all Python versions currently
doesn't work, as the pyc files for each version would conflict.

The current solution consists (for package installation) of
a) installing the files in a single place
b) creating a directory hierarchy in each Python's site-packages
c) symlinking all .py files into this directory hierarchy
d) byte-compiling all .py files in the hierarchy
For installation of new Python versions, they need to
a) walk over the list of installed Python packages
b) for each one, repeat steps b..d from above

With the PEP in place, for pure-Python packages, they could
a) have a system wide directory for pure-Python packages, and
b) arrange that directory to appear on sys.path for all Python
   versions
On package installation, they then could
a) install the files in that system-wide directory
b) for each Python version, run byte-code compilation of the
   new package
On Python installation, they would
a) byte-compile the entire directory.
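
A rough sketch of the "for each Python version, run byte-code
compilation" step, assuming each interpreter is reachable by name and
supports "-m compileall". The function name is hypothetical; this is
not what any distribution's tooling actually does.

```python
import subprocess
import sys


def byte_compile_for(interpreters, directory):
    """Run `<interpreter> -m compileall -q <directory>` for each interpreter.

    Returns a dict mapping each interpreter to its exit status.
    """
    results = {}
    for interpreter in interpreters:
        completed = subprocess.run(
            [interpreter, "-m", "compileall", "-q", directory]
        )
        results[interpreter] = completed.returncode
    return results
```

On package installation this would be called with every installed
interpreter and the new package's directory; on Python installation,
with the one new interpreter and the whole shared directory.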

Alternatively, to support packages that don't work with all Python
versions, they could continue to use symlinking, but restrict that
onto the top directories of each package (i.e. not create a directory
hierarchy in site-packages).

 (FYI, Gentoo just installs the pyc files into each of the installed
 Python's site-packages that is supported by the package in question...disk
 space is relatively cheap.)

I suppose Gentoo also installs .py files into each site-packages?

How does it deal with a Python installation that happens after the
package installation?

 * Would a moratorium on byte code changes, similar to the language
   moratorium described in PEP 3003 [16]_ be a better approach to
   pursue, and would that solve the problem for vendors?  At the time
   of this writing, PEP 3003 is silent on the issue.
 
 Unless the bytecode change moratorium was permanent (unlikely), how would
 this solve the vendor issues?

A vendor strategy might be to not store .pyc files on disk for some
Python versions (i.e. those that differ from the rest). Assume that
3.2, 3.3, 3.4 use the same pyc magic, and 3.5, 3.6, 3.7 also do. Then,
at any point in time, one of the Python versions is the system python
in Debian. This is the one who decides the official .pyc magic. The
other Python installations on the same system can either reuse the
existing .pyc files (if the magic matches), or not, in which case they
have to recompile (to memory) the Python source on every startup. The
longer the moratorium, the less of a problem this could cause for users.
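
The "reuse if the magic matches" check can be sketched as below. It
assumes a present-day Python 3, where importlib.util.MAGIC_NUMBER holds
the running interpreter's magic, and relies on the magic being the first
four bytes of a .pyc file. The function names are illustrative only.

```python
import importlib.util


def pyc_magic(path):
    """Return the 4-byte magic stored at the start of a .pyc file."""
    with open(path, "rb") as handle:
        return handle.read(4)


def can_reuse(path, magic=importlib.util.MAGIC_NUMBER):
    """True if the .pyc at `path` was written with the given magic."""
    return pyc_magic(path) == magic
```

An interpreter for which can_reuse() is false would fall back to
compiling the source in memory on every startup, as described above.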

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Ben Finney
Nick Coghlan ncogh...@gmail.com writes:

 Actually, this is the first post I've seen noting objective problems
 with the use of a subdirectory. The others were just a subjective
 difference in perspective that saw subdirectory clutter as somehow
 being worse than file clutter.

Here's another one, then:

The directory where the source code files reside is often a working area
for the developer. The directory structure is an essential tool of
organising the project; the presence of an unwanted directory is clutter
to this purpose, in a way that the presence of an unwanted file is not.

-- 
 \ “Alternative explanations are always welcome in science, if |
  `\   they are better and explain more. Alternative explanations that |
_o__) explain nothing are not welcome.” —Victor J. Stenger, 2001-11-05 |
Ben Finney



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 Agreed this should be discussed in the PEP, but one obvious problem is
 the speed impact. Picking up a file from a subdirectory is going to
 introduce less overhead than unpacking it from a zipfile.

There is also the issue of race conditions with multiple simultaneous
accesses. The original format for the PEP had race conditions for
multiple simultaneous writers; ZIP will also have race conditions for
concurrent readers/writers (as any new writer will have to overwrite
the central directory, making the zip file temporarily unavailable -
unless they copy it, in which case we are back to writer/writer
races).
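
For contrast, with one file per byte code version a writer can avoid
exposing a half-written file by writing to a temporary name and
renaming it into place. This is only a sketch of that general
technique, not something specified in the PEP, and the helper name is
made up.

```python
import os
import tempfile


def write_atomically(path, payload):
    """Write `payload` to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        # os.replace swaps the complete file into place in one step.
        os.replace(temp_path, path)
    except BaseException:
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise
```

Two simultaneous writers still race, but the outcome is simply that the
last complete file wins; no shared index needs rewriting.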

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Antoine Pitrou
Barry Warsaw barry at python.org writes:
 
 Putting Python version numbers in the packages would be a
 maintenance nightmare, since all the packages - *and their
 dependencies* - would have to be updated every time a new Python
 release was added or removed from the distribution.  Because of the
 sheer number of packages available, this amount of work is infeasible.

How is this infeasible exactly? Wouldn't it be an easy target for scripting?

 As an example of the problem, a common (though fragile) Python idiom
 for locating data files is to do something like this::

I don't think this is fragile. It is the most robust I can think of, but perhaps
I'm missing another solution :)
(well, apart from pkg_resources, that is)

 The implementation of this PEP would have to ensure that the same
 directory level is returned from `__file__` as it does without the
 `pyr` directory, so that the common idiom above continues to work::
 
  >>> import foo
  >>> foo.__file__
  'foo.pyr'

Would things like exec() work on the given directory?

 An earlier version of this PEP described fat Python byte code files.
 These files would contain the equivalent of multiple `pyc` files in a
 single `pyf` file, with a lookup table keyed off the appropriate magic
 number.  This was an extensible file format so that the first 5
 parallel Python implementations could be supported fairly efficiently,
 but with extension lookup tables available to scale `pyf` byte code
 objects as large as necessary.

As Martin said, this creates concurrent access problems, when several
interpreters modify the file simultaneously.

 * What about `py` source files that are compatible with most but not
   all installed Python versions.  We might need a way to say this py
   file should be hidden from Python versions X.Y or earlier.

-1. This is the distributor's job, not Python's.
If you want you can create dummy pyc's in your pyr that will raise an
ImportError or a NotImplementedError with some versions of Python. But I don't
think Python should have a stake in this.
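
A source-level sketch of such a stub: a guard that raises ImportError
when the running interpreter is older than the package supports. The
function name and the version numbers in the usage note are
illustrative assumptions.

```python
import sys


def require_python(minimum, current=None):
    """Raise ImportError unless `current` (default: running version) >= `minimum`."""
    minimum = tuple(minimum)[:2]
    current = tuple(sys.version_info[:2] if current is None else current)[:2]
    if current < minimum:
        raise ImportError(
            "requires Python %d.%d or later, running %d.%d" % (minimum + current)
        )
```

A distributor could place a call such as require_python((2, 6)) at the
top of the module it ships for unsupported versions.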

 * Would a moratorium on byte code changes, similar to the language
   moratorium described in PEP 3003 [16]_ be a better approach to
   pursue, and would that solve the problem for vendors?  At the time
   of this writing, PEP 3003 is silent on the issue.

-1. Bytecode is an internal detail; besides, it is vital to be able to evolve 
it.

Regards

Antoine.




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Georg Brandl
Am 31.01.2010 07:29, schrieb Nick Coghlan:
 Vitor Bosshard wrote:
 There is no one-to-one correspondence between Python version and pyc
 magic numbers. Different runtime options may change the magic number and
 different versions may reuse a magic number
 
 Good point. Runtime options would need to change the version (e.g.
 foo.25U.py), and versions that reuse magic numbers would be
 redundantly written to disk. However, the underlying issue as I see it
 is that the magic value is an implementation detail that should not be
 exposed.
 
 I think this is actually a good point - while there needs to be a
 shared namespace to allow different Python implementations to avoid
 stepping on each other's toes, CPython's bytecode compatibility magic
 number may not be the best choice as the distinguishing identifier.
 
 It may be better to give the magic numbers a meaningful corresponding
 string, such that the filenames would be more like:
 
 foo.py
 foo.pyr/
   cpython-25.pyc
   cpython-25U.pyc
   cpython-27.pyc
   cpython-27U.pyc
   cpython-32.pyc
   unladen-011.pyc
   wpython-11.pyc

+1.  It should be quite easy to assign a new name every time the magic
number is updated.
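
A small sketch of how names in the CPython style listed above could be
derived. The function and its parameters are hypothetical and cover
only the "cpython-25U.pyc" pattern, not the other implementations'
tags.

```python
def cache_filename(implementation, version, flags=""):
    """Build a name such as 'cpython-25U.pyc' from its parts."""
    major, minor = version
    return "%s-%d%d%s.pyc" % (implementation, major, minor, flags)


print(cache_filename("cpython", (2, 5)))
print(cache_filename("cpython", (2, 5), "U"))
```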

 If we don't change the bytecode for a given Python version, then the
 name of the bytecode format used wouldn't change either.

That would be the only remaining complaint for casual users. (Why doesn't
Python compile my file for 2.8?)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Georg Brandl
Am 31.01.2010 05:18, schrieb Ben Finney:
 Nick Coghlan ncogh...@gmail.com writes:
 
 It won't be cluttered with subfolders - you will have at most one .pyr
 per source .py file.
 
 If that doesn't meet your threshold of “cluttered with subfolders”, I'm
 at a loss for words to think where that threshold might be. It meets,
 and exceeds by a long shot, my threshold for subfolder clutter.
 
 Even adding a *single* subfolder in arbitrary directories is an
 obnoxious act for a program to do automatically, and is not to be
 undertaken lightly. It might be justified in this case, but that doesn't
 mean we should open the gates to even more clutter.

Then why did Subversion choose to follow the CVS way and create a
subdirectory in each versioned directory?  IMO, this is much more
annoying given the alternative of a single .hg/.bzr/whatever directory.
For .pyc vs .pyr, you didn't have the alternative of putting all that
stuff in one directory now.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Georg Brandl
Am 31.01.2010 10:21, schrieb Ben Finney:
 Nick Coghlan ncogh...@gmail.com writes:
 
 Actually, this is the first post I've seen noting objective problems
 with the use of a subdirectory. The others were just a subjective
 difference in perspective that saw subdirectory clutter as somehow
 being worse than file clutter.
 
 Here's another one, then:
 
 The directory where the source code files reside is often a working area
 for the developer. The directory structure is an essential tool of
 organising the project; the presence of an unwanted directory is clutter
 to this purpose, in a way that the presence of an unwanted file is not.

At least to me, this does not explain why an unwanted (why unwanted? If
it's unwanted, set PYTHONDONTWRITEBYTECODE=1) directory is worse than an
unwanted file.
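
A quick way to see the effect of that variable: start a child
interpreter with and without it and report sys.dont_write_bytecode.
The helper name is made up for this sketch.

```python
import os
import subprocess
import sys


def child_dont_write_bytecode(set_variable):
    """Return sys.dont_write_bytecode as seen by a freshly started child."""
    env = dict(os.environ)
    env.pop("PYTHONDONTWRITEBYTECODE", None)
    if set_variable:
        env["PYTHONDONTWRITEBYTECODE"] = "1"
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.dont_write_bytecode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip() == "True"
```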

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Simon Cross
On Sun, Jan 31, 2010 at 1:54 PM, Hanno Schlichting ha...@hannosch.eu wrote:
 I'd be a big +1 to using a single .pyr directory per source directory.

I don't know whether I'm in favour of using a single pyr folder or not
but if a single folder is used I'd definitely prefer the folder to be
called __pyr__ rather than .pyr.


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Hanno Schlichting
On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
hodgestar+python...@gmail.com wrote:
 I don't know whether I'm in favour of using a single pyr folder or not
 but if a single folder is used I'd definitely prefer the folder to be
 called __pyr__ rather than .pyr.

Do you have any specific reason for that?

Using the leading dot notation is an established pattern to hide
non-essential information from directory views. What makes this
non-applicable in this situation and a custom Python notation better?

Hanno


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Simon Cross
On Sun, Jan 31, 2010 at 2:13 PM, Hanno Schlichting ha...@hannosch.eu wrote:
 On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
 hodgestar+python...@gmail.com wrote:
 I don't know whether I'm in favour of using a single pyr folder or not
 but if a single folder is used I'd definitely prefer the folder to be
 called __pyr__ rather than .pyr.

 Do you have any specific reason for that?

I'd rather not have the confusion caused by stray .pyc files multiplied
by having said stray files buried in a hidden folder.

 Using the leading dot notation is an established pattern to hide
 non-essential information from directory views. What makes this
 non-applicable in this situation and a custom Python notation better?

Something being an established pattern doesn't mean it's a good idea.
If we're going with a by-convention argument anyway, surely Python
conventions should take precedence -- this is *Python* after all. :)

On the whole I'm against hiding folders because what information is
non-essential varies from situation to situation. People (including
me) regularly screw up dealing with .svn folders by including them in
source tarballs, copying parts of one working copy into another, etc.

Schiavo
Simon


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Georg Brandl wrote:
 Am 31.01.2010 07:18, schrieb Nick Coghlan:
 Ben Finney wrote:
 Could we instead have a single subdirectory for each tree of module
 packages, keeping them tidily out of the way of the source files, while
 making them located just as deterministically::
 Not easily. With the scheme currently proposed in the PEP, setting a
 value for __file__ which is both reasonably accurate and backwards
 compatible with existing file manipulation techniques is
 straightforward: just use the name of the cache directory.
 
 Not really -- much of the code I've seen that tries to guess the source
 file name from a __file__ value just does something like this:
 
if fname.lower().endswith(('.pyc', '.pyo')): fname = fname[:-1]
 
 That's not compatible with using .pyr, either.

That's not the backwards compatibility I'm talking about - I'm talking
about the more common one mentioned in the PEP where __file__ is used
with os.path.split to locate adjacent resource files.
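
A sketch of that idiom, to show why it keeps working as long as
__file__ names something in the same directory as the source. The file
names are placeholders.

```python
import os


def resource_path(module_file, resource_name):
    """Locate a resource that sits next to the module named by `module_file`."""
    directory, _ = os.path.split(module_file)
    return os.path.join(directory, resource_name)


# Whether __file__ is 'pkg/foo.py' or 'pkg/foo.pyr', the directory part
# is the same, so the adjacent resource is still found.
from_source = resource_path(os.path.join("pkg", "foo.py"), "data.txt")
from_pyr = resource_path(os.path.join("pkg", "foo.pyr"), "data.txt")
```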

Agreed that even the .pyr idea causes backwards compatibility problems
with code like the above (fortunately we can fix the stdlib instances
ourselves).
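
The suffix-stripping check quoted above, wrapped in a function so its
behaviour with a .pyr name can be seen. The second function is only one
possible adjustment, assuming __file__ would end in '.pyr'.

```python
def guess_source(fname):
    """The existing idiom: drop the last character of .pyc/.pyo names."""
    if fname.lower().endswith((".pyc", ".pyo")):
        fname = fname[:-1]
    return fname


def guess_source_with_pyr(fname):
    """Same idiom, extended to treat a trailing '.pyr' the same way."""
    if fname.lower().endswith((".pyc", ".pyo", ".pyr")):
        fname = fname[:-1]
    return fname
```

guess_source('foo.pyr') returns the name unchanged, so callers end up
treating the cache directory as if it were the source file.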

Cheers,
Nick.


-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Georg Brandl wrote:
 Then why did Subversion choose to follow the CVS way and create a
 subdirectory in each versioned directory?  IMO, this is much more
 annoying given the alternative of a single .hg/.bzr/whatever directory.
 For .pyc vs .pyr, you didn't have the alternative of putting all that
 stuff in one directory now.

I actually like the svn/cvs way, since each directory in the working
copy is self-contained. The DVCS way means that you can't tell just by
looking at a directory whether it is part of a working copy or not -
there is a non-local element affecting you at a higher point in the
filesystem hierarchy.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Vitor Bosshard
2010/1/31 Georg Brandl g.bra...@gmx.net:

 foo.py
 foo.pyr/
   cpython-25.pyc
   cpython-25U.pyc
   cpython-27.pyc
   cpython-27U.pyc
   cpython-32.pyc
   unladen-011.pyc
   wpython-11.pyc

 +1.  It should be quite easy to assign a new name every time the magic
 number is updated.

 If we don't change the bytecode for a given Python version, then the
 name of the bytecode format used wouldn't change either.

 That would be the only remaining complaint for casual users. (Why doesn't
 Python compile my file for 2.8?)


I think it's preferable to have a redundant copy of the compiled file
floating around rather than creating confusion as to which one each
python interpreter uses.

Optimizing disk space (and marginal compile time) is not worth the
mental overhead this would introduce. Better keep it as clear and
simple as possible, i.e. create different .pyc files even if the
bytecode doesn't change between releases.

Vitor


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Georg Brandl wrote:
 +1.  Having a single (visible) __pyr__ directory is much less clutter than
 multiple .pyc files anyway.  Also, don't forget Windows users, for whom
 the dot convention doesn't mean anything.

I must admit I quite like the __pyr__ directory approach as well. Since
the interpreter knows the suffix it is looking for, names shouldn't
conflict. Using a single directory allows the name to be less cryptic,
too (e.g. __pycache__).

That still leaves the question of what to do with __file__ (for which
even the solution in the PEP isn't particularly clean). Perhaps the
thing to do there is to have __file__ always point to the source file
and introduce a __file_cached__ that points to the bytecompiled file on
disk (set to None if it doesn't exist, as may be the case for __main__
or due to writing of bytecode files being disabled).

With that approach, a structure given just a run under 2.7 and one under
2.7 with -O might look like:

package/
  __init__.py
  foo.py
  __pycache__/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
  subpackage/
    __init__.py
    bar.py
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo


__file__ would always point to the source files
__file_cached__ would always point to the relevant compiled file (either
pre-existing or newly created)

To use the final step of importing package.foo as an example (ignoring
the extra backwards compatibility steps for the existing scheme):

1. Check package dir listing for __pyr__
2. If it exists, check it for a foo.cookie.ext (where the
interpreter will always know exactly which cookie and extension it wants)
3. As a CPython implementation detail, use the cookie inside the file
to double-check correctness
4. If all good, run with that cached file
5. Otherwise, check package dir for foo.py
6. If the source file exists, create the cached bytecode file inside the
__pyr__ (if this fails, just run from RAM with __file_cached__ = None)
7. Run with the newly compiled source file
8. Otherwise report ImportError.
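
The lookup order in the steps above can be sketched as a function that
only decides what to do, without compiling or running anything. The
names, the return values, and the default cache directory name are
assumptions made for illustration; the cookie check of step 3 is left
out.

```python
import os


def plan_import(package_dir, name, cookie, ext="pyc", cache_dir="__pycache__"):
    """Decide how to load `name`; returns (action, source_path, cached_path)."""
    cached = os.path.join(package_dir, cache_dir, "%s.%s.%s" % (name, cookie, ext))
    source = os.path.join(package_dir, name + ".py")
    # Steps 1-2 and 4: a matching cached file is used directly.
    if os.path.isfile(cached):
        return ("run-cached", source, cached)
    # Steps 5-7: otherwise compile the source and try to cache the result.
    if os.path.isfile(source):
        return ("compile", source, cached)
    # Step 8: neither a cached file nor a source file was found.
    raise ImportError("no module named %r in %s" % (name, package_dir))
```

In the "compile" case the caller would attempt to write the cached
path and fall back to __file_cached__ = None if that fails.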

This doesn't seem to have any significant disadvantages relative to the
subdirectory-per-source-file approach (and the major advantage of
creating just a single subdirectory).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Vitor Bosshard wrote:
 Optimizing disk space (and marginal compile time) is not worth the
 mental overhead this would introduce. Better keep it as clear and
 simple as possible, i.e. create different .pyc files even if the
 bytecode doesn't change between releases.

Yeah, makes sense. Given the level of fiddling with it these days, it
may be a very long while before the issue actually comes up anyway :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Vitor Bosshard
2010/1/31 Nick Coghlan ncogh...@gmail.com:
 Georg Brandl wrote:
 Then why did Subversion choose to follow the CVS way and create a
 subdirectory in each versioned directory?  IMO, this is much more
 annoying given the alternative of a single .hg/.bzr/whatever directory.
 For .pyc vs .pyr, you didn't have the alternative of putting all that
 stuff in one directory now.

 I actually like the svn/cvs way, since each directory in the working
 copy is self-contained. The DVCS way means that you can't tell just by
 looking at a directory whether it is part of a working copy or not -
 there is a non-local element affecting you at a higher point in the
 filesystem hierarchy.


Exactly. How would you define where the pyr folder goes? At the root
of a package? What if I delete the __init__.py file there? Will the
existing pyr folder be orphaned and a new one created in each
subfolder? Unlike VCS working copies, the package / module / script
hierarchy is not formally defined in python.

Having one single pyr (or __pycache__ or whatever it's called)
subfolder per folder is an easy to understand, solid solution. I'm
also in favor of making this folder non-hidden. Unlike a .git or .hg
folder, it impacts the code execution itself. Think of the newbies!
.pyc files already lead to heisenbugs for inexperienced developers
(e.g. importing a lingering pyc instead of an intended module with the
same name further down sys.path), and they're in plain sight.
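
A sketch of a helper for spotting such lingering files in the existing
layout, where foo.pyc sits next to foo.py. The function name is made
up, and it only reports files; it deletes nothing.

```python
import os


def find_orphaned_pyc(root):
    """List .pyc files under `root` that have no .py file beside them."""
    orphans = []
    for directory, _subdirs, files in os.walk(root):
        present = set(files)
        for filename in files:
            # 'foo.pyc'[:-1] is 'foo.py', the source it was compiled from.
            if filename.endswith(".pyc") and filename[:-1] not in present:
                orphans.append(os.path.join(directory, filename))
    return sorted(orphans)
```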


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Georg Brandl
Am 31.01.2010 14:02, schrieb Nick Coghlan:
 Georg Brandl wrote:
 Then why did Subversion choose to follow the CVS way and create a
 subdirectory in each versioned directory?  IMO, this is much more
 annoying given the alternative of a single .hg/.bzr/whatever directory.
 For .pyc vs .pyr, you didn't have the alternative of putting all that
 stuff in one directory now.
 
 I actually like the svn/cvs way, since each directory in the working
 copy is self-contained. The DVCS way means that you can't tell just by
 looking at a directory whether it is part of a working copy or not -
 there is a non-local element affecting you at a higher point in the
 filesystem hierarchy.

Yes, but is it really so common that you need to know if the directory is
part of a working copy?  Usually either you already know, or it's unnecessary
to know.  Apart from that it's trivial to find out using hg root etc.

In contrast, those .svn directories are a pain to work with when copying
or moving stuff, either out of a working copy (they are overlooked) or
between working copies (svn says strange things when you try to do svn add).

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread R. David Murray
On Sun, 31 Jan 2010 13:13:24 +0100, Georg Brandl g.bra...@gmx.net wrote:
 Am 31.01.2010 13:03, schrieb Simon Cross:
  On Sun, Jan 31, 2010 at 1:54 PM, Hanno Schlichting ha...@hannosch.eu 
  wrote:
  I'd be a big +1 to using a single .pyr directory per source directory.
  
  I don't know whether I'm in favour of using a single pyr folder or not
  but if a single folder is used I'd definitely prefer the folder to be
  called __pyr__ rather than .pyr.
 
 And to come complete with standard library functions to find the corresponding
 .py for a .pyc and .pyc for a .py.

+1
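
For reference, Python 3 releases later than this thread provide such
helpers in importlib.util. A short sketch of how they behave on a
present-day interpreter; the path is a placeholder.

```python
import importlib.util
import os

source = os.path.abspath(os.path.join("package", "foo.py"))

# .py -> location of the byte code file for the running interpreter
cached = importlib.util.cache_from_source(source)

# byte code file -> the .py it was compiled from
roundtrip = importlib.util.source_from_cache(cached)

print(cached)
print(roundtrip)
```

Neither call touches the filesystem; both only compute path names.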

--
R. David Murray  www.bitdance.com
Business Process Automation - Network/Server Management - Routers/Firewalls


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread R. David Murray
On Sun, 31 Jan 2010 09:50:16 +0100, "Martin v. Löwis" mar...@v.loewis.de wrote:
  Linux distributions such as Ubuntu [2]_ and Debian [3]_ provide more
  than one Python version at the same time to their users.  For example,
  Ubuntu 9.10 Karmic Koala can install Python 2.5, 2.6, and 3.1, with
  Python 2.6 being the default.
 
  In order to ease the burden on operating system packagers for these
  distributions, the distribution packages do not contain Python version
  numbers [4]_; they are shared across all Python versions installed on
  the system.  Putting Python version numbers in the packages would be a
  maintenance nightmare, since all the packages - *and their
  dependencies* - would have to be updated every time a new Python
  release was added or removed from the distribution.  Because of the
  sheer number of packages available, this amount of work is infeasible.
  
  As a non-Debian user (I'm a Gentoo user), the above doesn't enlighten me,
  even after skimming the referenced document.  Perhaps an example would
  be helpful?
 
 I think the basic question is: how do you get stuff into
 /usr/lib/python2.6/site-packages/Pyrex?
 
 One option would be to have a Debian package python26-pyrex. Then you
 would also need a python25-pyrex package and a python27-pyrex package,
 all essentially containing the very same files (but installed into
 different directories).
 
 What they want is a single python-pyrex package that automatically works
  for all Python versions - even those that aren't yet installed (i.e.
 install python-pyrex first, and Python 2.7 later, and python-pyrex
 should be available).
 
 Having a single directory in sys.path for all Python versions currently
 doesn't work, as the pyc files for each version would conflict.
 
 The current solution consists (for package installation) of
 a) installing the files in a single place
 b) creating a directory hierarchy in each Python's site-packages
 c) symlinking all .py files into this directory hierarchy
 d) byte-compiling all .py files in the hierarchy
 For installation of new Python versions, they need to
 a) walk over the list of installed Python packages
 b) for each one, repeat steps b..d from above
 
 With the PEP in place, for pure-Python packages, they could
 a) have a system wide directory for pure-Python packages, and
 b) arrange that directory to appear on sys.path for all Python
    versions
 On package installation, they then could
 a) install the files in that system-wide directory
 b) for each Python version, run byte-code compilation of the
    new package
 On Python installation, they would
 a) byte-compile the entire directory.
 
 Alternatively, to support packages that don't work with all Python
 versions, they could continue to use symlinking, but restrict that
 onto the top directories of each package (i.e. not create a directory
 hierarchy in site-packages).
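
The "with the PEP in place" steps above can be sketched roughly as follows. This is only an illustration: the shared directory and the interpreter names are assumptions (they are not taken from Debian policy or from the PEP), and the commands are built and printed rather than run.

```python
# Hypothetical values: neither the directory nor the interpreter list
# comes from the thread or from any distribution's policy.
SHARED_DIR = "/usr/share/python-shared"
INTERPRETERS = ["python2.5", "python2.6", "python3.1"]


def compile_commands(path, interpreters=INTERPRETERS):
    """Return one 'pythonX.Y -m compileall PATH' command per interpreter."""
    return [[exe, "-m", "compileall", path] for exe in interpreters]


def on_package_install(package_dir):
    # Package installation, step b): byte-compile only the new package,
    # once for every installed Python version.
    return compile_commands(package_dir)


def on_python_install(new_interpreter):
    # Python installation: byte-compile the entire shared directory,
    # but only for the newly installed interpreter.
    return compile_commands(SHARED_DIR, [new_interpreter])


for cmd in on_python_install("python2.7"):
    print(" ".join(cmd))
```

The point of the sketch is that both hooks reduce to "run compileall over a directory", with no symlink farm to maintain.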

Excellent, thank you.  IMO this explanation should go in the PEP.

By the way, the part that caused me the most confusion in the language
in the PEP was the emphasized *and their dependencies*, as if a package
having dependencies somehow turned the problem into a factorial explosion.
But there seems to be nothing special, according to your explanation,
about dependencies in this scheme.

  (FYI, Gentoo just installs the pyc files into each of the installed
  Python's site-packages that is supported by the package in question...disk
  space is relatively cheap.)
 
 I suppose Gentoo also installs .py files into each site-packages?

Arg.  My fingers added the 'c' without my mind getting involved,
apparently.  I meant that the .py is installed directly in each
site-packages.

 How does it deal with a Python installation that happens after the
 package installation?

There's a tool that runs through all installed python packages and
does the install and byte compile (basically, reinstalls the package
for the new Python version).

Gentoo doesn't have the multiple-os-packages-per-Python-version problem,
since it installs from source.

It seems like it would be simple enough to enhance the os packaging
systems to allow the install path to be specified at install time, if
that really is the only difference between the package versions.  And a
script that runs through all the installed python packages and installs
them for a new Python version when a new version is installed should be
as easy for other distributions as it is for Gentoo.  That would also
mean that dependencies on the Python version would be handled by the
packaging system: it would refuse to install a given package if that
package didn't support the specified Python version.  Or is something
missing from my understanding?  If not, I think the motivation section
should address why the PEP is a better idea than improving the os
packaging systems as I've suggested.  (The os vendors are going to have
to change details of their packaging systems if the PEP is accepted,
so it's not as if the PEP saves the vendors work.)

Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Antoine Pitrou
On Sun, 31 Jan 2010 18:46:54 +1000, Nick Coghlan wrote:
 
 Actually, this is the first post I've seen noting objective problems
 with the use of a subdirectory. The others were just a subjective
 difference in perspective that saw subdirectory clutter as somehow being
 worse than file clutter.
 
 Specific examples of filesystems with different limits on file and
 subdirectory counts and network filesystems where opening a subdirectory
 can result in a significant speed impact would be very helpful.

I disagree. Since the proposed scheme is optional and disabled by 
default, this is only a problem for Debian and Ubuntu to tackle. It is 
none of our (CPython) business.

I expect nobody outside of a couple of Linux distros will use this 
feature anyway (which is why, regardless of the slight ugliness of the 
proposal, I am not against it).

Regards

Antoine.




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Antoine Pitrou
On Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin wrote:
 
 I have a couple bikesheddy or "why didn't you do this" comments. I'll be
 perfectly satisfied with an answer or a line in the PEP.
 
 1. Why the -R flag? It seems like this is a uniform improvement, so it
 should be the default. Have faith in your design! ;-)

-1 for making it a default. It is definitely ugly and useless for most 
cases. It is fine as long as it is optional and merely used by the Debian/
Ubuntu installers.

cheers

Antoine.



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Michael Crute
On Sun, Jan 31, 2010 at 5:00 AM, Martin v. Löwis mar...@v.loewis.de wrote:
 Agreed this should be discussed in the PEP, but one obvious problem is
 the speed impact. Picking up a file from a subdirectory is going to
 introduce less overhead than unpacking it from a zipfile.

 There is also the issue of race conditions with multiple simultaneous
 accesses. The original format for the PEP had race conditions for
 multiple simultaneous writers; ZIP will also have race conditions for
 concurrent readers/writers (as any new writer will have to overwrite
 the central directory, making the zip file temporarily unavailable -
 unless they copy it, in which case we are back to writer/writer
 races).

Since a pyc file is just a marshaled code object prefixed with a magic
number and mtime why not change the structure of the pyc (maybe
calling it a pyr file) to store multiple code objects in a marshaled
dictionary. For example:

{ 'MAGIC': (mtime, code_obj) }

This would eliminate the read-time race condition but still
potentially allow for a write-time race condition if locking isn't
used. The benefit of this approach is that it is no less clear than
pyc is today and doesn't result in n * versions_of_python pyc files.
There is still the overhead of unmarshaling the file to check for a
code object that matches your version. If unmarshaling the entire file
each time is problematic an on-disk format with a short TOC at the
beginning followed by the marshaled data would likely solve the issue.
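
A minimal sketch of the marshaled-dictionary idea, assuming the layout { magic: (mtime, code_obj) } described above. The helper names dump_pyr and load_code are made up for illustration, and the mtime value is arbitrary.

```python
import importlib.util
import marshal

MAGIC = importlib.util.MAGIC_NUMBER  # this interpreter's bytecode magic


def dump_pyr(entries):
    """Serialize {magic: (mtime, code_obj)} into a single bytes blob."""
    return marshal.dumps(entries)


def load_code(blob, magic=MAGIC):
    """Unmarshal the whole blob and return the code object for `magic`."""
    entries = marshal.loads(blob)
    entry = entries.get(magic)
    return None if entry is None else entry[1]


code = compile("answer = 6 * 7", "foo.py", "exec")
blob = dump_pyr({MAGIC: (1264924800, code)})  # arbitrary mtime

namespace = {}
exec(load_code(blob), namespace)
print(namespace["answer"])
```

Note that this only round-trips within one interpreter. The marshal format itself is version-specific, so a real file holding code objects from several Python versions would have to store each one as an opaque blob behind a table of contents, which is where the complexity comes from.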

-- 
Michael E. Crute
http://mike.crute.org

It is a mistake to think you can solve any major problem just with
potatoes. --Douglas Adams


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread MRAB

Vitor Bosshard wrote:

2010/1/31 Georg Brandl g.bra...@gmx.net:

foo.py
foo.pyr/
  cpython-25.pyc
  cpython-25U.pyc
  cpython-27.pyc
  cpython-27U.pyc
  cpython-32.pyc
  unladen-011.pyc
  wpython-11.pyc

+1.  It should be quite easy to assign a new name every time the magic
number is updated.


If we don't change the bytecode for a given Python version, then the
name of the bytecode format used wouldn't change either.

That would be the only remaining complaint for casual users. (Why doesn't
Python compile my file for 2.8?)


I think it's preferable to have a redundant copy of the compiled file
floating around rather than creating confusion as to which one each
python interpreter uses.

Optimizing disk space (and marginal compile time) is not worth the
mental overhead this would introduce. Better keep it as clear and
simple as possible, i.e. create different .pyc files even if the
bytecode doesn't change between releases.


Additionally, it has been mentioned that the magic number might change
with different builds of alpha releases, but that shouldn't matter
because the existing .pyc files don't show the magic number anyway!
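
As an aside on where the magic number actually lives: it is the first four bytes of the compiled file, not part of the file name. A small sketch using the current standard library (importlib.util.MAGIC_NUMBER exists in Python 3.4 and later); the temporary file names are arbitrary.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "foo.py")
    with open(source, "w") as f:
        f.write("x = 1\n")
    # cfile is given explicitly so the example does not depend on where
    # the interpreter would normally put the compiled file.
    compiled = py_compile.compile(source, cfile=os.path.join(tmp, "foo.pyc"))
    with open(compiled, "rb") as f:
        header_magic = f.read(4)

# The header of the freshly compiled file carries this interpreter's magic.
print(header_magic == importlib.util.MAGIC_NUMBER)
```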


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Reid Kleckner
On Sun, Jan 31, 2010 at 8:34 AM, Nick Coghlan ncogh...@gmail.com wrote:
 That still leaves the question of what to do with __file__ (for which
 even the solution in the PEP isn't particularly clean). Perhaps the
 thing to do there is to have __file__ always point to the source file
 and introduce a __file_cached__ that points to the bytecompiled file on
 disk (set to None if it doesn't exist, as may be the case for __main__
 or due to writing of bytecode files being disabled).

+1 for this, it seems to be what most people want anyway, given the
code that munges the .pyc back to the .py.  I bet this change would
break very little code.

Reid


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 By the way, the part that caused me the most confusion in the language
 in the PEP was the emphasized *and their dependencies*, as if a package
 having dependencies somehow turned the problem into a factorial explosion.
 But there seems to be nothing special, according to your explanation,
 about dependencies in this scheme.

For regular (forward) dependencies, there is indeed nothing special to
consider - they would have to exist in all versions. In practice, this
can be (and was) problematic: python-zope.sendmail depends on
python-pkg-resources, python-transaction, python-zope, and 10 other
things. Before you could start providing python27-zope.sendmail,
all of these dependencies would have to become available in a 2.7
version first, meaning that ten other Debian developers need to act
before you can. With the failure rate of Debian developers (who go
as often on holidays as any other volunteer), upgrading to a new Python
release could often take many months.

Now they have that new grand scheme involving tons of symbolic links;
actually, they have two of them (python-central and python-support).
While this is perfect in theory, it's not very robust (Barry can
probably better report all the failure modes). I guess (without knowing)
that this is really what triggered the PEP.

 It seems like it would be simple enough to enhance the os packaging
 systems to allow the install path to be specified at install time, if
 that really is the only difference between the package versions.  And a
 script that runs through all the installed python packages and installs
 them for a new Python version when a new version is installed should be
 as easy for other distributions as it is for Gentoo.

However, it's also unacceptable. I can't cite the exact piece of Debian
policy, but I'm fairly sure that build activities are not allowed at
installation time. So actually running setup.py files is out of
question. Users who want such a thing would have to switch to Gentoo;
Debian users just want it to work :-)

 (The os vendors are going to have
 to change details of their packaging systems if the PEP is accepted,
 so it's not as if the PEP saves the vendors work.)

Again, I'm a little bit unclear on the motivation, also. I think it
mostly is: after years of experimentation, we have run out of ideas
for how to solve all the related problems simultaneously without changing
Python, so let's look for options that do involve changing Python.

If you *really* want a list of all the simultaneous problems that
need to be solved, and an explanation of why each individual solution
has flaws, prepare for this conversation to take a few more weeks.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Hanno Schlichting
On Sun, Jan 31, 2010 at 6:04 AM, Jeffrey Yasskin jyass...@gmail.com wrote:
 1. Why the -R flag? It seems like this is a uniform improvement, so it
 should be the default. Have faith in your design! ;-)

+1 for a single strategy that is used in all cases. The current
solution could be phased out across multiple releases, but in the end
there should be a single approach and no flag. Otherwise some code and
tools will only support one of the approaches, especially if this is
seen as something only a minority of Linux distributions uses.

Hanno


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 This would eliminate the read-time race condition but still
 potentially allow for a write-time race condition if locking isn't
 used. The benefit of this approach is that it is no less clear than
 pyc is today and doesn't result in n * versions_of_python pyc files.
 There is still the overhead of unmarshaling the file to check for a
 code object that matches your version. If unmarshaling the entire file
 each time is problematic an on-disk format with a short TOC at the
 beginning followed by the marshaled data would likely solve the issue.

This was actually what the first draft proposed. Try specifying it in
full detail, and you'll find out that
a) it is *really* complicated, and
b) locking is really tricky to achieve.
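
For contrast, when every version gets its own file, a writer can avoid locking altogether by writing to a temporary file and renaming it into place. This is a swapped-in technique shown only to illustrate the difference, not something proposed in this thread; write_atomically is a hypothetical helper name.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` to `path` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)  # atomic on POSIX; last writer wins
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "foo.cpython-32.pyc")
    write_atomically(target, b"first")
    write_atomically(target, b"second")
    with open(target, "rb") as f:
        final = f.read()
    leftovers = [n for n in os.listdir(d) if n.endswith(".tmp")]

print(final, leftovers)
```

Two concurrent writers may both do the work, but a reader sees either the old complete file or the new complete file. A shared multi-version container cannot use this trick without reintroducing the writer/writer race, since each writer would replace the other's additions.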

Regards,
Martin




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread MRAB

Reid Kleckner wrote:

On Sun, Jan 31, 2010 at 8:34 AM, Nick Coghlan ncogh...@gmail.com wrote:

That still leaves the question of what to do with __file__ (for which
even the solution in the PEP isn't particularly clean). Perhaps the
thing to do there is to have __file__ always point to the source file
and introduce a __file_cached__ that points to the bytecompiled file on
disk (set to None if it doesn't exist, as may be the case for __main__
or due to writing of bytecode files being disabled).


+1 for this, it seems to be what most people want anyway, given the
code that munges the .pyc back to the .py.  I bet this change would
break very little code.


Isn't .pyc just an optimisation for performance reasons? If so, then the
user is more interested in the .py file.


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Raymond Hettinger

On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
 Abstract
 
 
 This PEP describes an extension to Python's import mechanism which
 improves sharing of Python source code files among multiple installed
 different versions of the Python interpreter.

+1 


  It does this by
 allowing many different byte compilation files (.pyc files) to be
 co-located with the Python source file (.py file).  

It would be nice if all the compilation files could be tucked
into one single zipfile per directory to reduce directory clutter.

It has several benefits besides tidiness. It hides the implementation
details of when magic numbers get shifted.  And it may allow faster
start-up times when the zipfile is in the disk cache.


Raymond



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Ron Adam


Since this is something new, I believe it is helpful to look at all the
possibilities so it doesn't become something we later regret.  Once it
gets put in place, it may be really hard to get rid of.  So here are a
few questions that I don't think have been asked yet.



What command line options will be available to alter how python uses .pyo, 
.pyc, and .pyr files?



What's the easiest way to remove all pyr dirs and files?


What's the easiest way to remove pyr dirs and files for one project?


Would having python command line argument(s) to remove bytecode make sense?
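
Whatever the command-line answer turns out to be, removal for one project is a short directory walk. A minimal sketch, assuming the cache directories are all called __pycache__ (the name is still under discussion in this thread, so it is a parameter); remove_cache_dirs is a hypothetical helper.

```python
import os
import shutil
import tempfile


def remove_cache_dirs(root, name="__pycache__"):
    """Delete every directory called `name` below `root`; return their paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if name in dirnames:
            target = os.path.join(dirpath, name)
            shutil.rmtree(target)
            dirnames.remove(name)  # do not descend into the deleted directory
            removed.append(target)
    return removed


with tempfile.TemporaryDirectory() as project:
    os.makedirs(os.path.join(project, "pkg", "__pycache__"))
    open(os.path.join(project, "pkg", "mod.py"), "w").close()
    removed = remove_cache_dirs(project)
    remaining = sorted(os.listdir(os.path.join(project, "pkg")))

print(len(removed), remaining)
```

Pointing root at a single project directory answers the per-project question; pointing it at a whole sys.path entry answers the "remove everything" one.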


Would it be possible to have one single python system __pycache__ directory 
and put all bytecode in it?


Instead of subdirectories in the pyr or __pycache__ directory, would it be 
possible to just use unique names and not have multiple subdirectories?


# some place on the user's disk
someplace/
  foobar/
    __init__.py
    foo.py
    foobar2/
      bar.py

# some place on the system disk
__pycache__/
  v027.xyz.foobar.__init__.pyc
  v027.xyz.foobar.foo.pyc
  v030.xyz.foobar.__init__.pyc
  v030.xyz.foobar.foobar2.bar.pyc

# where 'xyz' identifies a unique location to differentiate when there is
more than one copy of a program.  (They may not be exactly the same, but may
have the same file names.)


# I like the version id up front because it doesn't intermix different 
version files together.
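
The naming scheme can be sketched as a pair of small functions. Both names are hypothetical, and the sketch assumes that neither the version tag nor the location id contains a dot.

```python
def flat_cache_name(version_tag, location_id, module_name):
    """Build a flat cache file name such as 'v027.xyz.foobar.foo.pyc'."""
    return ".".join([version_tag, location_id, module_name]) + ".pyc"


def split_cache_name(filename):
    """Inverse of flat_cache_name; the module name keeps its own dots."""
    stem = filename[:-len(".pyc")]
    version_tag, location_id, module_name = stem.split(".", 2)
    return version_tag, location_id, module_name


print(flat_cache_name("v027", "xyz", "foobar.foo"))
print(split_cache_name("v030.xyz.foobar.foobar2.bar.pyc"))
```

Because the version tag comes first, a plain sorted directory listing groups all files for one Python version together.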



With a single cache directory, we could have an option to force writing 
bytecode to a desired location.  That might be useful on its own for 
creating runtime bytecode only installations for installers.



Ron



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Silke von Bargen

Martin v. Löwis wrote:

There is also the issue of race conditions with multiple simultaneous
accesses. The original format for the PEP had race conditions for
multiple simultaneous writers; ZIP will also have race conditions for
concurrent readers/writers (as any new writer will have to overwrite
the central directory, making the zip file temporarily unavailable -
unless they copy it, in which case we are back to writer/writer
races).

Regards,
Martin

  

Good point. OTOH the probability for this to happen actually is very small.
Henning


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Terry Reedy

On 1/31/2010 8:58 AM, Vitor Bosshard wrote:


Having one single pyr (or __pycache__ or whatever it's called)
subfolder per folder is an easy to understand, solid solution.


As a user who browses directories to see what is there and to find files 
to open and look at, I would like this. The near-duplicate .pyc listings 
are just noise that take up screen space. 'pycache' would be pretty clear.


A slew of directories containing, in general, one file each would be a 
waste of space and make it impossible to see at once what has been used 
and therefore compiled. This is the main reason I can think of for humans to 
see the .pyc names.


Terry Jan Reedy




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Guido van Rossum
Whoa. This thread already exploded. I'm picking this message to
respond to because it reflects my own view after reading the PEP.

On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting ha...@hannosch.eu wrote:
 On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
 hodgestar+python...@gmail.com wrote:
 I don't know whether I'm in favour of using a single pyr folder or not
 but if a single folder is used I'd definitely prefer the folder to be
 called __pyr__ rather than .pyr.

Exactly what I would prefer. I worry that having many small
directories is a fairly poor use of the filesystem. A quick scan of
/usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
only 57 directories.

 Do you have any specific reason for that?

 Using the leading dot notation is an established pattern to hide
 non-essential information from directory views. What makes this
 non-applicable in this situation and a custom Python notation better?

Because we don't want to completely hide the pyc files. Also the dot
naming convention is somewhat platform-specific.

FWIW in Python 3, the __file__ variable always points to the .py
source filename. I agreed with Georg that there ought to be an API for
finding the pyc file for a module. This could be a small addition to
the PEP.
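
An API of this kind did later appear in the standard library: importlib.util.cache_from_source() and importlib.util.source_from_cache() exist in Python 3.4 and later. A small sketch; the path is hypothetical and the file does not need to exist.

```python
import importlib.util
import os

source = os.path.join("srv", "app", "foo.py")  # hypothetical path

# Map the source file to the bytecode file this interpreter would use.
cached = importlib.util.cache_from_source(source)
print(cached)

# And back again: the mapping is reversible without guessing at suffixes.
print(importlib.util.source_from_cache(cached) == source)
```

The exact file name printed depends on the running interpreter, since it embeds that interpreter's cache tag.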

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Terry Reedy

On 1/31/2010 8:34 AM, Nick Coghlan wrote:

Georg Brandl wrote:

+1.  Having a single (visible) __pyr__ directory is much less clutter than
multiple .pyc files anyway.  Also, don't forget Windows users, for whom
the dot convention doesn't mean anything.



I must admit I quite like the __pyr__ directory approach as well. Since
the interpreter knows the suffix it is looking for, names shouldn't
conflict. Using a single directory allows the name to be less cryptic,
too (e.g. __pycache__).


Please spell it out. Possible future doc: "When CPython executes or 
imports a .py file with Python source code, it caches internal compiled 
versions in a '__pycache__' subdirectory for possible future use."


tjr




[Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Silke von Bargen



That still leaves the question of what to do with __file__ (for which
even the solution in the PEP isn't particularly clean). Perhaps the
thing to do there is to have __file__ always point to the source file
and introduce a __file_cached__ that points to the bytecompiled file on
disk (set to None if it doesn't exist, as may be the case for __main__
or due to writing of bytecode files being disabled).
And what if there isn't a source file, because I want to deploy the 
byte-code only?
This is possible now, but would be impossible if there was this kind of 
distinction.


That said, and understanding Martin von Löwis's objections against ZIP files,
I'm +1 for something like module.some-kind-of-version.pyc instead of 
subdirectories.


Henning




Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread MRAB

Silke von Bargen wrote:



That still leaves the question of what to do with __file__ (for which
even the solution in the PEP isn't particularly clean). Perhaps the
thing to do there is to have __file__ always point to the source file
and introduce a __file_cached__ that points to the bytecompiled file on
disk (set to None if it doesn't exist, as may be the case for __main__
or due to writing of bytecode files being disabled).
And what if there isn't a source file, because I want to deploy the 
byte-code only?
This is possible now, but would be impossible if there was this kind of 
distinction.



You could argue that you put the .pyc file in the __pyr__ directory if
the .py file exists (the .pyc file is there for performance reasons). If
there's no .py file then the .pyc file could be put in the main
directory.

That means that you can 'clean' your directory hierarchy by deleting the
__pyr__ directories (they'll be regenerated on demand) and leave any
other .pyc files (no source .py file) intact.
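
The rule suggested here can be written down as a tiny lookup function. This is only a sketch of the suggestion: the __pyr__ name comes from the thread, while the "module.tag.pyc" file naming inside it and the bytecode_location helper are assumptions made for illustration.

```python
import os
import tempfile


def bytecode_location(directory, module, tag="cpython-27"):
    """Return where bytecode for `module` would live under the rule above."""
    if os.path.exists(os.path.join(directory, module + ".py")):
        # Source present: the .pyc is only a cache and goes in __pyr__.
        return os.path.join(directory, "__pyr__", module + "." + tag + ".pyc")
    # No source: a bytecode-only deployment keeps module.pyc in the
    # main directory, where it survives deleting __pyr__.
    return os.path.join(directory, module + ".pyc")


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "foo.py"), "w").close()
    with_source = os.path.relpath(bytecode_location(d, "foo"), d)
    without_source = os.path.relpath(bytecode_location(d, "bar"), d)

print(with_source)
print(without_source)
```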


That said, and understanding Martin von Löwis's objections against ZIP files,
I'm +1 for something like module.some-kind-of-version.pyc instead of 
subdirectories.






Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 foo.pyr/
   cpython-25.pyc
   cpython-25U.pyc
   cpython-27.pyc
   cpython-27U.pyc
   cpython-32.pyc
   unladen-011.pyc
   wpython-11.pyc
 
 +1.  It should be quite easy to assign a new name every time the magic
 number is updated.

It would actually be possible to drop the magic numbers entirely.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 Exactly. How would you define where the pyr folder goes? At the root
 of a package? What if I delete the __init__.py file there? Will the
 existing pyr folder be orphaned and a new one created in each
 subfolder? Unlike VCS working copies, the package / module / script
 hierarchy is not formally defined in python.

The module name could guide the location. If you are importing
xml.dom.minidom, it could put the pyc file into a sibling of the pyc
folder for xml (under the name xml.dom.minidom.label).

If you then remove __init__, you are no longer able to import xml.dom,
but you might import dom.minidom (assuming you put the xml folder into
sys.path). Then, a new pyc file would be created in the pyc folder for
the dom package.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 Not really -- much of the code I've seen that tries to guess the source
 file name from a __file__ value just does something like this:
 
if fname.lower().endswith(('.pyc', '.pyo')): fname = fname[:-1]
 
 That's not compatible with using .pyr, either.
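
To make the incompatibility concrete, here is the quoted idiom wrapped in a function (guess_source is a made-up name) and applied to both the current layout and the foo.pyr/ layout from earlier in the thread.

```python
def guess_source(fname):
    """The common idiom: strip the trailing 'c' or 'o' from .pyc/.pyo."""
    if fname.lower().endswith((".pyc", ".pyo")):
        fname = fname[:-1]
    return fname


# Current layout: the guess is right.
print(guess_source("pkg/foo.pyc"))

# Proposed layout: the guess points at a file that never existed.
print(guess_source("pkg/foo.pyr/cpython-27.pyc"))
```

The second call yields pkg/foo.pyr/cpython-27.py rather than pkg/foo.py, so any code relying on the idiom would silently break.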

If a single pyc folder is used, I think an additional __source__
attribute would be needed to indicate what source file time stamp had
been checked (if any) to determine that the byte code file is current.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Tim Delaney
On 1 February 2010 00:34, Nick Coghlan ncogh...@gmail.com wrote:


 __file__ would always point to the source files
 __file_cached__ would always point to the relevant compiled file (either
 pre-existing or newly created)



I like this solution combined with having a single cache directory and a few
other things I've added below.

The pyc/pyo files are just an optimisation detail, and are essentially
temporary. Given that, if they were to live in a single directory, to me it
seems obvious that the default location for that should be in the system
temporary directory. I can immediately think of the following advantages:

1. No one really complains too much about putting things in /tmp unless it
starts taking up too much space. In which case they delete it and if it gets
reused, it gets recreated.

2. /tmp is often on volatile memory. If it is (e.g. my Windows system
temp dir is on a RAMdisk) then it seems wise to respect the obvious desire
to throw away temporary files on shutdown.

3. It removes the need for people in general to even think about the
existence of pyc/pyo files. They could then be relegated to even more of an
implementation detail (probably while explaining the command-line options).

4. No need (in fact undesirable) to make it a hidden directory.

If you wanted to package up the pyc/pyo files, I've got an idea that
combines well with executing a zip file containing __main__.py (see other
thread)

1. Delete /tmp/__pycache__.
2. Compile all your source files with the versions you want to support (so
long as they support this mechanism).
3. Add a __main__.py which sets the cache directory to the directory (zip
file) that __main__.py is in. __main__.py (as the initial script) doesn't
use the cache.
4. Zip up the contents of /tmp/__pycache__.

Note that for this to work properly it would either require an __init__.py
to be automatically created in the __pycache__ module subdirectory, or have
the subdirectory be named as a .pyr to indicate it's a cached module (and
thus should be importable).

/tmp/__pycache__
    __main__.py
    foo.pyr/
        foo.py32.pyc
        foo.py33.pyc
Tim Delaney


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Neil Hodgson
Tim Delaney:

 I like this solution combined with having a single cache directory and a few
 other things I've added below.
 ...
 2. /tmp is often on volatile memory. If it is (e.g. my Windows system
 temp dir is on a RAMdisk) then it seems wise to respect the obvious desire
 to throw away temporary files on shutdown.

   This may create security vulnerabilities. I could, for example,
insert a manipulated .pyc that logs passwords when other users run it.

   I can also see advantages to allowing out of tree compiled cache
directories. For example, you could have a locked down .py tree with
.pycs going into per-user trees. This prevents another user from
spoofing a .pyc I use as well as allowing users to install arbitrary
versions of Python without getting an admin to compile the .py tree
with the new compiler.
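
For reference, Python 3.8 and later provide a setting along these lines: sys.pycache_prefix (also settable through the PYTHONPYCACHEPREFIX environment variable) redirects bytecode into a separate tree. A small sketch of the resulting path mapping; both paths are hypothetical and nothing is written to disk.

```python
import importlib.util
import os
import sys

# Hypothetical locations: a locked-down source tree and a per-user cache.
source = os.path.join(os.sep, "srv", "app", "foo.py")
user_cache = os.path.join(os.sep, "home", "alice", ".pycache")

previous = sys.pycache_prefix
sys.pycache_prefix = user_cache
try:
    # With a prefix set, the cache path mirrors the source tree under it.
    cached = importlib.util.cache_from_source(source)
finally:
    sys.pycache_prefix = previous  # leave the interpreter as we found it

print(cached)
```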

   Neil


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
I can also see advantages to allowing out of tree compiled cache
 directories. For example, you could have a locked down .py tree with
 .pycs going into per-user trees. This prevents another user from
 spoofing a .pyc I use as well as allowing users to install arbitrary
 versions of Python without getting an admin to compile the .py tree
 with the new compiler.

This is PEP 304, which has been withdrawn by its author. While there
is some relationship with PEP 3147, the two address orthogonal issues.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Scott Dial
On 1/31/2010 2:04 PM, Raymond Hettinger wrote:
 On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
 It does this by
 allowing many different byte compilation files (.pyc files) to be
 co-located with the Python source file (.py file).  
 
 It would be nice if all the compilation files could be tucked
 into one single zipfile per directory to reduce directory clutter.
 
 It has several benefits besides tidiness. It hides the implementation
 details of when magic numbers get shifted.  And it may allow faster
 start-up times when the zipfile is in the disk cache.
 

On a whim, I implemented a PEP 302 loader that cached any imports that
it could find in sys.path into a zip file.

I used running bzr as a startup benchmark, and I did my best to ensure
an empty cache by running "sync; echo 3 > /proc/sys/vm/drop_caches; time
bzr". On my particular machine, the real time was at minimum 3.5
seconds without using my ZipFileCacheLoader. With the loader, I found
the same was true. The average performance was all over the place (due
to everything else in the operating system trying to fetch from the disk),
and I lack enough data points to reach statistical significance.

However, if the .pyr zip file is going to contain many versions of the
same module, then the performance impact could be more real, since you
would be forced to pull from disk *all* of the versions of a given module.

-- 
Scott Dial
sc...@scottdial.com
scod...@cs.indiana.edu
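
A rough sketch of the kind of zip-backed bytecode cache discussed above. This is not Scott's ZipFileCacheLoader, whose code does not appear in this thread; the names build_cache and load_from_cache and the ".bin" member suffix are hypothetical. It stores marshalled code objects in an uncompressed archive and reads back a single member by name.

```python
import io
import marshal
import zipfile


def build_cache(sources):
    """Compile each source string and store it in an uncompressed zip.

    ``sources`` maps a module name to its source text. The result is the
    raw bytes of the archive (kept in memory for this sketch).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, text in sources.items():
            code = compile(text, name + ".py", "exec")
            # marshal output is specific to the running interpreter version.
            zf.writestr(name + ".bin", marshal.dumps(code))
    return buf.getvalue()


def load_from_cache(cache_bytes, name):
    """Read one member from the archive and execute it in a fresh namespace."""
    with zipfile.ZipFile(io.BytesIO(cache_bytes)) as zf:
        code = marshal.loads(zf.read(name + ".bin"))
    namespace = {}
    exec(code, namespace)
    return namespace


cache = build_cache({"greet": "def hello():\n    return 'hi'\n"})
greet = load_from_cache(cache, "greet")
```

A real loader would also need to check source modification times and key entries by interpreter version, which this sketch leaves out.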


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Curt Hagenlocher
On Sun, Jan 31, 2010 at 11:16 AM, Terry Reedy tjre...@udel.edu wrote:


 'pycache' would be pretty clear.

Heh -- without the underscores, I read this as "pyc ache". Seems
appropriate.

--
Curt Hagenlocher
c...@hagenlocher.org


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Terry Reedy

On 1/31/2010 4:26 PM, Tim Delaney wrote:
 The pyc/pyo files are just an optimisation detail, and are essentially
 temporary.

The .pycs for /Lib and similar are *not* temporary in the sense you are
using. They are effectively permanent for as long as the version is
installed. They should *not* be routinely trashed as they are not
obsolete and nearly always will be reused.


Terry Jan Reedy



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread R. David Murray
On Sun, 31 Jan 2010 19:48:19 +0100, "Martin v. Löwis" mar...@v.loewis.de wrote:
  By the way, the part that caused me the most confusion in the language
  in the PEP was the emphasized *and their dependencies*, as if a package
  having dependencies somehow turned the problem into a factorial explosion.
  But there seems to be nothing special, according to your explanation,
  about dependencies in this scheme.
 
 For regular (forward) dependencies, there is indeed nothing special to
 consider - they would have to exist in all versions. In practice, this
 can be (and was) problematic: python-zope.sendmail depends on
 python-pkg-resources, python-transaction, python-zope, and 10 other
 things. Before you could start providing python27-zope.sendmail,
 all of these dependencies would have to become available in a 2.7
 version first, meaning that ten other Debian developers need to act
 before you can. With the failure rate of Debian developers (who go
 as often on holidays as any other volunteer), upgrading to a new Python
 release could often take many months.

OK, that makes it clearer.  It's an internal (and probably unavoidable)
Debian social problem, not a technical one, and I see why it is an
important issue.

  It seems like it would be simple enough to enhance the os packaging
  systems to allow the install path to be specified at install time, if
  that really is the only difference between the package versions.  And a
  script that runs through all the installed python packages and installs
  them for a new Python version when a new version is installed should be
  as easy for other distributions as it is for Gentoo.
 
 However, it's also unacceptable. I can't cite the exact piece of Debian
 policy, but I'm fairly sure that build activities are not allowed at
 installation time. So actually running setup.py files is out of
 question. Users who want such a thing would have to switch to Gentoo;
 Debian users just want it to work :-)

I'm less sympathetic to problems created by rigid policies, but that
doesn't mean I'm not sympathetic :)

But I don't understand how this answers the question.  If the
python26-zope.sendmail package doesn't run setup.py, then a
python-zope.sendmail package where you specify at install time which
directory to install the files to isn't going to run setup.py, either.
If the only difference between a packaged python27-zope.sendmail and a
packaged python26-zope.sendmail is the directory to which the files get
written, why can't that be controlled at install time?  Writing files
to a directory must be an install activity, not a build activity.  If the
issue is that *deciding* what directory to install to is a build time
activity...well, maybe I would be less sympathetic to a policy that is
*that* rigid.

  (The os vendors are going to have
  to change details of their packaging systems if the PEP is accepted,
  so it's not as if the PEP saves the vendors work.)
 
 Again, I'm a little bit unclear on the motivation, also. I think it
 mostly is "after years of experimentation, we have run out of ideas
 how to solve all related problems simultaneously without changing
 Python, so let's look for options that do involve changing Python".
 
 If you *really* want a list of all the simultaneous problems that
 need to be solved, and an explanation of why each individual solution
 has flaws, prepare for this conversation to take a few more weeks.

Well, I certainly don't want the conversation to take a few more months.
I'm not against the PEP, I'm making my comments and asking my questions
in the spirit of making it a high quality PEP.  If the motivation is
"the Debian devs have concluded, after years of experimentation...",
then I suppose that's what should go in the motivation section.

--
R. David Murray  www.bitdance.com
Business Process Automation - Network/Server Management - Routers/Firewalls


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Antoine Pitrou wrote:
 Le Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin a écrit :
 I have a couple of bikesheddy or "why didn't you do this" comments. I'll be
 perfectly satisfied with an answer or a line in the pep.

 1. Why the -R flag? It seems like this is a uniform improvement, so it
 should be the default. Have faith in your design! ;-)
 
 -1 for making it a default. It is definitely ugly and useless for most 
 cases. It is fine as long as it is optional and merely used by the Debian/
 Ubuntu installers.

Would you still be a -1 on making the new scheme the default if it
used a single cache directory instead? That would actually be cleaner
than the current solution rather than messier.

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


[Python-Dev] subprocess docs patch

2010-01-31 Thread Chris Rebert
Hello mighty Python developers,

I was wondering if someone could take a gander at, and hopefully act
upon, a patch I submitted a while ago for the subprocess module's
docs.
It's been languishing in the bug tracker:

http://bugs.python.org/issue6760

Any help you could provide would be appreciated.

Cheers,
Chris
--
If life seems jolly rotten. There's something you've forgotten.
And that's to laugh and smile and dance and sing.


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Martin v. Löwis wrote:
 Exactly. How would you define where the pyr folder goes? At the root
 of a package? What if I delete the __init__.py file there? Will the
 existing pyr folder be orphaned and a new one created in each
 subfolder? Unlike VCS working copies, the package / module / script
 hierarchy is not formally defined in python.
 
 The module name could guide the location. If you are importing
 xml.dom.minidom, it could put the pyc file into a sibling of the pyc
 folder for xml (under the name xml.dom.minidom.label).
 
 If you then remove __init__, you are no longer able to import xml.dom,
 but you might import dom.minidom (assuming you put the xml folder into
 sys.path). Then, a new pyc file would be created in the pyc folder for
 the dom package.

I see three possible logical locations for the Python cache directories:

1. In each directory containing Python source files.
  Major Pro: easy to keep source files associated with their cached versions
  Major Con: proliferation of cache directories

2. In each top level directory on sys.path, flat file structure
  Major Pro: trivial to separate out all cached files
  Major Con: for path locations like the top of the standard lib, the
cache directory would get a *lot* of entries

3. In each top level directory on sys.path, shadow file hierarchy
  Major Pro: trivial to separate out all cached files
  Major Con: ??? (I got nuthin')

I didn't list a single global cache directory as a viable option as it
would create some nasty naming conflicts due to runs with different
sys.path entries and would make it impossible to create zipfiles with
precached bytecode files.

Note that with option two, creating a bytecode only zipfile would be
trivial: just add the __pycache__ directory as the top-level directory
in the zipfile and leave out everything else (assuming there were no data
files in the package that were still needed).

Packages would still be identifiable by the existence of the cached pyc
file for their __init__ modules.

Going back to my previous example (with one extra source file to show
how a top-level module would be handled), scheme 2 would give:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package.__init__.cpython-27.pyc
  package.__init__.cpython-27.pyo
  package.foo.cpython-27.pyc
  package.foo.cpython-27.pyo
  package.subpackage.__init__.cpython-27.pyc
  package.subpackage.__init__.cpython-27.pyo
  package.subpackage.bar.cpython-27.pyc
  package.subpackage.bar.cpython-27.pyo

While scheme 3 would look like:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
    subpackage/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo

For comparison, here is what it would look like under scheme 1:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo
  __pycache__/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo

And the initial version proposed in the PEP:

module.py
module.pyr/
  cpython-27.pyc
  cpython-27.pyo
package/
  __init__.py
  __init__.pyr/
    cpython-27.pyc
    cpython-27.pyo
  foo.py
  foo.pyr/
    cpython-27.pyc
    cpython-27.pyo
  subpackage/
    __init__.py
    __init__.pyr/
      cpython-27.pyc
      cpython-27.pyo
    bar.py
    bar.pyr/
      cpython-27.pyc
      cpython-27.pyo

My major concern with scheme 2 is the possibility of directory size
limits affecting the caching of files, but scheme 3 looks pretty good to
me (with the higher level cache linked to the directory that is actually
on sys.path, the cache locations aren't as arbitrary as I originally
feared).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---
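
The three __pycache__ layouts listed above can also be expressed as a small path computation. This is only a sketch: the function cache_path and its parameters are invented for illustration, and paths are relative to the sys.path entry that contains the module.

```python
import posixpath

TAG = "cpython-27"  # interpreter tag used in the listings above


def cache_path(scheme, module_name, is_package=False, tag=TAG):
    """Return the .pyc path (relative to the sys.path entry) for a module.

    scheme 1: a __pycache__ directory next to each source file
    scheme 2: one flat __pycache__ per sys.path entry
    scheme 3: one __pycache__ per sys.path entry, mirroring the package tree
    """
    parts = module_name.split(".")
    if is_package:
        parts.append("__init__")
    *dirs, leaf = parts
    filename = "%s.%s.pyc" % (leaf, tag)
    if scheme == 1:
        return posixpath.join(*dirs, "__pycache__", filename)
    if scheme == 2:
        return posixpath.join("__pycache__",
                              "%s.%s.pyc" % (".".join(parts), tag))
    if scheme == 3:
        return posixpath.join("__pycache__", *dirs, filename)
    raise ValueError("unknown scheme: %r" % (scheme,))


for scheme in (1, 2, 3):
    print(scheme, cache_path(scheme, "package.subpackage.bar"))
```

Under scheme 2 every cached file for a sys.path entry lands in the same directory, which is where the directory-size concern comes from; schemes 1 and 3 spread the files across subdirectories.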


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Nick Coghlan
Silke von Bargen wrote:
 
 That still leaves the question of what to do with __file__ (for which
 even the solution in the PEP isn't particularly clean). Perhaps the
 thing to do there is to have __file__ always point to the source file
 and introduce a __file_cached__ that points to the bytecompiled file on
 disk (set to None if it doesn't exist, as may be the case for __main__
 or due to writing of bytecode files being disabled).
 And what if there isn't a source file, because I want to deploy the
 byte-code only?
 This is possible now, but would be impossible if there was this kind of
 distinction.

For a bytecode only deployment, __file__ would point to where the source
file *would* be if it was there while __file_cached__ would point to the
precompiled byte code.

Yes, this would be backwards incompatible for some uses of execfile in
conjunction with __file__ but those should be much rarer than uses of
__file__ to locate source code (which break with bytecode only
deployment anyway) and to find colocated resource files (which only care
about the path to the file and not the filename itself).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---
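
A small sketch of the attribute split proposed above, assuming the hypothetical name __file_cached__ from the message. The helpers module_attrs and resource_path are invented for illustration; they show why code that only needs the directory of __file__ keeps working even when the source file is absent.

```python
import posixpath


def module_attrs(source_path, cached_path=None):
    """Build the two attributes as proposed: __file__ always names the
    source location, __file_cached__ names the bytecode file or is None."""
    return {"__file__": source_path, "__file_cached__": cached_path}


def resource_path(module_globals, resource_name):
    """Locate a data file stored next to the module. Only the directory
    part of __file__ matters, not the filename or its extension."""
    return posixpath.join(posixpath.dirname(module_globals["__file__"]),
                          resource_name)


# Bytecode-only deployment: the .py file does not exist on disk, but
# __file__ still points to where it would be.
attrs = module_attrs("/app/pkg/mod.py",
                     "/app/__pycache__/pkg/mod.cpython-27.pyc")
data_file = resource_path(attrs, "data.txt")

# A module with no bytecode on disk (for example __main__).
main_attrs = module_attrs("/app/script.py")
```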


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Martin v. Löwis
 But I don't understand how this answers the question.  If the
 python26-zope.sendmail package doesn't run setup.py, then a
 python-zope.sendmail package where you specify at install time which
 directory to install the files to isn't going to run setup.py, either.
 If the only difference between a packaged python27-zope.sendmail and a
 packaged python26-zope.sendmail is the directory to which the files get
 written, why can't that be controlled at install time?

It certainly would be possible to copy the files into each Python's
site-packages. They have a system that does that in place, except that
it doesn't copy the files, but symlinks them.

 Well, I certainly don't want the conversation to take a few more months.
 I'm not against the PEP, I'm making my comments and asking my questions
 in the spirit of making it a high quality PEP.  If the motivation is
 "the Debian devs have concluded, after years of experimentation...",
 then I suppose that's what should go in the motivation section.

I guess Barry will have to explain what the problem with the current
scheme is.

Regards,
Martin
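
The symlink approach mentioned above could be sketched roughly as follows. This is not the actual Debian tooling; the function link_into_versions and the directory names are made up for the example, and it assumes a platform where os.symlink is available.

```python
import os
import tempfile


def link_into_versions(shared_dir, version_dirs):
    """Symlink every file in shared_dir into each per-version directory,
    so a single copy of the source serves several Python versions."""
    linked = []
    for version_dir in version_dirs:
        os.makedirs(version_dir, exist_ok=True)
        for name in sorted(os.listdir(shared_dir)):
            target = os.path.join(version_dir, name)
            if not os.path.lexists(target):
                os.symlink(os.path.join(shared_dir, name), target)
            linked.append(target)
    return linked


# Demonstration in a throwaway directory tree.
root = tempfile.mkdtemp()
shared = os.path.join(root, "shared")
os.makedirs(shared)
with open(os.path.join(shared, "mod.py"), "w") as f:
    f.write("VALUE = 1\n")

links = link_into_versions(shared, [
    os.path.join(root, "python2.6", "site-packages"),
    os.path.join(root, "python2.7", "site-packages"),
])
```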


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-01-31 Thread Ben Finney
Nick Coghlan ncogh...@gmail.com writes:

 Would you still be a -1 on making the new scheme the default if it
 used a single cache directory instead? That would actually be cleaner
 than the current solution rather than messier.

+0 on a default of “store compiled bytecode files in a single cache
directory”. It is indeed cleaner than the current default.

I'm only +0 because I don't know whether that actually addresses the use
case that raised the issue to begin with, so I'm postponing judgement
until those who want this change in the first place chime in.

-- 
 \ “Once consumers can no longer get free music, they will have to |
  `\buy the music in the formats we choose to put out.” —Steve |
_o__)  Heckler, VP of Sony Music, 2001 |
Ben Finney
