Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Barry Warsaw
On Apr 15, 2010, at 08:01 PM, Guido van Rossum wrote:

Comments inline. Nothing showstopping, mostly just spewing obscure
background information...

Overall, congratulations! I'm fine with the implementation going in
and the PEP being marked as accepted as long as you get to the
clarifications I suggest below soon after.

Awesome, thanks Guido!  I will respond in detail and address your
clarifications before I commit to the py3k branch.  I wanted to address one
thing now though since Steve responded to it.

 Implementation strategy
 =======================

 This feature is targeted for Python 3.2, solving the problem for those
 and all future versions.  It may be back-ported to Python 2.7.

Is there time given that 2.7b1 was released?

I think this would be totally up to Benjamin as the RM for 2.7.  Although I
haven't tried yet, my sense of it is that most of the patch would port pretty
easily to trunk.  I could probably generate a patch for review by mid-next
week.

Whether it should or not is a different matter.  Given that we're in beta, I'm
not sure *I* would approve it if I were the RM, but as the developer, sure,
I'd love to back port it. :)

However...

 Vendors are free to backport the changes to earlier distributions as
 they see fit.

...Steve asks if we're really going to do this.  For Debian and/or Ubuntu, we
haven't yet decided.  I plan to begin that discussion on the appropriate
distro-related mailing lists after the code lands on py3k.  It certainly won't
be enabled by default.

I don't think we've made any decisions about which versions of Python will
make it into the next release of Ubuntu about 6 months from now, but given the
Python release schedule, that could be 2.6, 2.7 and 3.1.  If we really do
include all three versions, I will push for backporting the feature (enabled
with -Xcachedir) in our releases so that we can gain the benefit of ditching
the symlink farms as soon as possible.

-Barry


signature.asc
Description: PGP signature
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Terry Reedy

On 4/15/2010 11:01 PM, Guido van Rossum wrote:

pyc files inside of the `__pycache__` directories contain a magic
identifier in their file names.  These are mnemonic tags for the
actual magic numbers used by the importer.  For example, in Python
3.2, we could use the hexlified [10]_ magic number as a unique


(Aside: when you search Wikipedia for hexlify it says "did you mean:
heavily?" :-)


I regard 'hexlify', as used in binascii, to be a misspelling of 
'hexify', whether it originated with Python or elsewhere. 'Hexify' 
itself may not be an official dictionary word but it at least follows a 
normal pattern of derivation. I have not bothered to say anything before 
because correcting the misspelling of the module function would break 
back compatibility. But I think its usage should stop there.


Terry Jan Reedy



Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Guido van Rossum
On Fri, Apr 16, 2010 at 11:09 AM, Terry Reedy tjre...@udel.edu wrote:
 On 4/15/2010 11:01 PM, Guido van Rossum wrote:

 pyc files inside of the `__pycache__` directories contain a magic
 identifier in their file names.  These are mnemonic tags for the
 actual magic numbers used by the importer.  For example, in Python
 3.2, we could use the hexlified [10]_ magic number as a unique

 (Aside: when you search Wikipedia for hexlify it says "did you mean:
 heavily?" :-)

 I regard 'hexlify', as used in binascii, to be a misspelling of 'hexify',
 whether it originated with Python or elsewhere. 'Hexify' itself may not be
 an official dictionary word but it at least follows a normal pattern of
 derivation. I have not bothered to say anything before because correcting
 the misspelling of the module function would break back compatibility. But I
 think its usage should stop there.

To the contrary, it was invented by Barry and ought to be added to the
English language as a neologism.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Benjamin Peterson
2010/4/16 Barry Warsaw ba...@python.org:
 On Apr 15, 2010, at 08:01 PM, Guido van Rossum wrote:
 This feature is targeted for Python 3.2, solving the problem for those
 and all future versions.  It may be back-ported to Python 2.7.

Is there time given that 2.7b1 was released?

 I think this would be totally up to Benjamin as the RM for 2.7.  Although I
 haven't tried yet, my sense of it is that most of the patch would port pretty
 easily to trunk.  I could probably generate a patch for review by mid-next
 week.

I would prefer that this not be in 2.7. The patch may be simple to
port, but it represents quite a change in an eons-old Python behavior.


 Whether it should or not is a different matter.  Given that we're in beta, I'm
 not sure *I* would approve it if I were the RM, but as the developer, sure,
 I'd love to back port it. :)

Thank you for sympathizing. :)



-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Barry Warsaw
On Apr 15, 2010, at 08:01 PM, Guido van Rossum wrote:

 Byte code files contain two 32-bit numbers followed by the marshaled

big-endian

Done.

 [2]_ code object.  The 32-bit numbers represent a magic number and a
 timestamp.  The magic number changes whenever Python changes the byte
 code format, e.g. by adding new byte codes to its virtual machine.
 This ensures that pyc files built for previous versions of the VM
 won't cause problems.  The timestamp is used to make sure that the pyc
 file is not older than the py file that was used to create it.  When

"is not older than" -> "matches"

(Obscure fact: the timestamp in the pyc file must match the source's
mtime exactly.)

Done.
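
For illustration only, the magic-number part of the header discussed above
can be inspected with a short sketch. It assumes a current CPython, where
`importlib.util.MAGIC_NUMBER` holds the interpreter's four magic bytes; the
helper names are hypothetical, and the sketch deliberately reads only the
first four bytes, because the rest of the header layout has changed since
this thread was written.

```python
import binascii
import importlib.util
import os
import py_compile
import tempfile


def read_pyc_magic(pyc_path):
    """Return the first four bytes of a pyc file (its magic number)."""
    with open(pyc_path, "rb") as f:
        return f.read(4)


def compile_and_read_magic():
    """Compile a throwaway module and return the magic bytes of its pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "foo.py")
        with open(source, "w") as f:
            f.write("x = 1\n")
        # Write the pyc to an explicit path so the sketch does not depend
        # on where the interpreter would normally cache it.
        pyc = os.path.join(tmp, "foo.pyc")
        py_compile.compile(source, cfile=pyc, doraise=True)
        return read_pyc_magic(pyc)


magic = compile_and_read_magic()
# The "hexlified" form of the magic number, as mentioned in the PEP text.
print(binascii.hexlify(magic).decode("ascii"))
# A pyc written by this interpreter carries this interpreter's magic number.
print(magic == importlib.util.MAGIC_NUMBER)
```

A pyc whose first four bytes differ from the running interpreter's magic
number is simply ignored and the source is recompiled, which is the fallback
behaviour described in this thread.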

 Rationale
 =========

 Linux distributions such as Ubuntu [4]_ and Debian [5]_ provide more
 than one Python version at the same time to their users.  For example,
 Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
 with Python 2.6 being the default.

 This causes a conflict for Python source files installed by the
 system (including third party packages), because you cannot compile a

I'd say only 3rd party packages, right? (And code written by the distro,
which from Python's POV is also 3rd party.) At least ought to clarify
that the stdlib is unaffected by this conflict, because multiple
versions of the stdlib *are* installed.

Yes, good point.  Clarified.

 single Python source file for more than one Python version at a time.
 Thus if your system wanted to install a `/usr/share/python/foo.py`, it
 could not create a `/usr/share/python/foo.pyc` file usable across all
 installed Python versions.

Note that (due to the magic#) Python doesn't crash, it just falls back
on the slower approach of compiling from source.

Perhaps more important is that different Python versions (if the user
has write permission) will fight over the pyc file and rewrite it each
time the source is compiled. Worse, even though the magic# is
initially written as zero and then rewritten with the correct value,
concurrent processes running different Python versions can actually
end up reading corrupt bytecode. (Alex Martelli diagnosed this at
Google years ago.)

Good point; I've made this more clear.

 Furthermore, in order to ease the burden on operating system packagers
 for these distributions, the distribution packages do not contain
 Python version numbers [6]_; they are shared across all Python
 versions installed on the system.  Putting Python version numbers in
 the packages would be a maintenance nightmare, since all the packages
 - *and their dependencies* - would have to be updated every time a new
 Python release was added or removed from the distribution.  Because of
 the sheer number of packages available, this amount of work is
 infeasible.

 C extensions can be source compatible across multiple versions of
 Python.  Compiled extension modules are usually not compatible though,

Actually we typically make every effort to support backwards
compatibility for compiled modules, and the module initialization API
contains a version# check. This is a different version# than the
import magic# and historically has changed much less frequently.

I've rewritten this paragraph a bit.  It's not particularly relevant to this
PEP. (I'll be looking at PEP 384 soon.)

 and PEP 384 [7]_ has been proposed to address this by defining a
 stable ABI for extension modules.


 Proposal
 ========

 Python's import machinery is extended to write and search for byte
 code cache files in a single directory inside every Python package
 directory.  This directory will be called `__pycache__`.
 Further, pyc files will contain a magic string that differentiates the

Clarify that the magic string is in the filename, not in the file contents.

Yep.

 Python version they were compiled for.  This allows multiple byte
 compiled cache files to co-exist for a single Python source file.

 This scheme has the added benefit of reducing the clutter in a Python
 package directory.

 When a Python source file is imported for the first time, a
 `__pycache__` directory will be created in the package directory, if

Is this still true? ISTR there was a lot of discussion about the
auto-creation and possible security concerns.

It is still true.  I think we determined it will usually not be an issue
because the umask will not be altered, and because normal installation
procedures typically involve byte compilation (and thus __pycache__ creation)
during installation time via tools like compileall.  This really is describing
what happens when you run Python over pure Python source code for the first
time, and it's no different from what happens now with the automatic creation
of pyc files.
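
As a rough sketch of the install-time byte compilation mentioned above, the
following uses `compileall.compile_dir()` on a temporary directory and then
checks that the cache file exists. It assumes a current CPython;
`importlib.util.cache_from_source()` is the modern way to ask where the cache
file lives, and the function name `precompile_tree` is hypothetical.

```python
import compileall
import importlib.util
import os
import tempfile


def precompile_tree():
    """Byte-compile a one-module tree and report where the pyc landed."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "foo.py")
        with open(source, "w") as f:
            f.write("x = 1\n")
        # Compile every .py file under tmp, as an installer might do.
        ok = compileall.compile_dir(tmp, quiet=1)
        # Ask the interpreter where it caches byte code for this source.
        cached = importlib.util.cache_from_source(source)
        return bool(ok), os.path.exists(cached), os.path.relpath(cached, tmp)


ok, exists, relative_cache_path = precompile_tree()
print(ok, exists, relative_cache_path)
```

With default settings the relative path is expected to sit under a
`__pycache__` directory next to the source, but that can differ if the
interpreter has been configured to keep its cache elsewhere.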

 one does not already exist.  The pyc file for the imported source will
 be written to the `__pycache__` directory, using the magic-tag

By now the magic-tag format should have been defined (or a "see below"
inserted).

Based on this and your following comment, I've moved the description of the

Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Barry Warsaw
On Apr 16, 2010, at 11:52 AM, Guido van Rossum wrote:

To the contrary, it was invented by Barry and ought to be added to the
English language as a neologism.

Actually, it's an Emacs invention!

-Barry




Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-16 Thread Guido van Rossum
Thanks for all the changes!

On Fri, Apr 16, 2010 at 4:00 PM, Barry Warsaw ba...@python.org wrote:
 On Apr 15, 2010, at 08:01 PM, Guido van Rossum wrote:
Hm. I wish there was a way to find out whether the bytecode (or
whatever) actually *was* read from this file. __file__ in Python 2
supports this (though not in Python 3).

 Do you have a use case for that?  It might be interesting to know, but I can't
 think of a good way to infer that from __file__ and __cached__, or of a good
 way to expose that on module objects.   Of course, it would be totally Python
 implementation dependent too.

The only use case I can think of is a unit test that would indirectly
assess that bytecode was (or wasn't) read in specific conditions. You
can safely ignore this use case.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-15 Thread Guido van Rossum
Comments inline. Nothing showstopping, mostly just spewing obscure
background information...

Overall, congratulations! I'm fine with the implementation going in
and the PEP being marked as accepted as long as you get to the
clarifications I suggest below soon after.

--Guido

On Tue, Apr 13, 2010 at 1:21 PM, Barry Warsaw ba...@python.org wrote:
 I am attaching the latest revision of PEP 3147 to this message, which is also
 available here:

 http://www.python.org/dev/peps/pep-3147/

 I think the PEP is ready for pronouncement, and the patch is pretty much ready
 for merging into py3k.  The only thing that I can think of that is not
 implemented yet is this section on PEP 302 loaders:

    PEP 302 [18]_ defined loaders have a `.get_filename()` method which
    points to the `__file__` for a module.  As part of this PEP, we will
    extend this API, to include a new method `.get_paths()` which will
    return a 2-tuple containing the path to the source file and the path
    to where the matching `pyc` file is (or would be).

 I'm honestly not sure whether this is still essential, or whether the
 importlib ABC changes Brett and I talked about at Pycon are still required.  I
 now believe they are at best a minor part of the implementation if so.  Maybe
 Brett can chime in on that.

Fine with me to omit.

 Everything else is implemented, tested, and has undergone four rounds of
 Rietveld reviews (thanks Antoine, Benjamin, Brett, and Georg!).  A fifth patch
 set has been uploaded and is available here:

 http://codereview.appspot.com/842043/show

TL;DR :-)

 This addresses all previous comments, includes some fixes from Brian Curtin
 for Windows (thanks!) and fixes __main__ and -m support.  I'd like to commit
 this to py3k sooner rather than later so that we can shake out any additional
 issues that might crop up, without having to continue to maintain my external
 branches.

 Guido, what say you?
 -Barry

 PEP: 3147
 Title: PYC Repository Directories
 Version: $Revision: 80025 $
 Last-Modified: $Date: 2010-04-12 22:17:40 -0400 (Mon, 12 Apr 2010) $
 Author: Barry Warsaw ba...@python.org
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 2009-12-16
 Python-Version: 3.2
 Post-History: 2010-01-30, 2010-02-25, 2010-03-03, 2010-04-12


 Abstract
 ========

 This PEP describes an extension to Python's import mechanism which
 improves sharing of Python source code files among multiple installed
 different versions of the Python interpreter.  It does this by
 allowing more than one byte compilation file (.pyc files) to be
 co-located with the Python source file (.py file).  The extension
 described here can also be used to support different Python
 compilation caches, such as JIT output that may be produced by an
 Unladen Swallow [1]_ enabled C Python.


 Background
 ==========

 CPython compiles its source code into byte code, and for performance
 reasons, it caches this byte code on the file system whenever the
 source file has changes.  This makes loading of Python modules much
 faster because the compilation phase can be bypassed.  When your
 source file is `foo.py`, CPython caches the byte code in a `foo.pyc`
 file right next to the source.

 Byte code files contain two 32-bit numbers followed by the marshaled

big-endian

 [2]_ code object.  The 32-bit numbers represent a magic number and a
 timestamp.  The magic number changes whenever Python changes the byte
 code format, e.g. by adding new byte codes to its virtual machine.
 This ensures that pyc files built for previous versions of the VM
 won't cause problems.  The timestamp is used to make sure that the pyc
 file is not older than the py file that was used to create it.  When

"is not older than" -> "matches"

(Obscure fact: the timestamp in the pyc file must match the source's
mtime exactly.)

 either the magic number or timestamp do not match, the py file is
 recompiled and a new pyc file is written.

 In practice, it is well known that pyc files are not compatible across
 Python major releases.  A reading of import.c [3]_ in the Python
 source code proves that within recent memory, every new CPython major
 release has bumped the pyc magic number.


 Rationale
 =========

 Linux distributions such as Ubuntu [4]_ and Debian [5]_ provide more
 than one Python version at the same time to their users.  For example,
 Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
 with Python 2.6 being the default.

 This causes a conflict for Python source files installed by the
 system (including third party packages), because you cannot compile a

I'd say only 3rd party packages, right? (And code written by the distro,
which from Python's POV is also 3rd party.) At least ought to clarify
that the stdlib is unaffected by this conflict, because multiple
versions of the stdlib *are* installed.

 single Python source file for more than one Python version at a time.
 Thus if your system wanted to install a `/usr/share/python/foo.py`, it
 

Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-15 Thread Steve Holden
Guido van Rossum wrote:
[...]
 Implementation strategy
 =======================

 This feature is targeted for Python 3.2, solving the problem for those
 and all future versions.  It may be back-ported to Python 
 
 Is there time given that 2.7b1 was released?
 
I would hope we have learned our lesson about cramming new features in
so late in the day, and this *is* a new feature, isn't it? Surely it
therefore can't be added in a bugfix release, which in turn means it
will never be implemented in Python 2 (given that 2.7 is envisaged as
the last Py2 release).

 Vendors are free to backport the changes to earlier distributions as
 they see fit.

Really?


 Effects on existing code
 ========================

 Adoption of this PEP will affect existing code and idioms, both inside
 Python and outside.  This section enumerates some of these effects.


 __file__
 --------

 In Python 3, when you import a module, its `__file__` attribute points
 to its source `py` file (in Python 2, it points to the `pyc` file).  A
 package's `__file__` points to the `py` file for its `__init__.py`.
 E.g.::

     >>> import foo
     >>> foo.__file__
     'foo.py'
     # baz is a package
     >>> import baz
     >>> baz.__file__
     'baz/__init__.py'

 Nothing in this PEP would change the semantics of `__file__`.

 This PEP proposes the addition of an `__cached__` attribute to
 modules, which will always point to the actual `pyc` file that was
 read or written.  When the environment variable
 `$PYTHONDONTWRITEBYTECODE` is set, or the `-B` option is given, or if
 the source lives on a read-only filesystem, then the `__cached__`
 attribute will point to the location that the `pyc` file *would* have
 been written to if it didn't exist.  This location of course includes
 the `__pycache__` subdirectory in its path.
 
 Hm. I wish there was a way to find out whether the bytecode (or
 whatever) actually *was* read from this file. __file__ in Python 2
 supports this (though not in Python 3).
 
There also seems to be some complexity in this specification. Does the
interpreter go through the analysis of whether the __pycache__
directory could be created in order to provide the correct value for
the location that the .pyc file would have been written to if it didn't
exist?

 For alternative Python implementations which do not support `pyc`
 files, the `__cached__` attribute may point to whatever information
 makes sense.  E.g. on Jython, this might be the `.class` file for the
 module: `__pycache__/foo.jython-32.class`.  Some implementations may
 use multiple compiled files to create the module, in which case
 `__cached__` may be a tuple.  The exact contents of `__cached__` are
 Python implementation specific.

 It is recommended that when nothing sensible can be calculated,
 implementations should set the `__cached__` attribute to `None`.
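
To make the `__file__` versus cached-path distinction concrete, here is a
small sketch for a current CPython. It reads the cached location from
`module.__spec__.cached`, an attribute introduced well after this thread, so
treat it as an assumption rather than part of the proposal being discussed;
the helper name `describe` is hypothetical.

```python
import json


def describe(module):
    """Return (source path, cached byte-code path or None) for a module."""
    source_path = getattr(module, "__file__", None)
    spec = getattr(module, "__spec__", None)
    # spec.cached is None when no cache location can be calculated.
    cached_path = spec.cached if spec is not None else None
    return source_path, cached_path


source_path, cached_path = describe(json)
print(source_path)
print(cached_path)
```

The cached path is reported whether or not the byte code was actually read
from that file, which is the limitation raised in this thread.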


[...]

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
See PyCon Talks from Atlanta 2010  http://pycon.blip.tv/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS:http://holdenweb.eventbrite.com/


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-14 Thread Barry Warsaw
On Apr 13, 2010, at 04:44 PM, Guido van Rossum wrote:

Give me a couple of days; but I don't expect any problems given how
the earlier discussion went. If you didn't hear from me by Friday go
ahead and merge.

Thanks Guido.
-Barry




Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-14 Thread Isaac Morland

On Tue, 13 Apr 2010, Barry Warsaw wrote:

I am attaching the latest revision of PEP 3147 to this message, which is 
also available here:


http://www.python.org/dev/peps/pep-3147/

[]

PEP: 3147
Title: PYC Repository Directories

[]

Further, pyc files will contain a magic string that differentiates the
Python version they were compiled for.  This allows multiple byte
compiled cache files to co-exist for a single Python source file.

This scheme has the added benefit of reducing the clutter in a Python
package directory.

When a Python source file is imported for the first time, a
`__pycache__` directory will be created in the package directory, if
one does not already exist.  The pyc file for the imported source will
be written to the `__pycache__` directory, using the magic-tag
formatted name.  If either the creation of the `__pycache__` directory
or the pyc file inside that fails, the import will still succeed, just
as it does in a pre-PEP-3147 world.

[]

Thank you for doing the work on this improvement.

I have one wording suggestion which I hope isn't bikeshedding: up above, I 
think the sentence containing "pyc files will contain a magic string"
would be clearer if it made it clear that the file *names*, not (just?) 
the file contents, will contain the magic tag.


Isaac Morland   CSCF Web Guru
DC 2554C, x36650WWW Software Specialist


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-14 Thread Nick Coghlan
Isaac Morland wrote:
 I have one wording suggestion which I hope isn't bikeshedding: up above,
 I think the sentence containing "pyc files will contain a magic string"
 would be clearer if it made it clear that the file *names*, not (just?)
 the file contents, will contain the magic tag.

That's not bikeshedding, that's picking up a mistake in the PEP :)

Indeed, the magic tag only goes in the file names (the pyc files
themselves contain the corresponding magic number, just as they always
have).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-14 Thread Nick Coghlan
Brett Cannon wrote:
 And just a quick suggestion: can we standardize what
 imp.source_to_path() and friend are supposed to return if the
 interpreter doesn't support bytecode? I will probably have to rely on
 that for something so it would be best to say now whether it should be
 None or raise an exception so there is no divergence on this between VMs.

Returning None sounds like the most straightforward option. __cached__
= None will just mean "for whatever reason, we have no cached filename
for this file". It may be the cached file doesn't exist, or the
interpreter simply wasn't in a position to figure it out in a user
visible way.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-14 Thread Barry Warsaw
On Apr 15, 2010, at 08:33 AM, Nick Coghlan wrote:

Brett Cannon wrote:
 And just a quick suggestion: can we standardize what
 imp.source_to_path() and friend are supposed to return if the
 interpreter doesn't support bytecode? I will probably have to rely on
 that for something so it would be best to say now whether it should be
 None or raise an exception so there is no divergence on this between VMs.

Returning None sounds like the most straightforward option. __cached__
= None will just mean "for whatever reason, we have no cached filename
for this file". It may be the cached file doesn't exist, or the
interpreter simply wasn't in a position to figure it out in a user
visible way.

I completely agree.  The PEP already leaves __cached__ up to the
implementation, but I'll update it to be clear that None is an acceptable
return value from imp.cached_from_source() (which is the one I think you
mean), and also what __cached__=None means.
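
A minimal sketch of the "None means no cached filename" convention follows.
It assumes a current CPython, where the source-to-cache mapping is exposed as
`importlib.util.cache_from_source()` rather than under the `imp` name used in
this thread, and where that function raises `NotImplementedError` if the
implementation has no cache tag. The wrapper name `cached_path_or_none` is
hypothetical.

```python
import importlib.util
import os
import sys


def cached_path_or_none(source_path):
    """Return the byte-code cache path for source_path, or None if the
    running implementation does not support byte-code caching."""
    try:
        return importlib.util.cache_from_source(source_path)
    except NotImplementedError:
        return None


path = cached_path_or_none("foo.py")
# The tag embedded in the file name identifies the implementation/version.
print(sys.implementation.cache_tag)
print(path if path is None else os.path.basename(path))
```

Callers that only need a yes/no answer can then test the result against
`None` instead of catching an exception at every call site.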

Thanks Brett and Nick.
-Barry




[Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-13 Thread Barry Warsaw
I am attaching the latest revision of PEP 3147 to this message, which is also
available here:

http://www.python.org/dev/peps/pep-3147/

I think the PEP is ready for pronouncement, and the patch is pretty much ready
for merging into py3k.  The only thing that I can think of that is not
implemented yet is this section on PEP 302 loaders:

PEP 302 [18]_ defined loaders have a `.get_filename()` method which
points to the `__file__` for a module.  As part of this PEP, we will
extend this API, to include a new method `.get_paths()` which will
return a 2-tuple containing the path to the source file and the path
to where the matching `pyc` file is (or would be).

I'm honestly not sure whether this is still essential, or whether the
importlib ABC changes Brett and I talked about at Pycon are still required.  I
now believe they are at best a minor part of the implementation if so.  Maybe
Brett can chime in on that.

Everything else is implemented, tested, and has undergone four rounds of
Rietveld reviews (thanks Antoine, Benjamin, Brett, and Georg!).  A fifth patch
set has been uploaded and is available here:

http://codereview.appspot.com/842043/show

This addresses all previous comments, includes some fixes from Brian Curtin
for Windows (thanks!) and fixes __main__ and -m support.  I'd like to commit
this to py3k sooner rather than later so that we can shake out any additional
issues that might crop up, without having to continue to maintain my external
branches.

Guido, what say you?
-Barry

PEP: 3147
Title: PYC Repository Directories
Version: $Revision: 80025 $
Last-Modified: $Date: 2010-04-12 22:17:40 -0400 (Mon, 12 Apr 2010) $
Author: Barry Warsaw ba...@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2009-12-16
Python-Version: 3.2
Post-History: 2010-01-30, 2010-02-25, 2010-03-03, 2010-04-12


Abstract
========

This PEP describes an extension to Python's import mechanism which
improves sharing of Python source code files among multiple installed
different versions of the Python interpreter.  It does this by
allowing more than one byte compilation file (.pyc files) to be
co-located with the Python source file (.py file).  The extension
described here can also be used to support different Python
compilation caches, such as JIT output that may be produced by an
Unladen Swallow [1]_ enabled C Python.


Background
==========

CPython compiles its source code into byte code, and for performance
reasons, it caches this byte code on the file system whenever the
source file has changes.  This makes loading of Python modules much
faster because the compilation phase can be bypassed.  When your
source file is `foo.py`, CPython caches the byte code in a `foo.pyc`
file right next to the source.

Byte code files contain two 32-bit numbers followed by the marshaled
[2]_ code object.  The 32-bit numbers represent a magic number and a
timestamp.  The magic number changes whenever Python changes the byte
code format, e.g. by adding new byte codes to its virtual machine.
This ensures that pyc files built for previous versions of the VM
won't cause problems.  The timestamp is used to make sure that the pyc
file is not older than the py file that was used to create it.  When
either the magic number or timestamp do not match, the py file is
recompiled and a new pyc file is written.

In practice, it is well known that pyc files are not compatible across
Python major releases.  A reading of import.c [3]_ in the Python
source code proves that within recent memory, every new CPython major
release has bumped the pyc magic number.


Rationale
=========

Linux distributions such as Ubuntu [4]_ and Debian [5]_ provide more
than one Python version at the same time to their users.  For example,
Ubuntu 9.10 Karmic Koala users can install Python 2.5, 2.6, and 3.1,
with Python 2.6 being the default.

This causes a conflict for Python source files installed by the
system (including third party packages), because you cannot compile a
single Python source file for more than one Python version at a time.
Thus if your system wanted to install a `/usr/share/python/foo.py`, it
could not create a `/usr/share/python/foo.pyc` file usable across all
installed Python versions.

Furthermore, in order to ease the burden on operating system packagers
for these distributions, the distribution packages do not contain
Python version numbers [6]_; they are shared across all Python
versions installed on the system.  Putting Python version numbers in
the packages would be a maintenance nightmare, since all the packages
- *and their dependencies* - would have to be updated every time a new
Python release was added or removed from the distribution.  Because of
the sheer number of packages available, this amount of work is
infeasible.

C extensions can be source compatible across multiple versions of
Python.  Compiled extension modules are usually not compatible though,
and PEP 384 [7]_ has been proposed to 

Re: [Python-Dev] PEP 3147 ready for pronouncement and merging

2010-04-13 Thread Guido van Rossum
On Tue, Apr 13, 2010 at 1:21 PM, Barry Warsaw ba...@python.org wrote:
 I am attaching the latest revision of PEP 3147 to this message, which is also
 available here:

 http://www.python.org/dev/peps/pep-3147/

 I think the PEP is ready for pronouncement, and the patch is pretty much ready
 for merging into py3k.  The only thing that I can think of that is not
 implemented yet is this section on PEP 302 loaders:

    PEP 302 [18]_ defined loaders have a `.get_filename()` method which
    points to the `__file__` for a module.  As part of this PEP, we will
    extend this API, to include a new method `.get_paths()` which will
    return a 2-tuple containing the path to the source file and the path
    to where the matching `pyc` file is (or would be).

 I'm honestly not sure whether this is still essential, or whether the
 importlib ABC changes Brett and I talked about at Pycon are still required.  I
 now believe they are at best a minor part of the implementation if so.  Maybe
 Brett can chime in on that.

 Everything else is implemented, tested, and has undergone four rounds of
 Rietveld reviews (thanks Antoine, Benjamin, Brett, and Georg!).  A fifth patch
 set has been uploaded and is available here:

 http://codereview.appspot.com/842043/show

 This addresses all previous comments, includes some fixes from Brian Curtin
 for Windows (thanks!) and fixes __main__ and -m support.  I'd like to commit
 this to py3k sooner rather than later so that we can shake out any additional
 issues that might crop up, without having to continue to maintain my external
 branches.

 Guido, what say you?

Give me a couple of days; but I don't expect any problems given how
the earlier discussion went. If you didn't hear from me by Friday go
ahead and merge.

 -Barry

-- 
--Guido van Rossum (python.org/~guido)