Re: [Python-ideas] Add more information in the header of pyc files

2018-04-11 Thread Nick Coghlan
On 11 April 2018 at 02:54, Antoine Pitrou  wrote:
> On Tue, 10 Apr 2018 19:29:18 +0300
> Serhiy Storchaka 
> wrote:
>>
>> A bugfix release can fix bugs in bytecode generation. See for example
>> issue27286. [1]  The part of issue33041 backported to 3.7 and 3.6 is
>> another example. [2]  There have been other compatible changes to the
>> bytecode as well. Without bumping the magic number, these fixes may
>> have no effect if existing pyc files were generated by older compilers.
>> But bumping the magic number in a bugfix release can lead to rebuilding
>> every pyc file (even those unaffected by the fix) in distributions.
>
> Sure, but I don't think rebuilding every pyc file is a significant
> problem.  It's certainly less error-prone than cherry-picking which
> files need rebuilding.

And we need to handle the old bytecode format in the eval loop anyway,
or else we'd be breaking compatibility with bytecode-only files, as
well as introducing a significant performance regression for
non-writable bytecode caches (if we were to ignore them).

It's a subtle enough problem that I think the `compileall --force`
option is a safer way of handling it, even if it regenerates some pyc
files that could have been kept.
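
As a rough illustration (not something posted in the thread), forcing a
full rebuild from Python rather than from the command line could look
like the sketch below. It uses the standard-library
`compileall.compile_dir` with `force=True`; the `rebuild_all` helper and
the temporary directory are made up for the example.

```python
import compileall
import pathlib
import tempfile


def rebuild_all(root):
    """Recompile every .py file under root, even if its pyc looks fresh.

    rebuild_all is a hypothetical helper name. force=True asks compileall
    to ignore the recorded timestamps, and quiet=1 prints errors only.
    """
    return bool(compileall.compile_dir(root, force=True, quiet=1))


# Demonstration on a throwaway directory holding one trivial module.
with tempfile.TemporaryDirectory() as tmp:
    (pathlib.Path(tmp) / "example.py").write_text("x = 1\n")
    ok = rebuild_all(tmp)
```

This regenerates pyc files that were still fine, which is the trade-off
described above: some wasted work in exchange for not having to decide
which files are affected.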

For the "stable file signature" aspect, does that need to be
specifically the first *four* bytes? One of the benefits of PEP 552
leaving those four bytes alone is that it meant that a lot of magic
number checking code didn't need to change. If the stable marker could
be placed later (e.g. after the PEP 552 header), then we'd similarly
have the benefit that code checking the PEP 552 headers wouldn't need
to change, at the expense of folks having to read 20 bytes to see the
new signature byte (which shouldn't be a problem, given that the `file`
command defaults to reading up to 1 MiB from files it is trying to identify).
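
For reference, here is a minimal sketch of reading the existing PEP 552
header: 4 bytes of magic number, a 4-byte flags word, then either an
mtime and source size or an 8-byte source hash, 16 bytes in total. The
`parse_pyc_header` name and the sample values are assumptions made for
the example. The proposed stable marker is not modelled, since its
layout has not been settled in this thread.

```python
import importlib.util
import struct

HEADER_SIZE = 16  # magic (4) + flags (4) + 8 bytes of validation data


def parse_pyc_header(data):
    """Split the first 16 bytes of a pyc file into its PEP 552 fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated pyc header")
    (flags,) = struct.unpack("<I", data[4:8])
    info = {"magic": data[:4], "flags": flags, "hash_based": bool(flags & 0x1)}
    if info["hash_based"]:
        # Bit 1 is the check_source flag; the rest is the source hash.
        info["check_source"] = bool(flags & 0x2)
        info["source_hash"] = data[8:16]
    else:
        # Timestamp-based pyc: 32-bit mtime followed by 32-bit source size.
        info["mtime"], info["source_size"] = struct.unpack("<II", data[8:16])
    return info


# Synthetic timestamp-based header (made-up mtime and size).
sample = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, 1523404800, 42)
header = parse_pyc_header(sample)
```

Code written this way would keep working unchanged if a marker were
appended after offset 16, which is the point of placing it later.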

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Add more information in the header of pyc files

2018-04-10 Thread Antoine Pitrou
On Tue, 10 Apr 2018 19:29:18 +0300
Serhiy Storchaka 
wrote:
> 
> A bugfix release can fix bugs in bytecode generation. See for example
> issue27286. [1]  The part of issue33041 backported to 3.7 and 3.6 is
> another example. [2]  There have been other compatible changes to the
> bytecode as well. Without bumping the magic number, these fixes may
> have no effect if existing pyc files were generated by older compilers.
> But bumping the magic number in a bugfix release can lead to rebuilding
> every pyc file (even those unaffected by the fix) in distributions.

Sure, but I don't think rebuilding every pyc file is a significant
problem.  It's certainly less error-prone than cherry-picking which
files need rebuilding.

Regards

Antoine.




Re: [Python-ideas] Add more information in the header of pyc files

2018-04-10 Thread Serhiy Storchaka

On 10.04.18 18:58, Antoine Pitrou wrote:

> On Tue, 10 Apr 2018 18:49:36 +0300
> Serhiy Storchaka
> wrote:
>>
>> 3. A compatible subversion number. Currently the interpreter
>> supports only a single magic number. If an updated version of the
>> compiler produces more optimal or more correct but compatible bytecode
>> (like ), there is no way to say that the new bytecode is preferable but
>> the old bytecode can be used too. Changing the magic number
>> invalidates all pyc files compiled by the old compiler (see [4] for an
>> example of the problems this causes). The header could contain two magic
>> numbers: the major magic number should be bumped for incompatible
>> changes; the minor magic number should be reset to 0 when the major
>> magic number is bumped, and should be bumped when the compiler starts
>> producing different but compatible bytecode.
>
> -1.  This is a risky move (and costly, in maintenance terms).  It's easy
> to overlook subtle differences that may translate into
> incompatibilities in some production uses.  The rule « one Python
> feature release == one bytecode version » is easy to remember and
> understand, and is generally very well accepted.


A bugfix release can fix bugs in bytecode generation. See for example
issue27286. [1]  The part of issue33041 backported to 3.7 and 3.6 is
another example. [2]  There have been other compatible changes to the
bytecode as well. Without bumping the magic number, these fixes may
have no effect if existing pyc files were generated by older compilers.
But bumping the magic number in a bugfix release can lead to rebuilding
every pyc file (even those unaffected by the fix) in distributions.


[1] https://bugs.python.org/issue27286
[2] https://bugs.python.org/issue33041



Re: [Python-ideas] Add more information in the header of pyc files

2018-04-10 Thread Antoine Pitrou
On Tue, 10 Apr 2018 18:49:36 +0300
Serhiy Storchaka 
wrote:
> 
> 1. More stable file signature. Currently the magic number changes in
> every feature release. Only the third and fourth bytes are stable
> (b'\r\n'); the first two bytes change unpredictably. The 'py' launcher
> and third-party software like the 'file' command must support a list
> of magic numbers for all existing Python releases, and they can't detect
> pyc files from future versions. There is also a chance that the pyc file
> signature will match the signature of another file type by accident. It
> would be better if the first 4 bytes of pyc files were the same for all
> Python versions (or at least for all Python versions with the same major
> number).

+1.
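
To make the current layout concrete, here is a small sketch using
`importlib.util.MAGIC_NUMBER`. It shows the stable b'\r\n' tail and the
per-release leading bytes. The `looks_like_pyc` helper is hypothetical:
it only illustrates how weak a check based on the two stable bytes is.

```python
import importlib.util

magic = importlib.util.MAGIC_NUMBER  # 4 bytes for the running interpreter

# The first two bytes hold a little-endian number that changes per release.
version_word = int.from_bytes(magic[:2], "little")

# The last two bytes are the part that stays the same across releases.
stable_tail = magic[2:4]


def looks_like_pyc(first_four):
    """Weak heuristic: accept anything whose bytes 3 and 4 are CR LF."""
    return len(first_four) == 4 and first_four[2:4] == b"\r\n"
```

Only two bytes of fixed signature is very little to go on, which is why
tools currently keep a list of full magic numbers instead.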

> 2. Include the Python version. Currently the 'py' launcher needs to
> maintain a table that maps magic numbers to Python versions. It can
> recognize only Python versions released before the launcher was built. If
> the two major numbers of the Python version were included in the header,
> it would not need such a table.

+1.

> 3. A compatible subversion number. Currently the interpreter
> supports only a single magic number. If an updated version of the
> compiler produces more optimal or more correct but compatible bytecode
> (like ), there is no way to say that the new bytecode is preferable but
> the old bytecode can be used too. Changing the magic number
> invalidates all pyc files compiled by the old compiler (see [4] for an
> example of the problems this causes). The header could contain two magic
> numbers: the major magic number should be bumped for incompatible
> changes; the minor magic number should be reset to 0 when the major
> magic number is bumped, and should be bumped when the compiler starts
> producing different but compatible bytecode.

-1.  This is a risky move (and costly, in maintenance terms).  It's easy
to overlook subtle differences that may translate into
incompatibilities in some production uses.  The rule « one Python
feature release == one bytecode version » is easy to remember and
understand, and is generally very well accepted.
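
For concreteness, the two-number scheme from point 3 could be sketched
roughly as below. Nothing like this exists in CPython: `PycVersion`,
`CURRENT`, `classify` and the numbers are all invented for illustration.
The "usable-but-stale" branch is exactly where a subtle incompatibility
could slip through.

```python
from typing import NamedTuple


class PycVersion(NamedTuple):
    """Hypothetical (major, minor) pair stored in a pyc header."""
    major: int  # bumped for incompatible bytecode changes
    minor: int  # reset to 0 with each major bump, bumped for compatible changes


# Made-up version that the running interpreter's compiler would emit.
CURRENT = PycVersion(major=7, minor=2)


def classify(found, current=CURRENT):
    """Decide what to do with a pyc file carrying the given version pair."""
    if found.major != current.major:
        return "incompatible"      # must be recompiled before use
    if found.minor < current.minor:
        return "usable-but-stale"  # loadable, but newer bytecode is preferred
    return "up-to-date"
```

With the single-number rule there are only two outcomes, match or
recompile, which is a large part of why it is easy to reason about.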

Regards

Antoine.

