[issue29708] support reproducible Python builds

2021-04-22 Thread Felix C. Stegerman


Felix C. Stegerman  added the comment:

Hi!  I've been working on reproducible builds for python-for-android [1,2,3].

Current issues with .pyc files are:

* .pyc files differ depending on whether Python was compiled w/ liblzma-dev 
installed or not;
* many .pyc files include build paths;
* some .pyc files include paths to system utilities, like `/bin/mkdir` or 
`/usr/bin/install`, which can differ between systems (e.g. on Debian w/ merged 
/usr).
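
A minimal sketch of the build-path point above, assuming two hypothetical build directories (`/tmp/build-a` and `/tmp/build-b`, not taken from the python-for-android setup). It compiles the same source under two file names and compares the marshalled code objects, which is roughly what ends up in a .pyc body:

```python
import marshal

SOURCE = "def greet():\n    return 'hello'\n"


def marshalled(path: str) -> bytes:
    """Compile SOURCE as if it lived at `path` and marshal the code object."""
    code = compile(SOURCE, path, "exec")
    return marshal.dumps(code)


# Hypothetical build directories; only the path differs between the two runs.
build_a = marshalled("/tmp/build-a/greet.py")
build_b = marshalled("/tmp/build-b/greet.py")

# The file name is stored in the code object (co_filename), so the bytes differ.
paths_differ = build_a != build_b
path_embedded = b"/tmp/build-a/greet.py" in build_a
print(paths_differ, path_embedded)
```

Building in a fixed directory, or rewriting the path recorded at compile time, would be the usual ways to avoid this kind of difference.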

[1] https://github.com/kivy/python-for-android/pull/2390
[2] 
https://lists.reproducible-builds.org/pipermail/rb-general/2021-January/002132.html
[3] 
https://lists.reproducible-builds.org/pipermail/rb-general/2021-March/002207.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29708] support reproducible Python builds

2021-04-22 Thread Felix C. Stegerman


Change by Felix C. Stegerman :


--
nosy: +obfusk




[issue29708] support reproducible Python builds

2021-02-03 Thread Steve Dower


Steve Dower  added the comment:

This doesn't seem to necessarily impact distutils, so I'm leaving it open 
despite PEP 632.

--
components:  -Distutils
dependencies:  -Reproducible pyc: FLAG_REF is not stable., Reproducible pyc: 
frozenset is not serialized in a deterministic order
nosy: +steve.dower




[issue29708] support reproducible Python builds

2021-01-04 Thread Brett Cannon


Change by Brett Cannon :


--
nosy:  -brett.cannon




[issue29708] support reproducible Python builds

2020-12-31 Thread Benjamin Peterson


Benjamin Peterson  added the comment:

PEP 552 was a necessary but not sufficient step on the road towards fully 
deterministic pycs. The PEP says: "(Note there are other problems [1] [2] we do 
not address here that can make pycs non-deterministic.)" where [1] and [2] are 
basically the issues Inada-san has linked.

--




[issue29708] support reproducible Python builds

2020-12-31 Thread STINNER Victor


STINNER Victor  added the comment:

> note the optimized .pyc is deterministic. As far as I know only __debug__ is 
> set to False, or is there something else different?

Hum, maybe there is a misunderstanding about the purpose of PEP 552.

I understood that the main point of PEP 552 is to validate a .pyc file by 
comparing a hash of the source, rather than by checking the .py and .pyc file 
modification times.

It doesn't magically make the .pyc file content fully reproducible. Correct me 
if I misunderstood PEP 552 as well.
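
To make the PEP 552 point concrete, here is a small sketch that writes a hash-based .pyc and reads back its 16-byte header. The file names are made up for the example, and the header layout (magic, flags, 8-byte source hash) is the Python 3.7+ layout as I understand it from PEP 552:

```python
import importlib.util
import os
import py_compile
import tempfile


def read_pyc_header(pyc_path):
    """Return (magic, flags, payload) from the first 16 bytes of a .pyc."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    flags = int.from_bytes(header[4:8], "little")
    payload = header[8:16]  # source hash for hash-based pycs
    return magic, flags, payload


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")  # hypothetical module
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "example.pyc"),
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
        doraise=True,
    )
    magic, flags, payload = read_pyc_header(pyc)
    with open(src, "rb") as f:
        expected_hash = importlib.util.source_hash(f.read())

is_hash_based = bool(flags & 0b01)  # bit 0: hash-based pyc
check_source = bool(flags & 0b10)   # bit 1: validate against the source
hash_matches = payload == expected_hash
print(is_hash_based, check_source, hash_matches)
```

The header only pins how the file is validated. The marshalled code object after it can still vary, which is the part this issue is about.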

--




[issue29708] support reproducible Python builds

2020-12-31 Thread Inada Naoki


Change by Inada Naoki :


--
dependencies: +Reproducible pyc: FLAG_REF is not stable.




[issue29708] support reproducible Python builds

2020-12-31 Thread Inada Naoki


Inada Naoki  added the comment:

> note the optimized .pyc is deterministic. As far as I know only __debug__ is 
> set to False, or is there something else different?

There is no difference between normal pyc and optimized pyc.

* frozenset is deterministic if PYTHONHASHSEED is set
* FLAG_REF is unstable. It is based on the reference count, which varies with 
the environment (environment variables, build path, order of py files, and 
anything else that uses interned strings). bpo-30493 must be fixed.
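
A rough way to check the PYTHONHASHSEED point, sketched under the assumption that a child interpreter can be started via `sys.executable`. The set contents and seed values are arbitrary. It marshals the same frozenset in separate processes and compares the bytes:

```python
import os
import subprocess
import sys

# Code run in each child interpreter: marshal a frozenset and print it as hex.
SNIPPET = (
    "import marshal, sys; "
    "sys.stdout.write("
    "marshal.dumps(frozenset({'alpha', 'beta', 'gamma', 'delta'})).hex())"
)


def marshal_with_seed(seed: str) -> str:
    """Run SNIPPET in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


first = marshal_with_seed("0")
second = marshal_with_seed("0")
other = marshal_with_seed("12345")

same_seed_matches = first == second
print("same seed, same bytes:", same_seed_matches)
print("different seed, same bytes:", first == other)
```

Whether a different seed changes the output depends on the Python version and on the hashes involved, so the second comparison is printed rather than assumed.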

--
versions: +Python 3.10 -Python 3.9




[issue29708] support reproducible Python builds

2020-12-31 Thread Frederik Rietdijk


Frederik Rietdijk  added the comment:

note the optimized .pyc is deterministic. As far as I know only __debug__ is 
set to False, or is there something else different?

--




[issue29708] support reproducible Python builds

2020-12-30 Thread Inada Naoki


Inada Naoki  added the comment:

See bpo-34093 too.

--




[issue29708] support reproducible Python builds

2020-12-30 Thread STINNER Victor


STINNER Victor  added the comment:

> tiny bytecode differences

bpo-37596 "Reproducible pyc: frozenset is not serialized in a deterministic 
order" is not fixed yet.

--




[issue29708] support reproducible Python builds

2020-12-30 Thread Frederik Rietdijk


Frederik Rietdijk  added the comment:

Building Python packages reproducibly has now basically been resolved with the 
reproducible bytecode as well as changes in tools such as pip.

Unfortunately, the interpreters do not yet seem to be reproducible. After 
certain changes, a Nixpkgs build of 3.9 shows several tiny bytecode 
differences. What could have caused these differences? Please see the attached 
diffoscope report. 

As part of installation all bytecode is force regenerated using compileall. 
This is using the default checked-hash.
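
For reference, forced regeneration with checked-hash pycs can be sketched roughly as follows. This is not the Nixpkgs install step itself. The directory and module are temporary stand-ins, and `legacy=True` is used only so the .pyc lands next to the source at a predictable path:

```python
import compileall
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module standing in for an installed library file.
    with open(os.path.join(tmp, "mod.py"), "w") as f:
        f.write("VALUE = 42\n")

    # force=True recompiles even if an up-to-date .pyc already exists.
    ok = compileall.compile_dir(
        tmp,
        force=True,
        quiet=1,
        legacy=True,
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )

    pyc_path = os.path.join(tmp, "mod.pyc")
    pyc_exists = os.path.exists(pyc_path)
    with open(pyc_path, "rb") as f:
        flags = int.from_bytes(f.read(8)[4:8], "little")

print(bool(ok), pyc_exists, bin(flags))
```

Running this twice and comparing the resulting files byte for byte is a quick local check before reaching for diffoscope.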

--
nosy: +Frederik Rietdijk
versions: +Python 3.9 -Python 3.7
Added file: https://bugs.python.org/file49708/python39_2.html




[issue29708] support reproducible Python builds

2020-10-22 Thread Will Thompson


Change by Will Thompson :


--
nosy:  -Will Thompson




[issue29708] support reproducible Python builds

2020-10-22 Thread Éric Araujo

Change by Éric Araujo :


--
dependencies:  -setup.py sdist --format=gztar should use (equivalent of) `gzip 
-n`, setup.py sdist should honor SOURCE_DATE_EPOCH




[issue29708] support reproducible Python builds

2020-10-22 Thread Éric Araujo

Change by Éric Araujo :


--
dependencies: +Reproducible pyc: frozenset is not serialized in a deterministic 
order, setup.py sdist --format=gztar should use (equivalent of) `gzip -n`, 
setup.py sdist should honor SOURCE_DATE_EPOCH




[issue29708] support reproducible Python builds

2020-04-08 Thread Jeffery To


Change by Jeffery To :


--
nosy: +jefferyto




[issue29708] support reproducible Python builds

2019-07-15 Thread STINNER Victor


STINNER Victor  added the comment:

I created bpo-37596 "Reproducible pyc: frozenset is not serialized in a 
deterministic order".

--




[issue29708] support reproducible Python builds

2018-11-28 Thread miss-islington


miss-islington  added the comment:


New changeset 24b51b1a4919e310d338629cc60371387f475a32 by Miss Islington (bot) 
in branch '3.7':
bpo-34022: Stop forcing of hash-based invalidation with SOURCE_DATE_EPOCH 
(GH-9607)
https://github.com/python/cpython/commit/24b51b1a4919e310d338629cc60371387f475a32


--
nosy: +miss-islington




[issue29708] support reproducible Python builds

2018-11-28 Thread miss-islington


Change by miss-islington :


--
pull_requests: +10019




[issue29708] support reproducible Python builds

2018-11-13 Thread Sascha Silbe


Change by Sascha Silbe :


--
nosy: +sascha_silbe




[issue29708] support reproducible Python builds

2018-10-10 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset a6b3ec5b6d4f6387820fccc570eea08b9615620d by Victor Stinner (Elvis 
Pranskevichus) in branch 'master':
bpo-34022: Stop forcing of hash-based invalidation with SOURCE_DATE_EPOCH 
(GH-9607)
https://github.com/python/cpython/commit/a6b3ec5b6d4f6387820fccc570eea08b9615620d


--




[issue29708] support reproducible Python builds

2018-09-27 Thread Elvis Pranskevichus


Change by Elvis Pranskevichus :


--
pull_requests: +9005




[issue29708] support reproducible Python builds

2018-07-03 Thread Bernhard M. Wiedemann


Bernhard M. Wiedemann  added the comment:

also related to this topic: https://github.com/pypa/pip/pull/5525 for pip's 
RECORD file.

--




[issue29708] support reproducible Python builds

2018-07-03 Thread STINNER Victor


STINNER Victor  added the comment:

I created bpo-34033: distutils is not reproducible.

--




[issue29708] support reproducible Python builds

2018-07-03 Thread Matej Cepl


Change by Matej Cepl :


--
nosy: +mcepl




[issue29708] support reproducible Python builds

2018-07-03 Thread STINNER Victor


STINNER Victor  added the comment:

I created PR 8057 to upstream distutils-reproducible-compile.patch from 
OpenSUSE (context: see bpo-34022).

--




[issue29708] support reproducible Python builds

2018-07-03 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +7666
stage:  -> patch review




[issue29708] support reproducible Python builds

2018-03-07 Thread Alexandru Ardelean

Alexandru Ardelean  added the comment:

PYTHONHASHSEED does help on 3.6.4
I'll use it during build.

Thanks for help

--




[issue29708] support reproducible Python builds

2018-03-07 Thread INADA Naoki

INADA Naoki  added the comment:

3e 02 00 00 00 is frozenset(size=2)
72 b6/b5 00 00 00 is reference to b5 or b6

So it seems the set order changed (or the order in which the items in the set 
appear changed).
Did you set PYTHONHASHSEED?

Anyway, I think Python 3.7 can't guarantee "reproducible" compile because 
marshal uses reference count.
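
The byte-level reading above can be reproduced with a tiny decoder. This is only a sketch covering the two record kinds mentioned here. The type codes (`'>'` for frozenset, `'r'` for a reference, `0x80` for FLAG_REF) are the ones I know from CPython's marshal.c, and `decode_record` is a name made up for the example:

```python
TYPE_FROZENSET = ord(">")  # 0x3e
TYPE_REF = ord("r")        # 0x72
FLAG_REF = 0x80            # high bit of the type byte


def decode_record(record: bytes):
    """Decode a 5-byte marshal record: type byte + 4-byte little-endian int."""
    type_code = record[0] & 0x7F          # strip FLAG_REF if present
    value = int.from_bytes(record[1:5], "little")
    if type_code == TYPE_FROZENSET:
        return ("frozenset", value)       # value is the number of items
    if type_code == TYPE_REF:
        return ("ref", value)             # value is an index into the ref table
    raise ValueError(f"unhandled type code {record[0]:#x}")


print(decode_record(bytes.fromhex("3e02000000")))
print(decode_record(bytes.fromhex("72b6000000")))
print(decode_record(bytes.fromhex("72b5000000")))
```

So the two builds contain the same frozenset of two references, written in opposite order.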

--




[issue29708] support reproducible Python builds

2018-03-06 Thread Alexandru Ardelean

Alexandru Ardelean  added the comment:

Yeah, I also see it with 3.6.4.
I wanted to try 3.7 to see if it's fixed by chance.

Otherwise I may have to start digging deep into compilation logic.

Looking here:
https://tests.reproducible-builds.org/lede/lede_ar71xx.html

More specifically here:
https://tests.reproducible-builds.org/lede/dbd/packages/mips_24kc/packages/python3-asyncio_3.6.4-5_mips_24kc.ipk.html
it looks like 2 byte-codes are inverted

build1: 
7f80: 0100 003e 0200      72b6      0072 b500  ...>....r....r..
build2: 
7f80: 0100 003e 0200      72b5      0072 b600  ...>....r....r..

72b6 and 72b5 like to swap positions sometimes.

--




[issue29708] support reproducible Python builds

2018-03-06 Thread Will Thompson

Will Thompson  added the comment:

For what it's worth, in Endless OS we still saw slight variations between 
builds in the .pyc files, even with all the source files' mtimes set to the 
epoch (i.e. equivalent to setting and supporting SOURCE_DATE_EPOCH, I believe). 
Looking at the contents of the file suggested it was just reordering of class 
fields; indeed, we only saw this on Python versions where hash randomization is 
enabled by default, and disabling hash randomization made the output 
reproducible.

--
nosy: +Will Thompson




[issue29708] support reproducible Python builds

2018-01-31 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann  added the comment:

Any chance we can get the (somewhat related) patch for 
https://bugs.python.org/issue30693 also merged?

--




[issue29708] support reproducible Python builds

2018-01-30 Thread Brett Cannon

Change by Brett Cannon :


--
assignee: brett.cannon -> 
stage: resolved -> 




[issue29708] support reproducible Python builds

2018-01-25 Thread Brett Cannon

Change by Brett Cannon :


--
resolution: fixed -> 
status: closed -> open




[issue29708] support reproducible Python builds

2018-01-25 Thread Alexandru Ardelean

Alexandru Ardelean  added the comment:

Hey,

Sorry if I'm a bit late to the party with this.
The road to reproducible builds has a few more steps.

The way I validate whether Python is reproducible is with this link:
https://tests.reproducible-builds.org/lede/lede_ar71xx.html

There is a need to also patch getbuildinfo.c to make Python reproducible.

I have opened a PR for this : https://github.com/python/cpython/pull/5313

I've waited for the periodic build to trigger on that reproducible page.
In OpenWrt, the packages to look for [those affected by this getbuildinfo.c 
patch] are python-base & python3-base.

There are still some python3 packages that need patching.
It seems that python3-asyncio, pydoc, and some other pyc files need 
investigation.
I'll check.
Maybe this isn't an issue in 3.7.

Alex

--




[issue29708] support reproducible Python builds

2018-01-24 Thread Alexandru Ardelean

Change by Alexandru Ardelean :


--
pull_requests: +5159




[issue29708] support reproducible Python builds

2018-01-24 Thread Brett Cannon

Brett Cannon  added the comment:


New changeset cab0b2b053970982b760048acc3046363615a8dd by Brett Cannon in 
branch 'master':
bpo-29708: Add What's New entries for SOURCE_DATE_EPOCH and py_compile (GH-5306)
https://github.com/python/cpython/commit/cab0b2b053970982b760048acc3046363615a8dd


--




[issue29708] support reproducible Python builds

2018-01-24 Thread Brett Cannon

Change by Brett Cannon :


--
pull_requests: +5153




[issue29708] support reproducible Python builds

2018-01-24 Thread Brett Cannon

Brett Cannon  added the comment:

Just merged Bernhard's PR which forces hash-based .pyc files. Thanks to 
everyone who constructively helped reach this point.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue29708] support reproducible Python builds

2018-01-24 Thread Brett Cannon

Brett Cannon  added the comment:


New changeset ccbe5818af20f8c12043f5c30c277a74714405e0 by Brett Cannon 
(Bernhard M. Wiedemann) in branch 'master':
bpo-29708: Setting SOURCE_DATE_EPOCH forces hash-based .pyc files (GH-5200)
https://github.com/python/cpython/commit/ccbe5818af20f8c12043f5c30c277a74714405e0


--




[issue29708] support reproducible Python builds

2018-01-20 Thread Chih-Hsuan Yen

Change by Chih-Hsuan Yen :


--
nosy:  -yan12125




[issue29708] support reproducible Python builds

2018-01-19 Thread Brett Cannon

Brett Cannon  added the comment:

Since Barry chose an option that wasn't listed, I'm planning on accepting 
Bernhard's https://github.com/python/cpython/pull/5200 at some point next week 
barring any new, unique objections.

--




[issue29708] support reproducible Python builds

2018-01-16 Thread Bernhard M. Wiedemann

Change by Bernhard M. Wiedemann :


--
keywords: +patch
pull_requests: +5054
stage:  -> patch review




[issue29708] support reproducible Python builds

2018-01-15 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

On Jan 15, 2018, at 11:31, Brett Cannon  wrote:
> 
> 1. SOURCE_DATE_EPOCH acts as an environment variable flag to forcibly 
> generate hash-based .pyc files with the check_source bit set in py_compile 
> and compileall
> 2. SOURCE_DATE_EPOCH is used to specifically set the timestamp in .pyc files 
> in py_compile and compileall

I’d suggest that if SDE is set to an integer, that is used as the timestamp.  
If it’s set to a special symbol (e.g. ‘hash’) then the hash is used.  I’m not 
volunteering to write the code though. :)

--




[issue29708] support reproducible Python builds

2018-01-15 Thread Brett Cannon

Brett Cannon  added the comment:

Bernhard's idea of SOURCE_DATE_EPOCH being an implicit envvar to forcibly 
switch on hash-based .pyc files in py_compile is intriguing. I assume this 
would force the check_source bit to be set? Or since SOURCE_DATE_EPOCH should 
only be used in build scenarios would you want UNCHECKED_HASH?

As the core dev who seems the most engaged and willing to commit this, I'm 
willing to make the final decision on this and commit the final PR. I see the 
options of getting this into 3.7 as the following:

1. SOURCE_DATE_EPOCH acts as an environment variable flag to forcibly generate 
hash-based .pyc files with the check_source bit set in py_compile and compileall
2. SOURCE_DATE_EPOCH is used to specifically set the timestamp in .pyc files in 
py_compile and compileall

That's it. No clamping, no changing how timestamp-based .pyc files are 
invalidated, no touching source files, etc.
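
Option 1 above amounts to a very small decision, sketched here with a hypothetical helper (`choose_invalidation_mode` is not a py_compile function; it only mirrors the proposal as worded):

```python
import py_compile


def choose_invalidation_mode(environ):
    """Hypothetical helper mirroring option 1.

    If SOURCE_DATE_EPOCH is set, default to hash-based pycs with the
    check_source bit set; otherwise keep the timestamp-based default.
    """
    if environ.get("SOURCE_DATE_EPOCH"):
        return py_compile.PycInvalidationMode.CHECKED_HASH
    return py_compile.PycInvalidationMode.TIMESTAMP


build_mode = choose_invalidation_mode({"SOURCE_DATE_EPOCH": "1516000000"})
normal_mode = choose_invalidation_mode({})
print(build_mode.name, normal_mode.name)
```

A caller would pass the result as `invalidation_mode=` to `py_compile.compile()` or `compileall.compile_dir()`. Swapping in `UNCHECKED_HASH` is the variant raised in the question about build-only use.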

If this is going to make it into Python 3.7 then a decision must be made by 
Friday, Jan 19, so have your opinions on those two options in before then (and 
in the case of the hash-based solution, would you expect CHECKED_HASH or 
UNCHECKED_HASH?). At that point I will make a decision and Bernhard can either 
update his PR or I can create a new one forked from his(I leave that up to 
Bernhard based on the decision I'll make on/by Friday).

--
assignee:  -> brett.cannon
versions:  -Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6




[issue29708] support reproducible Python builds

2018-01-15 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann  added the comment:

I think there is no single nice and clean solution with time-based .pyc files, 
but to get a whole distribution to build reproducibly, there are two other ways:

1) if the SOURCE_DATE_EPOCH environment variable is set,
make hash-based .pyc files the default.

2) instead of storing .py mtime in the .pyc header, use the .pyc's filesystem 
mtime value - also making it more available to users.
Not sure if this would have side-effects or cause regressions.

On the side issue: IMHO checking exact mtimes is the right thing to do, because 
sometimes users will copy back old .py files and expect mismatching .pyc files 
to not be used.

--




[issue29708] support reproducible Python builds

2018-01-14 Thread Brett Cannon

Brett Cannon  added the comment:

As Eli's comments are coming off as negative to/at me, I feel like I have
to defend myself here. If you look at the commit there were actually two
places where the timestamp was checked; one did an equality comparison and
one did a >= comparison. It's quite possible the semantics accidentally
changed as part of the refactoring due to the check being done in different
places and a different one was copied, although no one has even noticed
until now.

If there is a desire to change the semantics of how timestamps are checked
then that should be done in a separate issue as at this point we have lived
with the current semantics for several releases -- all releases of Python 3
still receiving security updates -- so it's past being a bug and is now
the semantics in Python 3.

On Sat, Jan 13, 2018, 16:57 Eli Schwartz,  wrote:

>
> Eli Schwartz  added the comment:
>
> So, a couple of things.
>
> It seems to me, that properly supporting SOURCE_DATE_EPOCH means using
> exactly that and nothing else. To that end, I'm not entirely sure why
> things like --clamp-mtime even exist, as the original timestamp of a source
> file doesn't seem to have a lot of utility and it is better to be entirely
> predictable. But I'm not going to argue that, except insomuch as it seems
> IMHO to fit better for python to just keep things simple and override the
> timestamp with the value of SOURCE_DATE_EPOCH
>
> That being said, I see two problems with python implementing something
> analogous to --clamp-mtime rather than just --mtime.
>
>
> 1) Source files are extracted by some build process, and remain untouched.
> Python generates bytecode pinned to the original time, rather than
> SOURCE_DATE_EPOCH. Later, the build process packages those files and
> implements --mtime, not --clamp-mtime. Because Python and the packaging
> software disagree about which one to use, the bytecode fails.
>
> 2) Source files are extracted, and the build process even tosses all
> timestamps to the side of the road, by explicitly `touch`ing all of them to
> the date of SOURCE_DATE_EPOCH just in case. Then for whatever reason
> (distro patches, 2to3, the use of `cp`) the timestamps get updated to
> $currentime. But SOURCE_DATE_EPOCH is in the future, so the timestamps get
> downdated. Python bytecode is generated by emulating --clamp-mtime. The
> build process then uses --mtime to package the files. Again, because Python
> and the packaging software disagree about which one to use, the bytecode
> fails.
>
> Of course, in both those cases, blindly respecting SOURCE_DATE_EPOCH will
> seemingly break everything for people who use --clamp-mtime instead. I'm
> not happy with reproducible-builds.org for allowing either one.
>
> I don't think python should rely on --mtime users manually overriding the
> filesystem metadata of the source files outside of py_compile, as that is a
> hack that I think we'd like to remove if possible... that being said, Arch
> Linux will, on second thought, not be adversely affected even if py_compile
> tries to be clever and emulate --clamp-mtime to decide on its own whether
> to respect SOURCE_DATE_EPOCH.
>
> Likewise, I don't really expect people to try to reproduce builds using a
> future date for SOURCE_DATE_EPOCH. On the other hand, the reproducible
> builds spec doesn't forbid it AFAICT.
>
> But... neither of those mitigations seem "clean" to me, for the reasons
> stated above.
>
> There is something that would solve all these issues, though. From reading
> the importlib code (I haven't actually tried smoketesting actual imports),
> it appears that Python 2 accepts any bytecode that is dated at or later
> than the timestamp of its source .py, while Python 3 requires the
> timestamps to perfectly match. This seems bizarre to behave differently,
> especially as until @bmwiedemann mentioned it on the GitHub PR I blindly
> assumed that Python would not care if your bytecode is somehow dated later
> than your sources. If the user is playing monkey games with mismatched
> source and byte code, while backdating the source code to *trick* the
> interpreter into loading it... let them? They can break their stuff if they
> want to!
>
> On looking through the commit logs, it seems that Python 3 used to do the
> same, until
> https://github.com/python/cpython/commit/61b14251d3a653548f70350acb250cf23b696372
> refactored the general vicinity and modified this behavior without warning.
> In a commit that seems to be designed to do something else entirely. This
> really should have been two separate commits, and modifying the import code
> to more strictly check the timestamp should have come with an explanatory
> justification. Because I cannot think of a good reason for this behavior,
> and the commit isn't giving me an opportunity to understand either. As it
> is, I am completely confused, and have no idea whether this was even
> supposed to be deliberate.
> In hindsight it is certainly preventing nice solut

[issue29708] support reproducible Python builds

2018-01-13 Thread Eli Schwartz

Eli Schwartz  added the comment:

So, a couple of things.

It seems to me, that properly supporting SOURCE_DATE_EPOCH means using exactly 
that and nothing else. To that end, I'm not entirely sure why things like 
--clamp-mtime even exist, as the original timestamp of a source file doesn't 
seem to have a lot of utility and it is better to be entirely predictable. But 
I'm not going to argue that, except insomuch as it seems IMHO to fit better for 
python to just keep things simple and override the timestamp with the value of 
SOURCE_DATE_EPOCH

That being said, I see two problems with python implementing something 
analogous to --clamp-mtime rather than just --mtime.


1) Source files are extracted by some build process, and remain untouched. 
Python generates bytecode pinned to the original time, rather than 
SOURCE_DATE_EPOCH. Later, the build process packages those files and implements 
--mtime, not --clamp-mtime. Because Python and the packaging software disagree 
about which one to use, the bytecode fails.

2) Source files are extracted, and the build process even tosses all timestamps 
to the side of the road, by explicitly `touch`ing all of them to the date of 
SOURCE_DATE_EPOCH just in case. Then for whatever reason (distro patches, 2to3, 
the use of `cp`) the timestamps get updated to $currentime. But 
SOURCE_DATE_EPOCH is in the future, so the timestamps get downdated. Python 
bytecode is generated by emulating --clamp-mtime. The build process then uses 
--mtime to package the files. Again, because Python and the packaging software 
disagree about which one to use, the bytecode fails.

Of course, in both those cases, blindly respecting SOURCE_DATE_EPOCH will 
seemingly break everything for people who use --clamp-mtime instead. I'm not 
happy with reproducible-builds.org for allowing either one.

I don't think python should rely on --mtime users manually overriding the 
filesystem metadata of the source files outside of py_compile, as that is a 
hack that I think we'd like to remove if possible... that being said, Arch 
Linux will, on second thought, not be adversely affected even if py_compile 
tries to be clever and emulate --clamp-mtime to decide on its own whether to 
respect SOURCE_DATE_EPOCH.

Likewise, I don't really expect people to try to reproduce builds using a 
future date for SOURCE_DATE_EPOCH. On the other hand, the reproducible builds 
spec doesn't forbid it AFAICT.

But... neither of those mitigations seem "clean" to me, for the reasons stated 
above.

There is something that would solve all these issues, though. From reading the 
importlib code (I haven't actually tried smoketesting actual imports), it 
appears that Python 2 accepts any bytecode that is dated at or later than the 
timestamp of its source .py, while Python 3 requires the timestamps to match 
exactly. It seems bizarre for the two to behave differently, especially as, 
until @bmwiedemann mentioned it on the GitHub PR, I blindly assumed that Python 
would not care if your bytecode is somehow dated later than your sources. If 
the user is playing monkey games with mismatched source and bytecode, while 
backdating the source code to *trick* the interpreter into loading it... let 
them? They can break their stuff if they want to!
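
The difference between the two rules can be sketched as follows. This is only an illustration, assuming the 16-byte timestamp .pyc header of CPython 3.7+ (magic, flags, source mtime, source size); the helper names are invented here, and the real loader also compares the source size.

```python
import os
import py_compile
import struct
import tempfile


def recorded_source_mtime(pyc_path):
    """Read the source mtime stored in a timestamp-based .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    # Assumed layout (CPython 3.7+): magic, flags, source mtime, source size.
    flags, mtime, _size = struct.unpack("<III", header[4:16])
    if flags & 0x1:
        raise ValueError("hash-based .pyc: no mtime field to compare")
    return mtime


def source_mtime(py_path):
    """Source mtime truncated to the 32-bit field used in the header."""
    return int(os.stat(py_path).st_mtime) & 0xFFFFFFFF


def strict_match(py_path, pyc_path):
    """Exact-match rule: the recorded and actual mtimes must be equal."""
    return recorded_source_mtime(pyc_path) == source_mtime(py_path)


def lenient_match(py_path, pyc_path):
    """'At or later' rule: bytecode may be dated later than its source."""
    return recorded_source_mtime(pyc_path) >= source_mtime(py_path)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    pyc = os.path.join(tmp, "mod.pyc")
    with open(src, "w") as f:
        f.write("x = 1\n")
    os.utime(src, (1_500_000_000, 1_500_000_000))
    py_compile.compile(
        src, cfile=pyc, doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    fresh = strict_match(src, pyc)

    os.utime(src, (1_400_000_000, 1_400_000_000))  # backdate the source
    strict_after_backdate = strict_match(src, pyc)
    lenient_after_backdate = lenient_match(src, pyc)
```

After the source is backdated, only the lenient rule still accepts the bytecode.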

On looking through the commit logs, it seems that Python 3 used to do the same, 
until 
https://github.com/python/cpython/commit/61b14251d3a653548f70350acb250cf23b696372 
refactored the general vicinity and modified this behavior without warning, in 
a commit that seems to be designed to do something else entirely. This really 
should have been two separate commits, and modifying the import code to check 
the timestamp more strictly should have come with an explanatory 
justification, because I cannot think of a good reason for this behavior and 
the commit gives me no opportunity to understand it either. As it is, I am 
completely confused, and have no idea whether this was even supposed to be 
deliberate. In hindsight it is certainly preventing nice solutions to 
supporting SOURCE_DATE_EPOCH.

--
nosy: +eschwartz

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29708] support reproducible Python builds

2018-01-12 Thread Brett Cannon

Brett Cannon  added the comment:

A disagreement has popped up on the PR currently connected to this issue over 
what the ideal solution is. I'm having the folks involved move the discussion 
over to here.

IMO py_compile can respect SOURCE_DATE_EPOCH and just blindly use it for 
creating .pyc files. That way builds are reproducible. Yes, it will quite 
possibly lead to those .pyc files being regenerated the instant Python starts 
running, but SOURCE_DATE_EPOCH is entirely about builds, not runtimes. Plus, 
.pyc files are just an optimization, so it is not critical that they avoid 
being regenerated later.
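
As a rough sketch of what "blindly use it" could mean: the helper below is hypothetical (the name `pyc_timestamp` and its signature are made up for illustration) and is not the actual py_compile implementation.

```python
import os


def pyc_timestamp(source_mtime, environ=os.environ):
    """Hypothetical helper: pick the timestamp to store in the .pyc header.

    Uses SOURCE_DATE_EPOCH unconditionally when it is set, otherwise the
    real source mtime.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        return int(epoch)
    return int(source_mtime)


# Assumed build environment with an arbitrary example epoch.
build_env = {"SOURCE_DATE_EPOCH": "1515715200"}

# Two builds that touched the source at different times store the same value.
first = pyc_timestamp(1515800000.5, build_env)
second = pyc_timestamp(1515900000.25, build_env)
```

The stored value no longer depends on when the file was touched, at the cost that it may not match the on-disk mtime at runtime.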

--




[issue29708] support reproducible Python builds

2018-01-03 Thread Antoine Pitrou

Change by Antoine Pitrou :


--
nosy: +Ray Donnelly




[issue29708] support reproducible Python builds

2018-01-03 Thread Alexandru Ardelean

Alexandru Ardelean  added the comment:

Thank you for the heads-up.
I did not follow the resolution in much depth.

I just stumbled over this last night.

I will keep an eye out for 3.7, and see about 2.7.

--




[issue29708] support reproducible Python builds

2018-01-02 Thread Benjamin Peterson

Benjamin Peterson  added the comment:

PEP 552 has been implemented for 3.7.
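
For anyone wanting to try it, a small sketch using the `invalidation_mode` argument of `py_compile.compile` (Python 3.7+). The file names and timestamps are arbitrary, and only the 16-byte header is compared here.

```python
import importlib.util
import os
import py_compile
import tempfile


def checked_hash_header(py_path, pyc_path):
    """Compile to a checked hash-based .pyc and return its 16-byte header."""
    py_compile.compile(
        py_path,
        cfile=pyc_path,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )
    with open(pyc_path, "rb") as f:
        return f.read(16)


source = b"x = 1\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    pyc = os.path.join(tmp, "mod.pyc")
    with open(src, "wb") as f:
        f.write(source)

    os.utime(src, (1_400_000_000, 1_400_000_000))
    header_a = checked_hash_header(src, pyc)
    os.utime(src, (1_500_000_000, 1_500_000_000))  # "touch" the source
    header_b = checked_hash_header(src, pyc)

# The last 8 header bytes hold a hash of the source instead of mtime and size.
expected_hash = importlib.util.source_hash(source)
same_header = header_a == header_b
```

Touching the source no longer changes the header, since it depends only on the source contents.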

--




[issue29708] support reproducible Python builds

2018-01-02 Thread Alexandru Ardelean

Alexandru Ardelean  added the comment:

Hey,

Allow me to join the discussion here.

Context:
- I'm the maintainer of Python & Python3 in the OpenWrt distro, and (for a 
while now) we have also cared about reproducible builds.
- The person who's leading the effort for OpenWrt [Alexander Couzens] has 
pinged me about Python(3) and packages [to see about making them reproducible].
- In OpenWrt we *only* ship .pyc files, because of performance considerations 
[.pyc can be 10x faster than .py on some SoCs] and size limitations [we cannot 
allow auto .pyc generation, since it can be expensive on RAM (< 32 MB systems) 
or flash (~8 MB sizes)]; believe it or not, people run Python on something 
like this.

Current status:
- So far I've implemented a simple change to Python & Python3 here:
  https://github.com/openwrt/packages/pull/5303/commits/1b6dd4781f901a769718c49f6f255c15fd376f6e
- That has improved reproducibility quite a bit: only binaries are not 
reproducible now.
- When I did this [1-2 weeks ago] I did not think of checking for any bug/issue 
opened here [I only thought of this now].
- I only checked what other distros may do regarding Python:
  https://tests.reproducible-builds.org/debian/reproducible.html

References:
- initial discussion on OpenWrt: https://github.com/openwrt/packages/issues/5278
- PR with discussion: https://github.com/openwrt/packages/pull/5303
- current OpenWrt reproducible state [with the patch applied]: 
https://tests.reproducible-builds.org/lede/lede_ar71xx.html

I wanted to share my [and our] interest in this.

If we can help in any way, feel free to ping.

I will try to hack/patch some more stuff in the current Python releases to make 
them fully reproducible [for us], and probably share the results here.
When PEP 552 gets implemented and there is a Python release that includes it, 
we will switch to that.
At the moment, in trunk we package Python 2.7.14 & Python 3.6.4.

Thanks
Alex

--
nosy: +Alexandru Ardelean




[issue29708] support reproducible Python builds

2017-09-05 Thread Benjamin Peterson

Benjamin Peterson added the comment:

I have proposed PEP 552 to address this issue.

--
nosy: +benjamin.peterson




[issue29708] support reproducible Python builds

2017-08-31 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +haypo




[issue29708] support reproducible Python builds

2017-03-06 Thread Brett Cannon

Changes by Brett Cannon :


--
nosy: +brett.cannon




[issue29708] support reproducible Python builds

2017-03-04 Thread Chi Hsuan Yen

Changes by Chi Hsuan Yen :


--
nosy: +Chi Hsuan Yen




[issue29708] support reproducible Python builds

2017-03-03 Thread Bernhard M. Wiedemann

Bernhard M. Wiedemann added the comment:

Backports are optional.
They can help reduce duplicated work for the various distributions.
Currently, I think master and 2.7 are the most relevant targets.

--
versions: +Python 3.7




[issue29708] support reproducible Python builds

2017-03-03 Thread Barry A. Warsaw

Barry A. Warsaw added the comment:

Shouldn't this at least also cover Python 3.7?  And should it be officially 
backported?  I would think that if https://github.com/python/cpython/pull/296 
gets accepted for 3.7, then distros that care can cherry pick it back into 
whatever versions they still support.  It probably needn't be officially cherry 
picked upstream.

(FWIW, this doesn't affect the Debian ecosystem since we don't ship pycs in 
debs.)

--




[issue29708] support reproducible Python builds

2017-03-03 Thread Barry A. Warsaw

Changes by Barry A. Warsaw :


--
nosy: +barry




[issue29708] support reproducible Python builds

2017-03-03 Thread Eric V. Smith

Eric V. Smith added the comment:

--
Eric.

> On Mar 3, 2017, at 6:36 AM, Bernhard M. Wiedemann  
> wrote:
> 
> 
> New submission from Bernhard M. Wiedemann:
> 
> See https://reproducible-builds.org/ and 
> https://reproducible-builds.org/docs/buy-in/ for why this is a good thing to 
> have in general.
> 
> Fedora, openSUSE and possibly other Linux distributions package .pyc files as 
> part of their binary rpm packages and they are not trivial to drop [1].
> 
> A .pyc header includes the timestamp of the source .py file
> which creates non-reproducible builds when the .py file is touched during 
> build time (e.g. for a version.py).
> As of 2017-02-10 in openSUSE Factory this affected 476 packages (such as 
> python-amqp and python3-Twisted).
> 
> 
> [1] http://lists.opensuse.org/opensuse-packaging/2017-02/msg00086.html
> 
> --
> components: Build, Distutils
> messages: 20
> nosy: bmwiedemann, dstufft, merwok
> priority: normal
> pull_requests: 353
> severity: normal
> status: open
> title: support reproducible Python builds
> versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6
> 
> ___
> Python tracker 
> 
> ___
> ___
> New-bugs-announce mailing list
> new-bugs-annou...@python.org
> https://mail.python.org/mailman/listinfo/new-bugs-announce
>

--
nosy: +eric.smith




[issue29708] support reproducible Python builds

2017-03-03 Thread Bernhard M. Wiedemann

New submission from Bernhard M. Wiedemann:

See https://reproducible-builds.org/ and 
https://reproducible-builds.org/docs/buy-in/ for why this is a good thing to 
have in general.

Fedora, openSUSE and possibly other Linux distributions package .pyc files as 
part of their binary rpm packages and they are not trivial to drop [1].

A .pyc header includes the timestamp of the source .py file
which creates non-reproducible builds when the .py file is touched during build 
time (e.g. for a version.py).
As of 2017-02-10 in openSUSE Factory this affected 476 packages (such as 
python-amqp and python3-Twisted).
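
A quick way to see the effect, sketched under the assumption of a current CPython (3.7+), where timestamp invalidation has to be requested explicitly and the source mtime sits at bytes 8 to 12 of the header; older versions use a shorter header. The file name and timestamps are arbitrary.

```python
import os
import py_compile
import tempfile


def compile_with_mtime(py_path, pyc_path, mtime):
    """Set the source mtime, compile with timestamp invalidation, return the .pyc bytes."""
    os.utime(py_path, (mtime, mtime))
    py_compile.compile(
        py_path,
        cfile=pyc_path,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc_path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "version.py")
    pyc = os.path.join(tmp, "version.pyc")
    with open(src, "w") as f:
        f.write("__version__ = '1.0'\n")

    # Same source contents, "touched" at two different build times.
    build_a = compile_with_mtime(src, pyc, 1_400_000_000)
    build_b = compile_with_mtime(src, pyc, 1_500_000_000)

builds_differ = build_a != build_b
# Assumed 3.7+ layout: the 4-byte mtime field follows the magic and flags.
mtime_a = int.from_bytes(build_a[8:12], "little")
mtime_b = int.from_bytes(build_b[8:12], "little")
```

The two outputs differ even though the source is byte-for-byte identical.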


[1] http://lists.opensuse.org/opensuse-packaging/2017-02/msg00086.html

--
components: Build, Distutils
messages: 20
nosy: bmwiedemann, dstufft, merwok
priority: normal
pull_requests: 353
severity: normal
status: open
title: support reproducible Python builds
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6
