New submission from STINNER Victor <vstin...@redhat.com>:

See the bpo-29708 meta issue and https://reproducible-builds.org/ for background 
on reproducible builds.

pyc files are not fully reproducible yet: frozenset items are not serialized in 
a deterministic order.

One solution would be to modify marshal to sort frozenset items before 
serializing them. The issue is how to handle items which cannot be compared. 
Example:

>>> l=[float("nan"), b'bytes', 'unicode']
>>> l.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'bytes' and 'float'

One workaround for types which cannot be compared is to use the type name in 
the key used to compare items:

>>> l.sort(key=lambda x: (type(x).__name__, x))
>>> l
[b'bytes', nan, 'unicode']

Note: comparison between bytes and str raises a BytesWarning exception when 
using python3 -bb.

Second problem: how should marshal handle the case where comparing items still 
raises an exception?
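
A minimal sketch of the sorting idea with one possible fallback, assuming it is 
acceptable to order items by repr() when comparison fails. The helper name 
stable_items and the repr() fallback are hypothetical and only illustrate the 
question; this is not what marshal does:

```python
def stable_items(items):
    """Return items in an order that does not depend on hash order (sketch)."""

    def by_type_and_value(x):
        # Group by type name first so that unrelated types are never compared.
        return (type(x).__qualname__, x)

    def by_type_and_repr(x):
        # Hypothetical fallback: order same-type items by their repr().
        return (type(x).__qualname__, repr(x))

    try:
        return sorted(items, key=by_type_and_value)
    except TypeError:
        # Items of the same type which do not support "<" (ex: complex).
        return sorted(items, key=by_type_and_repr)


print(stable_items(frozenset({b'bytes', 'unicode', 1.5})))
print(stable_items(frozenset({2j, 1j})))
```

The fallback is only deterministic if repr() itself is: nested frozensets or 
objects whose repr() includes an address would still need another strategy, and 
NaN values are not ordered consistently by "<" either.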


Another solution would be to use the PYTHONHASHSEED environment variable. For 
example, if SOURCE_DATE_EPOCH is set, PYTHONHASHSEED would be set to 0. This 
option is not my favorite because it disables hash randomization, the security 
fix against denial-of-service attacks on dict and set:
https://python-security.readthedocs.io/vuln/hash-dos.html
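
To see the effect of PYTHONHASHSEED on the marshal output, one can dump the 
same frozenset in child interpreters started with a chosen seed. This is a 
rough sketch: the helper name dump_with_seed and the sample frozenset are only 
for illustration.

```python
import os
import subprocess
import sys

# Code run in the child interpreter: marshal a small frozenset of strings.
SNIPPET = "import marshal; print(marshal.dumps(frozenset({'a', 'b', 'c'})).hex())"


def dump_with_seed(seed):
    """Marshal the sample frozenset in a child process using PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Same seed: the str hashes are identical in both runs, so the bytes match.
print(dump_with_seed(0) == dump_with_seed(0))
# Different seeds: the bytes may differ if marshal writes items in iteration order.
print(dump_with_seed(1) == dump_with_seed(2))
```

A fixed seed makes the output stable for a given Python build, at the cost of 
predictable hashes in that process.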

--

Previous discussions on reproducible frozenset:

* https://mail.python.org/pipermail/python-dev/2018-July/154604.html
* https://bugs.python.org/issue34093#msg321523

See also bpo-34093: "Reproducible pyc: FLAG_REF is not stable" and PEP 552 
"Deterministic pycs".

----------
components: Interpreter Core
messages: 347969
nosy: vstinner
priority: normal
severity: normal
status: open
title: Reproducible pyc: frozenset is not serialized in a deterministic order
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37596>
_______________________________________