Re: RFC: Python minimization in Fedora
On 15. 01. 20 23:59, Zbigniew Jędrzejewski-Szmek wrote:
> > ### Solution 5: Stop shipping mandatory bytecode cache
> >
> > This solution sounds simple: We no longer ship the bytecode cache
> > mandatorily. Technically, we move the `.pyc` files to a subpackage
> > of `python3-libs` (or three different subpackages, that is not
> > important here). And we only *Recommend* them from `python3-libs` --
> > by default, the users get them, but for space critical Fedora
> > flavors (such as container images) the maintainers can opt out and
> > so can the power users.
> >
> > This would **save 18.6 MiB / 50%** -- quite a lot.
> >
> > However, as said earlier, if the bytecode cache files are not there,
> > Python attempts to create them upon first import. That can result in
> > several problems; here we will try to propose how to work around
> > them.
>
> Below using a flag file in each __pycache__ directory is suggested.
> What about a different route: having a flag file for all descendants
> of a directory? For example, /usr/lib/python3.8/.dont_write_bytecode
> would cover all modules under /usr/lib/python3.8/.
> If a .pyc file is present, python could still make use of it.
>
> This would be a nicer solution because it wouldn't require modifying
> individual packages, but would still avoid the selinux issues and
> slowdowns from failed attempts to write the optimized files.
> The __pycache__ files wouldn't need to exist at all.

To follow up on this, I got an idea recently. If we add a reason to this marker file, Python can warn properly, without distro-specific patches. Something like:

    echo "Install python3-libs-bytecode-opt-0 or python3-libs-bytecode-opt-1 to get the cache." > /usr/lib64/python3.8/.dont_write_bytecode

    python -O ...
    Warning: Bytecode cache for the selected optimization level was not found and the /usr/lib64/python3.8/.dont_write_bytecode file prevents it from being created. Python startup and imports may be slower. Install python3-libs-bytecode-opt-0 or python3-libs-bytecode-opt-1 to get the cache.
-- Miro Hrončok -- Phone: +420777974800 IRC: mhroncok ___ python-devel mailing list -- python-devel@lists.fedoraproject.org To unsubscribe send an email to python-devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/python-devel@lists.fedoraproject.org
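A minimal sketch of how such a marker lookup could work, assuming a hypothetical `.dont_write_bytecode` file whose contents are the reason text. Nothing here is an existing CPython feature; `MARKER_NAME` and `find_marker` are names made up for illustration. The `lru_cache` follows the suggestion made elsewhere in this thread to remember directories that were already checked.

```python
import functools
import os
import tempfile

MARKER_NAME = ".dont_write_bytecode"  # hypothetical marker name proposed in this thread


@functools.lru_cache(maxsize=None)
def find_marker(directory):
    """Return (marker_path, reason) for the nearest marker at or above
    `directory`, or None when no ancestor directory carries one."""
    candidate = os.path.join(directory, MARKER_NAME)
    if os.path.isfile(candidate):
        with open(candidate, encoding="utf-8") as f:
            return candidate, f.read().strip()
    parent = os.path.dirname(directory)
    if parent == directory:  # reached the filesystem root
        return None
    # Each directory is cached separately, so sibling packages reuse the answer.
    return find_marker(parent)


def demo():
    """Create a marker two levels above a package directory and look it up."""
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        deep = os.path.join(root, "pkg", "sub")
        os.makedirs(deep)
        reason = ("Install python3-libs-bytecode-opt-0 or "
                  "python3-libs-bytecode-opt-1 to get the cache.")
        with open(os.path.join(root, MARKER_NAME), "w", encoding="utf-8") as f:
            f.write(reason + "\n")
        return find_marker(deep), root
```

A real importer would presumably print the reason as part of the warning only when no matching `.pyc` exists; that part is left out here because the thread does not settle how the warning would be emitted.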
Re: RFC: Python minimization in Fedora
On Sat, Jan 18, 2020 at 03:35:29PM -0500, James Cassell wrote:
>
> On Thu, Jan 16, 2020, at 5:16 AM, Zbigniew Jędrzejewski-Szmek wrote:
> >
> > A quick benchmark:
> > $ time python3 -c 'import importlib as i, pydoc_data.topics as t;
> > [i.reload(t) for _ in range(10000)]'
> > python3 -c 4.16s user 0.45s system 99% cpu 4.646 total
> [...]
> > sudo rm /usr/lib64/python3.7/pydoc_data/__pycache__/topics.cpython-37.*
> >
> > $ time python3 -c 'import importlib as i, pydoc_data.topics as t;
> > [i.reload(t) for _ in range(1000)]'
> > python3 -c 13.73s user 0.46s system 96% cpu 14.728 total
> [...]
> > But the effect of having *some* .pyc file is not. For this file (which
> > is 600+kb), the difference is 147.28/4.646 ≈ 30 times. So we clearly
> > need to keep the possibility of installing .pyc files, at least optionally.
> >
>
> Thanks for doing these benchmarks! I think you misplaced a decimal in the
> analysis, though; it's closer to 3 times performance difference, not 30
> times. (Unless I missed something.)

The number of loops is different (10k vs 1k), so the ratio I posted is correct.

Zbyszek
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
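The arithmetic behind the "10k vs 1k" remark can be checked by normalizing both timings to seconds per reload. This is only a worked example using the wall-clock totals quoted above and the loop counts stated in the reply; `per_reload_seconds` is a helper name introduced here.

```python
def per_reload_seconds(total_seconds, loops):
    """Wall-clock time divided by the number of reload() calls."""
    return total_seconds / loops


with_pyc = per_reload_seconds(4.646, 10_000)     # .pyc present, 10k reloads
without_pyc = per_reload_seconds(14.728, 1_000)  # .pyc removed, 1k reloads

naive_ratio = 14.728 / 4.646          # compares totals, ignoring loop counts
slowdown = without_pyc / with_pyc     # compares like with like

print(f"naive ratio of totals: {naive_ratio:.2f}")
print(f"per-reload slowdown:   {slowdown:.1f}")
```

Comparing the totals directly gives roughly 3, which is where the "misplaced decimal" impression comes from; per reload the factor is roughly 30.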
Re: RFC: Python minimization in Fedora
On Thu, Jan 16, 2020, at 5:16 AM, Zbigniew Jędrzejewski-Szmek wrote:
>
> A quick benchmark:
> $ time python3 -c 'import importlib as i, pydoc_data.topics as t;
> [i.reload(t) for _ in range(10000)]'
> python3 -c 4.16s user 0.45s system 99% cpu 4.646 total
[...]
> sudo rm /usr/lib64/python3.7/pydoc_data/__pycache__/topics.cpython-37.*
>
> $ time python3 -c 'import importlib as i, pydoc_data.topics as t;
> [i.reload(t) for _ in range(1000)]'
> python3 -c 13.73s user 0.46s system 96% cpu 14.728 total
[...]
> But the effect of having *some* .pyc file is not. For this file (which
> is 600+kb), the difference is 147.28/4.646 ≈ 30 times. So we clearly
> need to keep the possibility of installing .pyc files, at least optionally.
>

Thanks for doing these benchmarks! I think you misplaced a decimal in the analysis, though; it's closer to 3 times performance difference, not 30 times. (Unless I missed something.)

V/r,
James Cassell
Re: RFC: Python minimization in Fedora
> On 15 Jan 2020, at 17:05, Miro Hrončok wrote:
>
> In Python Maint, we sat down and we came up with several ideas how to
> minimize the filesystem footprint of Python. Unfortunately, the result is
> horribly long, sorry about that.

Did you calculate file sizes including rounding up by the "filesystem block size" (statvfs f_bsize)? What was the f_bsize of the file system you collected stats on?

The work to stop needing libpython is going to drop the number of files by 1 for the min install.

Can you link some of the .so's from stdlib into the main python image? If all stdlib .so files are linked into the main python and you have a .zip for the .py/.pyc files, you can get python down to a handful of files.

Python app-making software often exploits a trick where you concatenate the .zip onto the end of the python image. I'm guessing that would break too many of the constraints.

I'm not sure how you would do it, but what if you created a SquashFS image for python to lose the f_bsize overhead and use its compression?

Today python has 2 optimised file types. But the python devs have been talking about ways to implement more optimisations and cache those results as well. I failed to track down the discussion on python-dev. I recall they wanted to reduce the file size by storing line number data for tracebacks outside of the .pyc.

Barry
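The block-size question can be made concrete with a small sketch that sums file sizes both as apparent bytes and rounded up to whole blocks. It assumes a Unix system for `os.statvfs`; `on_disk_size` and `tree_sizes` are names introduced here, and real allocation can differ from this estimate (hardlinks, compression, and filesystems that pack small files are all ignored).

```python
import os
import tempfile


def on_disk_size(size, block_size):
    """Round a file size up to a whole number of filesystem blocks."""
    return -(-size // block_size) * block_size  # integer ceiling division


def tree_sizes(root, block_size=None):
    """Return (apparent_bytes, rounded_bytes, file_count) for regular files under root."""
    if block_size is None:
        block_size = os.statvfs(root).f_bsize  # Unix only
    apparent = rounded = count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            size = os.path.getsize(path)
            apparent += size
            rounded += on_disk_size(size, block_size)
            count += 1
    return apparent, rounded, count


def demo():
    """Three tiny files: the rounded total is dominated by block overhead."""
    with tempfile.TemporaryDirectory() as root:
        for i in range(3):
            with open(os.path.join(root, f"m{i}.py"), "wb") as f:
                f.write(b"x" * 10)
        return tree_sizes(root, block_size=4096)
```

With many small `.py` and `.pyc` files, the gap between the two totals is the overhead the question is about; a single archive or image would pay it only once.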
Re: RFC: Python minimization in Fedora
On 16. 01. 20 21:55, David Malcolm wrote:
> If a traceback for an exception includes files from the .zip, can the
> traceback-printing machinery still print the pertinent lines of source?

Apparently no:

    $ echo 0/0 > t.py
    $ zip t.zip t.py
      adding: t.py (stored 0%)
    $ python -c 'import t'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/churchyard/tmp/test/t.py", line 1, in <module>
        0/0
    ZeroDivisionError: division by zero
    $ rm t.py
    $ python -c 'import sys; sys.path.insert(0, "t.zip"); import t'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
      File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
      File "t.zip/t.py", line 1, in <module>
    ZeroDivisionError: division by zero

That's bad UX. But maybe something that can be fixed in Python?

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
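As a hedged illustration of why this looks fixable: the source text is still retrievable from the archive through the module's loader, and the standard `linecache` module consults that loader when it is given the module's globals. The sketch below only demonstrates that lookup path. It does not explain why the line was missing in the session above, which may depend on how the interpreter's default traceback display reads source in that Python version. The module name `zipdemo_t` and the helper `source_line_from_zip` are made up for the example.

```python
import importlib
import linecache
import os
import sys
import tempfile
import zipfile


def source_line_from_zip():
    """Import a module from a zip archive and fetch one of its source lines."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "t.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("zipdemo_t.py", "def boom():\n    return 0/0\n")
        sys.path.insert(0, archive)
        try:
            mod = importlib.import_module("zipdemo_t")
            # The zipimport loader can return the module source as a string.
            source = mod.__loader__.get_source("zipdemo_t")
            # The path inside the archive is not a real file, so linecache
            # falls back to the loader found via the module globals.
            line = linecache.getline(mod.__file__, 2, mod.__dict__)
        finally:
            sys.path.remove(archive)
            sys.modules.pop("zipdemo_t", None)
    return source, line
```

Calling `source_line_from_zip()` returns the full source and the second line of the zipped module, even though no `zipdemo_t.py` exists on disk.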
Re: RFC: Python minimization in Fedora
On Thursday, 16 January 2020 at 22:00 +0100, Felix Schwarz wrote:
>
> If I understood Nicolas correctly this was about installing multiple
> versions of the same *library* in the global Python site-packages
> directory?

Whatever you wish to call it :) The non-stdlib parts whose versions projects cannot seem to agree on, which forces venv use.

Regards,

--
Nicolas Mailhot
Re: RFC: Python minimization in Fedora
On 16.01.20 at 21:15, Zbigniew Jędrzejewski-Szmek wrote:
>> Accommodating component versioning would mean deploying
>>
>> /usr/lib/pythonxx/site-packages/something-semver.zip
>
> This path includes xx, which contains the major and minor numbers. So
> adding "semver" would only allow accommodating different patch levels.
> Would that be useful? Different patch levels are supposed to be about
> bug fix only changes, so there's usually very little reason to carry
> anything except the latest one for any specific major.minor combination.

If I understood Nicolas correctly this was about installing multiple versions of the same *library* in the global Python site-packages directory?

Felix
Re: RFC: Python minimization in Fedora
On Thu, 2020-01-16 at 10:27 +0100, Miro Hrončok wrote:
> On 15. 01. 20 23:11, Victor Stinner wrote:
> > > Solution 4: ZIP the entire standard library
> > > (...)
> > > Nevertheless, this might (in theory) **save 17.8 MiB / 47 %**.
> >
> > It's my favorite option. Almost 50% smaller is quite good! It would be
> > very efficient to have such disk space gain!
> >
> > Using a ZIP file for the stdlib is a commonly suggested solution when
> > the slow Python startup time is discussed. Python does tons of system
> > calls to load stdlib modules at startup: many stat() and open() calls.
> > Having a single large ZIP file allows doing more work in pure
> > userland.
> >
> > This solution is well supported by unmodified Python: it's part of the
> > default sys.path search path:
> >
> > $ python3
> > Python 3.7.6 (default, Dec 19 2019, 22:52:49)
> > >>> import sys; sys.path
> > ['', '/usr/lib64/python37.zip', ...]
> >
> > It's the second item of sys.path ;-)
>
> It is, yet modules in the standard library still do read files next to
> __file__ and will blow up when zipped. That makes me believe we can put
> some modules into /usr/lib64/python38.zip, but not the entire unmodified
> stdlib at this moment. We can certainly work towards this goal if we get
> somebody to drive it.
>
> > I'm ok to discourage users from overriding *system files* by modifying
> > them as root. It's too easy to mess up your system this way.
>
> Discouraging users is hard. We discourage users from using sudo pip and
> yet **you** still do it Victor :D
>
> > It is easy to extract the ZIP file in your home directory, hack some
> > files and use the PYTHONPATH environment variable to force loading
> > your modified stdlib.
> >
> > * faster startup
> > * less disk space
> > * harder to mess up your system
> >
> > Where are drawbacks by the way? ;-)
>
> Behind the corner.

If a traceback for an exception includes files from the .zip, can the traceback-printing machinery still print the pertinent lines of source?

Dave
Re: RFC: Python minimization in Fedora
On Thu, Jan 16, 2020 at 03:36:11PM +0100, Nicolas Mailhot via devel wrote:
> On 2020-01-16 15:10, Felix Schwarz wrote:
> > On 16.01.20 at 13:37, Nicolas Mailhot via devel wrote:
> > > If we start messing with the Python tree it would be nice to put
> > > each shared python component in a separate zip/xz/whatever, and
> > > allow versioning those archives
> > >
> > > (ie use the highest semver zip present unless the code explicitly
> > > requests another version, and this version is available on the
> > > filesystem)
> > >
> > > That would heal the breach between venv users and Fedora/rpm.
> > > We’re alienating a lot of users, because un-versioned python
> > > components do not permit the version divergence that some third
> > > party software requires
> >
> > Could you give a specific example? Even though my $DAYJOB is mostly
> > about working with Python I don't have a clue which "un-versioned
> > python components" you are referring to.
>
> Right now we (in Fedora) deploy things like
>
> /usr/lib/pythonxx/site-packages/something
>
> That means only one something may exist on-disk at a given time.
> Python users work around this with venvs and blame rpm and Fedora for
> making only a single something possible.
>
> Accommodating component versioning would mean deploying
>
> /usr/lib/pythonxx/site-packages/something-semver.zip

This path includes xx, which contains the major and minor numbers. So adding "semver" would only allow accommodating different patch levels. Would that be useful? Different patch levels are supposed to be about bug fix only changes, so there's usually very little reason to carry anything except the latest one for any specific major.minor combination.

Zbyszek
Re: RFC: Python minimization in Fedora
On Thu, Jan 16, 2020 at 2:13 PM Christian Glombek wrote:
>
> On a side note (and without reading all of the above in detail), I'd like to
> note that Fedora CoreOS (aka FCOS) is completely Python free by now -
> probably not achievable for Desktop, but it may be for IoT.
>

Unfortunately neither FCOS nor Silverblue are terribly useful options right now. Most people aren't doing work in containers because using containers that way is too hard or too brittle. It's also important to note that the major trade-off those systems make is not one everyone is willing to accept: a massive expansion of storage usage.

--
真実はいつも一つ!/ Always, there's only one truth!
Re: RFC: Python minimization in Fedora
On a side note (and without reading all of the above in detail), I'd like to note that Fedora CoreOS (aka FCOS) is completely Python free by now - probably not achievable for Desktop, but it may be for IoT.

Miroslav Suchý wrote on Thu, 16 Jan 2020, 16:35:

> On 15. 01. 20 at 18:05, Miro Hrončok wrote:
> > ### Solution 2: Move developer oriented modules to python3-devel (or
> > split the stdlib into pieces)
>
> +1
>
> > ### Solution 5: Stop shipping mandatory bytecode cache
>
> +1
>
> > Problem 5.1: Slower starts without bytecode cache
>
> Especially in the container world, the applications are Flask or Django
> applications and the slower startup is not IMHO important. We are
> speaking about fractions of seconds. Run-time speed will not be
> affected.
>
> Desktop applications do not need to minimize storage requirements; they
> get the recommended pyc files and will not be affected at all.
>
> --
> Miroslav Suchy, RHCA
> Red Hat, Associate Manager ABRT/Copr, #brno, #fedora-buildsys
Re: RFC: Python minimization in Fedora
On 15. 01. 20 at 18:05, Miro Hrončok wrote:
> ### Solution 2: Move developer oriented modules to python3-devel (or split
> the stdlib into pieces)

+1

> ### Solution 5: Stop shipping mandatory bytecode cache

+1

> Problem 5.1: Slower starts without bytecode cache

Especially in the container world, the applications are Flask or Django applications and the slower startup is not IMHO important. We are speaking about fractions of seconds. Run-time speed will not be affected.

Desktop applications do not need to minimize storage requirements; they get the recommended pyc files and will not be affected at all.

--
Miroslav Suchy, RHCA
Red Hat, Associate Manager ABRT/Copr, #brno, #fedora-buildsys
Re: RFC: Python minimization in Fedora
On 2020-01-16 15:10, Felix Schwarz wrote:
> On 16.01.20 at 13:37, Nicolas Mailhot via devel wrote:
> > If we start messing with the Python tree it would be nice to put each
> > shared python component in a separate zip/xz/whatever, and allow
> > versioning those archives
> >
> > (ie use the highest semver zip present unless the code explicitly
> > requests another version, and this version is available on the
> > filesystem)
> >
> > That would heal the breach between venv users and Fedora/rpm. We’re
> > alienating a lot of users, because un-versioned python components do
> > not permit the version divergence that some third party software
> > requires
>
> Could you give a specific example? Even though my $DAYJOB is mostly
> about working with Python I don't have a clue which "un-versioned
> python components" you are referring to.

Right now we (in Fedora) deploy things like

/usr/lib/pythonxx/site-packages/something

That means only one something may exist on-disk at a given time. Python users work around this with venvs and blame rpm and Fedora for making only a single something possible.

Accommodating component versioning would mean deploying

/usr/lib/pythonxx/site-packages/something-semver.zip

Regards,

--
Nicolas Mailhot
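To make the selection rule concrete ("use the highest semver zip present unless the code explicitly requests another version, and this version is available"), here is a purely illustrative sketch. No such mechanism exists in Python or Fedora today; `pick_archive`, the `name-X.Y.Z.zip` naming pattern, and the fallback to the highest version when the requested one is missing are all assumptions of this example. Only plain numeric versions are handled, not full semver with pre-release tags.

```python
import re

# Hypothetical naming scheme: <component>-<dotted numeric version>.zip
_ARCHIVE = re.compile(r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)\.zip$")


def _parse(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_archive(filenames, component, requested=None):
    """Pick the archive for `component`: the requested version if present,
    otherwise the highest version found, or None if nothing matches."""
    candidates = {}
    for filename in filenames:
        match = _ARCHIVE.match(filename)
        if match and match.group("name") == component:
            candidates[_parse(match.group("version"))] = filename
    if not candidates:
        return None
    if requested is not None and _parse(requested) in candidates:
        return candidates[_parse(requested)]
    return candidates[max(candidates)]


files = ["something-1.9.0.zip", "something-1.10.0.zip", "other-3.0.0.zip"]
print(pick_archive(files, "something"))           # highest version present
print(pick_archive(files, "something", "1.9.0"))  # explicit request honored
```

Comparing parsed tuples rather than strings matters here: as strings, "1.9.0" would sort above "1.10.0".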
Re: RFC: Python minimization in Fedora
On 16.01.20 at 13:37, Nicolas Mailhot via devel wrote:
> If we start messing with the Python tree it would be nice to put each shared
> python component in a separate zip/xz/whatever, and allow versioning those
> archives
>
> (ie use the highest semver zip present unless the code explicitly requests
> another version, and this version is available on the filesystem)
>
> That would heal the breach between venv users and Fedora/rpm. We’re alienating
> a lot of users, because un-versioned python components do not permit the
> version divergence that some third party software requires

Could you give a specific example? Even though my $DAYJOB is mostly about working with Python I don't have a clue which "un-versioned python components" you are referring to.

Felix
Re: RFC: Python minimization in Fedora
On 2020-01-16 11:18, Felix Schwarz wrote:
> On 15.01.20 at 23:11, Victor Stinner wrote:
> > This solution is well supported by unmodified Python: it's part of the
> > default sys.path search path:
> >
> > $ python3
> > Python 3.7.6 (default, Dec 19 2019, 22:52:49)
> > >>> import sys; sys.path
> > ['', '/usr/lib64/python37.zip', ...]
> >
> > It's the second item of sys.path ;-)
>
> Also CPython provides an "embedded" variant in the downloads (IIRC for
> Windows, everything in a zip file without installer) which provides its
> standard library in a zip file by default. The standard library in these
> zip files is only a subset of the regular stdlib so that might be a good
> starting point to see which modules could be zipped without modification.

If we start messing with the Python tree it would be nice to put each shared python component in a separate zip/xz/whatever, and allow versioning those archives

(ie use the highest semver zip present unless the code explicitly requests another version, and this version is available on the filesystem)

That would heal the breach between venv users and Fedora/rpm. We’re alienating a lot of users, because un-versioned python components do not permit the version divergence that some third party software requires

Regards,

--
Nicolas Mailhot
Re: RFC: Python minimization in Fedora
On 16. 01. 20 11:16, Zbigniew Jędrzejewski-Szmek wrote:
> So we clearly need to keep the possibility of installing .pyc files,
> at least optionally.

To clarify: We would keep the .pyc files by default. We would only provide an opt-out.

> No, it will not (TTBMK). The file (or the lack of it) will be cached in
> the dentry cache, so the kernel will give an answer extremely quickly.
> And the python process can easily store the directories it checked in
> a lru_cache or something like that, to avoid the round trip to the
> kernel.

Good. I'll update the document to mention the possibility.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: RFC: Python minimization in Fedora
On 15.01.20 at 23:11, Victor Stinner wrote:
> This solution is well supported by unmodified Python: it's part of the
> default sys.path search path:
>
> $ python3
> Python 3.7.6 (default, Dec 19 2019, 22:52:49)
> >>> import sys; sys.path
> ['', '/usr/lib64/python37.zip', ...]
>
> It's the second item of sys.path ;-)

Also CPython provides an "embedded" variant in the downloads (IIRC for Windows, everything in a zip file without installer) which provides its standard library in a zip file by default. The standard library in these zip files is only a subset of the regular stdlib so that might be a good starting point to see which modules could be zipped without modification.

Felix
Re: RFC: Python minimization in Fedora
On Thu, Jan 16, 2020 at 10:33:32AM +0100, Miro Hrončok wrote:
> On 15. 01. 20 23:59, Zbigniew Jędrzejewski-Szmek wrote:
> >On Wed, Jan 15, 2020 at 06:05:42PM +0100, Miro Hrončok wrote:
> >>### File types (and bytecode caches)
> >>
> >>The orthogonal dimension is the file type. Python standard library
> >>contains directories with both "extension modules" (written in C
> >>(usually) and compiled to `*.cpython-38-x86_64-linux-gnu.so` shared
> >>object file) and "pure Python" modules (written in Python and saved
> >>as `*.py` source file).
> >>
> >>Each pure Python module comes in 4 files:
> >>
> >>- `module.py` -- the source
> >>- `__pycache__/module.cpython-38.pyc` -- regular (not optimized) bytecode
> >>cache
> >>- `__pycache__/module.cpython-38.opt-1.pyc` -- optimized bytecode cache
> >>(level 1)
> >>- `__pycache__/module.cpython-38.opt-2.pyc` -- optimized bytecode cache
> >>(level 2)
> >
> >I suspect that the difference in speed between loading various .pyc
> >files is negligible. Do you have actual benchmarks for this?
>
> Loading time is theoretically faster for smaller files. Generally,
> the opt-2 files in the stdlib are a bit smaller, but the opt-1 are
> not. Technically, I agree that the loading time difference is
> negligible.

A quick benchmark:

$ time python3 -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1)]'
python3 -c 4.16s user 0.45s system 99% cpu 4.646 total
$ time python3 -O -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1)]'
python3 -O -c 4.01s user 0.45s system 99% cpu 4.492 total
$ time python3 -OO -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1)]'
python3 -OO -c 3.97s user 0.42s system 98% cpu 4.467 total

$ sudo rm /usr/lib64/python3.7/pydoc_data/__pycache__/topics.cpython-37.*
$ time python3 -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1000)]'
python3 -c 13.73s user 0.46s system 96% cpu 14.728 total
$ time python3 -O -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1000)]'
python3 -O -c 13.01s user 0.33s system 98% cpu 13.480 total
$ time python3 -OO -c 'import importlib as i, pydoc_data.topics as t; [i.reload(t) for _ in range(1000)]'
python3 -OO -c 11.95s user 0.15s system 99% cpu 12.156 total

So... the benefit from -O and -OO is negligible for most scenarios.

(Note that here the page cache is hot, so what is being measured is the time Python spends doing CPU crunching. Normally, the latency to load the file from disk would be in play, and this latency would be the same for files of similar size. The difference of a few percent between opt levels would become negligible. And note that this is a particularly big file, so the time required for parsing would be even less important for small files, which are much more common.)

But the effect of having *some* .pyc file is not. For this file (which is 600+ kB), the difference is 147.28/4.646 ≈ 30 times. So we clearly need to keep the possibility of installing .pyc files, at least optionally.

> But no, we didn't do any benchmarking (yet anyway) at the scale of
> the current document, that would take a lot of time and energy. The
> plan is to only do them for solutions we actually decide to go for
> (but only if we anticipate a change -- for example not with the
> hardlink-based deduplication, but yes with the zipped stdlib).
>
> >>### Solution 5: Stop shipping mandatory bytecode cache
> >>
> >>This solution sounds simple: We do no longer ship the bytecode cache
> >>mandatorily. Technically, we move the `.pyc` files to a subpackage
> >>of `python3-libs` (or three different subpackages, that is not
> >>important here). And we only *Recommend* them from `python3-libs` --
> >>by default, the users get them, but for space critical Fedora
> >>flavors (such as container images) the maintainers can opt-out and
> >>so can the powerusers.
> >>
> >>This would **save 18.6 MiB / 50%** -- quite a lot.
> >>
> >>However, as said earlier, if the bytecode cache files are not there,
> >>Python attempts to create them upon first import. That can result in
> >>several problems, here we will try to propose how to workaround
> >>them.
> >
> >Below using a flag file in each __pycache__ directory is suggested.
> >What about a different route: having a flag file for all descendants
> >of a directory?
>
> The idea was to avoid traversing up, as that can potentially slow
> down Python invocation from a deep PATH. But yes, that is possible
> as well.

No, it will not (TTBMK). The file (or the lack of it) will be cached in the dentry cache, so the kernel will give an answer extremely quickly. And the python process can easily store the directories it checked in a lru_cache or something like that, to avoid the round trip to the kernel.

> >For example, /usr/lib/python3.8/.dont_write_bytecode
> >would cover all modules under /usr/lib/python3.8/.
> >If a .pyc file is present,
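The cached upward lookup suggested in this message could look roughly like the sketch below. It is only an illustration under stated assumptions: CPython has no such mechanism, the function name `bytecode_writes_blocked` is hypothetical, and `.dont_write_bytecode` is simply the marker name proposed in the thread. Because the function recurses through the cache, every ancestor directory is checked at most once per process.

```python
import functools
import os
import tempfile

FLAG_NAME = ".dont_write_bytecode"  # marker name proposed in the thread


@functools.lru_cache(maxsize=None)
def bytecode_writes_blocked(directory: str) -> bool:
    """Return True if directory or any of its ancestors contains the marker file."""
    if os.path.exists(os.path.join(directory, FLAG_NAME)):
        return True
    parent = os.path.dirname(directory)
    if parent == directory:  # reached the filesystem root
        return False
    return bytecode_writes_blocked(parent)  # ancestors are cached as well


with tempfile.TemporaryDirectory() as root:
    root = os.path.realpath(root)
    lib = os.path.join(root, "lib", "python3.8")
    package = os.path.join(lib, "email", "mime")
    elsewhere = os.path.join(root, "home", "user")
    os.makedirs(package)
    os.makedirs(elsewhere)
    open(os.path.join(lib, FLAG_NAME), "w").close()

    blocked = bytecode_writes_blocked(package)
    unblocked = bytecode_writes_blocked(elsewhere)
    bytecode_writes_blocked(os.path.join(lib, "email"))  # answered from the cache
    cache_hits = bytecode_writes_blocked.cache_info().hits

print(blocked, unblocked, cache_hits)
```

A real implementation would also have to decide how to invalidate the cache when a marker file appears or disappears while the process is running; the sketch ignores that.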
Re: RFC: Python minimization in Fedora
On 15. 01. 20 23:59, Zbigniew Jędrzejewski-Szmek wrote:
> On Wed, Jan 15, 2020 at 06:05:42PM +0100, Miro Hrončok wrote:
>> ### File types (and bytecode caches)
>>
>> The orthogonal dimension is the file type. Python standard library
>> contains directories with both "extension modules" (written in C
>> (usually) and compiled to `*.cpython-38-x86_64-linux-gnu.so` shared
>> object file) and "pure Python" modules (written in Python and saved
>> as `*.py` source file).
>>
>> Each pure Python module comes in 4 files:
>>
>> - `module.py` -- the source
>> - `__pycache__/module.cpython-38.pyc` -- regular (not optimized) bytecode cache
>> - `__pycache__/module.cpython-38.opt-1.pyc` -- optimized bytecode cache (level 1)
>> - `__pycache__/module.cpython-38.opt-2.pyc` -- optimized bytecode cache (level 2)
>
> I suspect that the difference in speed between loading various .pyc
> files is negligible. Do you have actual benchmarks for this?

Loading time is theoretically faster for smaller files. Generally, the opt-2 files in the stdlib are a bit smaller, but the opt-1 are not. Technically, I agree that the loading time difference is negligible.

But no, we didn't do any benchmarking (yet anyway) at the scale of the current document, that would take a lot of time and energy. The plan is to only do them for solutions we actually decide to go for (but only if we anticipate a change -- for example not with the hardlink-based deduplication, but yes with the zipped stdlib).

>> ### Solution 5: Stop shipping mandatory bytecode cache
>>
>> This solution sounds simple: We do no longer ship the bytecode cache
>> mandatorily. Technically, we move the `.pyc` files to a subpackage
>> of `python3-libs` (or three different subpackages, that is not
>> important here). And we only *Recommend* them from `python3-libs` --
>> by default, the users get them, but for space critical Fedora
>> flavors (such as container images) the maintainers can opt-out and
>> so can the powerusers.
>>
>> This would **save 18.6 MiB / 50%** -- quite a lot.
>>
>> However, as said earlier, if the bytecode cache files are not there,
>> Python attempts to create them upon first import. That can result in
>> several problems, here we will try to propose how to workaround
>> them.
>
> Below using a flag file in each __pycache__ directory is suggested.
> What about a different route: having a flag file for all descendants
> of a directory?

The idea was to avoid traversing up, as that can potentially slow down Python invocation from a deep PATH. But yes, that is possible as well.

> For example, /usr/lib/python3.8/.dont_write_bytecode
> would cover all modules under /usr/lib/python3.8/.
> If a .pyc file is present, python could still make use of it.
>
> This would be a nicer solution because it wouldn't require modifying
> individual packages, but would still avoid the selinux issues and
> slowdowns from failed attempts to write the optimized files. The
> __pycache__ files wouldn't need to exist at all.

Correct.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
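The size claim about the three optimization levels can be checked on a small scale with the standard `py_compile` module. The following is a rough sketch, not a benchmark of the stdlib: the sample module is made up for the illustration, and it deliberately contains both docstrings (stripped at level 2) and an `assert` (stripped at levels 1 and 2), which real stdlib modules may not.

```python
import os
import py_compile
import tempfile

# Made-up sample module; real stdlib modules contain far fewer asserts.
SOURCE = '''"""Module docstring that optimization level 2 strips."""


def checked_double(x):
    """Function docstring, also stripped at level 2."""
    assert x >= 0, "removed at optimization levels 1 and 2"
    return 2 * x
'''


def pyc_sizes(source: str) -> dict:
    """Compile source at optimization levels 0-2 and return the .pyc sizes in bytes."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.py")
        with open(src, "w", encoding="utf-8") as f:
            f.write(source)
        for level in (0, 1, 2):
            cfile = os.path.join(tmp, f"sample.opt-{level}.pyc")
            py_compile.compile(src, cfile=cfile, optimize=level, doraise=True)
            sizes[level] = os.path.getsize(cfile)
    return sizes


sizes = pyc_sizes(SOURCE)
print(sizes)
```

Running the same function over real stdlib sources would be one way to reproduce the per-level numbers from the document, though that is left as an exercise here.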
Re: RFC: Python minimization in Fedora
On 15. 01. 20 23:11, Victor Stinner wrote:
>> Solution 4: ZIP the entire standard library
>> (...)
>> Nevertheless, this might (in theory) **save 17.8 MiB / 47 %**.
>
> It's my favorite option. Almost 50% smaller is quite good! It would be
> very efficient to have such disk space gain!
>
> Using a ZIP file for the stdlib is commonly suggested solution when the
> slow Python startup time is discussed. Python does tons of system calls
> to load stdlib modules at startup: many stat() and open() calls. Having
> a single large ZIP file allows to do more work in pure userland.
>
> This solution is well supported by unmodified Python: it's part of the
> default sys.path search path:
>
> $ python3
> Python 3.7.6 (default, Dec 19 2019, 22:52:49)
> >>> import sys; sys.path
> ['', '/usr/lib64/python37.zip', ...]
>
> It's the second item of sys.path ;-)

It is, yet modules in the standard library still read files next to __file__ and will blow up when zipped. That makes me believe we can put some modules into /usr/lib64/python38.zip, but not the entire unmodified stdlib at this moment. We can certainly work towards this goal if we get somebody to drive it.

> I'm ok to discourage users to override *system files* by modifying them
> as root. It's too easy to mess up your system this way.

Discouraging users is hard. We discourage users from using sudo pip and yet **you** still do it, Victor :D

> It is easy to extract the ZIP file in your home directory, hack some
> files and use PYTHONPATH environment variable to force loading your
> modified stdlib.
>
> * faster startup
> * less disk space
> * harder to mess up your system
>
> Where are drawbacks by the way? ;-)

Around the corner.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
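The "reads files next to __file__" problem mentioned in this reply can be reproduced in a few lines. The sketch below is an illustration only: the package `demo_pkg`, its `data.txt`, and the archive name are made up, and no real stdlib module is involved. It contrasts a plain `open()` relative to `__file__` with `pkgutil.get_data`, which goes through the module's loader and therefore also works inside a ZIP archive.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Source of a made-up package with two ways of reading its data file.
PACKAGE_INIT = '''
import os
import pkgutil


def read_next_to_file():
    """Fragile: assumes the package lives in a real directory."""
    path = os.path.join(os.path.dirname(__file__), "data.txt")
    with open(path, "rb") as f:
        return f.read()


def read_via_loader():
    """Robust: asks the module's loader, which also works inside a ZIP."""
    return pkgutil.get_data(__name__, "data.txt")
'''

archive = Path(tempfile.mkdtemp()) / "zipped_pkg.zip"  # hypothetical archive
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", PACKAGE_INIT)
    zf.writestr("demo_pkg/data.txt", "hello from the archive\n")

sys.path.insert(0, str(archive))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

try:
    demo_pkg.read_next_to_file()
    direct_read_error = None
except OSError as exc:  # the exact subclass depends on the operating system
    direct_read_error = exc

loader_data = demo_pkg.read_via_loader()
print("open() next to __file__ failed with:", type(direct_read_error).__name__)
print("pkgutil.get_data returned:", loader_data)
```

Modules that follow the second pattern could in principle be zipped unmodified; finding the ones that follow the first pattern is the work somebody would have to drive.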
Re: RFC: Python minimization in Fedora
On 15. 01. 20 18:56, Chris wrote:
> That's an amazing amount of work!

Thanks.

> My only criticism would be:
> - the quest for reducing disk space is getting a bit over the top. I
> mean to make comparisons to 3.5" floppy disks which haven't been around
> for 20 years?

That is obviously only used to lighten the text up and make it easier to read. We don't actually use the floppy disk count to justify the need.

> Why is ~100MB so much? If you scale up from floppy disks and even
> reference a 8GB USB stick (which you can barely find any more), you'll
> fit just fine. Most Raspberry Pi's (out of the box solutions) even ship
> with Python, so the size has never been their concern either (where
> otherwise space would be).

Mostly for container images. Fedora is huge there.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: RFC: Python minimization in Fedora
On Wed, Jan 15, 2020 at 06:05:42PM +0100, Miro Hrončok wrote:
> ### File types (and bytecode caches)
>
> The orthogonal dimension is the file type. Python standard library
> contains directories with both "extension modules" (written in C
> (usually) and compiled to `*.cpython-38-x86_64-linux-gnu.so` shared
> object file) and "pure Python" modules (written in Python and saved
> as `*.py` source file).
>
> Each pure Python module comes in 4 files:
>
> - `module.py` -- the source
> - `__pycache__/module.cpython-38.pyc` -- regular (not optimized) bytecode
> cache
> - `__pycache__/module.cpython-38.opt-1.pyc` -- optimized bytecode cache
> (level 1)
> - `__pycache__/module.cpython-38.opt-2.pyc` -- optimized bytecode cache
> (level 2)

I suspect that the difference in speed between loading various .pyc files is negligible. Do you have actual benchmarks for this?

> ### Solution 5: Stop shipping mandatory bytecode cache
>
> This solution sounds simple: We do no longer ship the bytecode cache
> mandatorily. Technically, we move the `.pyc` files to a subpackage
> of `python3-libs` (or three different subpackages, that is not
> important here). And we only *Recommend* them from `python3-libs` --
> by default, the users get them, but for space critical Fedora
> flavors (such as container images) the maintainers can opt-out and
> so can the powerusers.
>
> This would **save 18.6 MiB / 50%** -- quite a lot.
>
> However, as said earlier, if the bytecode cache files are not there,
> Python attempts to create them upon first import. That can result in
> several problems, here we will try to propose how to workaround
> them.

Below, using a flag file in each __pycache__ directory is suggested. What about a different route: having a flag file for all descendants of a directory?

For example, /usr/lib/python3.8/.dont_write_bytecode would cover all modules under /usr/lib/python3.8/. If a .pyc file is present, python could still make use of it.

This would be a nicer solution because it wouldn't require modifying individual packages, but would still avoid the selinux issues and slowdowns from failed attempts to write the optimized files. The __pycache__ files wouldn't need to exist at all.

Zbyszek
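The three cache file names quoted in this message follow a fixed scheme that `importlib.util.cache_from_source` exposes directly. A minimal sketch, assuming only that standard function; the source path is illustrative and does not need to exist, and the `cpython-XY` tag in the result depends on the interpreter running the snippet:

```python
import importlib.util
import sys


def cache_names(source_path: str) -> list:
    """Return the bytecode cache paths for optimization levels 0, 1 and 2."""
    # optimization="" means "no optimization suffix" (the plain .pyc file).
    return [
        importlib.util.cache_from_source(source_path, optimization=level)
        for level in ("", 1, 2)
    ]


names = cache_names("/usr/lib64/python3.8/module.py")  # illustrative path
for name in names:
    print(name)
print("cache tag of this interpreter:", sys.implementation.cache_tag)
```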
Re: RFC: Python minimization in Fedora
> Solution 4: ZIP the entire standard library
> (...)
> Nevertheless, this might (in theory) **save 17.8 MiB / 47 %**.

It's my favorite option. Almost 50% smaller is quite good! It would be very efficient to have such a disk space gain!

Using a ZIP file for the stdlib is a commonly suggested solution when the slow Python startup time is discussed. Python does tons of system calls to load stdlib modules at startup: many stat() and open() calls. Having a single large ZIP file allows doing more work in pure userland.

This solution is well supported by unmodified Python: it's part of the default sys.path search path:

$ python3
Python 3.7.6 (default, Dec 19 2019, 22:52:49)
>>> import sys; sys.path
['', '/usr/lib64/python37.zip', ...]

It's the second item of sys.path ;-)

I'm ok with discouraging users from overriding *system files* by modifying them as root. It's too easy to mess up your system this way.

It is easy to extract the ZIP file in your home directory, hack some files and use the PYTHONPATH environment variable to force loading your modified stdlib.

* faster startup
* less disk space
* harder to mess up your system

Where are the drawbacks, by the way? ;-)

Victor
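One way to look at the startup cost discussed here, without tracing system calls, is CPython's `-X importtime` option (available since Python 3.7), which reports per-module import times on stderr. The sketch below only wraps that option; the helper name `import_time_report` is made up, and the numbers it prints vary by machine and say nothing about ZIP-based imports by themselves.

```python
import subprocess
import sys


def import_time_report(module: str) -> list:
    """Run a fresh interpreter with -X importtime and return its report lines."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Every report line, including the column header, starts with "import time:".
    return [
        line for line in result.stderr.splitlines()
        if line.startswith("import time:")
    ]


report = import_time_report("json")
print(f"{len(report) - 1} modules imported")  # the first line is the header
print("\n".join(report[-3:]))
```

Comparing such a report for a directory-based and a zipped stdlib would be a reasonable first measurement before drawing conclusions about startup speed.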
Re: RFC: Python minimization in Fedora
On Wed, 15 Jan 2020 18:05:42 +0100, Miro Hrončok wrote:
> Hello Fedora!
>
> In Python Maint, we sat down and we came up with several ideas how to
> minimize the filesystem footprint of Python. Unfortunately, the
> result is horribly long, sorry about that.

It was delightful to read. I now have a better understanding of the Python setup. Maybe even Python core may adopt some ideas in the future. Is that not what Fedora does? :)

I used MicroPython on micro:bit boards and that Python *is* tiny. Standard Fedora Python is not really that big, but more code on disk (even unused) is always a potential security problem. I sometimes build my own Python without extras like tkinter, curses, or xml. It's super easy and I always get the version I want.

I was surprised to see a proposal to remove pyc files. I know they're big and, unless something is constantly using a particular module, mostly useless. Python creates those files during "make install", and every other module does that during install as well. Python is faster with them.

Compressing data in modules is also nice. While zip is not the best, it's what we have in Python. I'm surprised that this is the largest part of the Python installation.

> Optimization level 2 is already broken.

That is a good point. Almost no one uses pure Python.

> ### Solution 10: Stop shipping mandatory Python, rewrite dnf to Rust

No, it just started to work really well. I didn't know about libdnf, sounds interesting.

--
Łukasz Posadowski
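The remark that Python is faster with .pyc files can be illustrated without touching system files. The sketch below compares compiling a module from source with loading the same code object from its marshalled form, which is roughly the payload of a .pyc file. The synthetic source is a made-up stand-in for a large data module such as `pydoc_data.topics`; it is not the benchmark from this thread, and the absolute timings depend on the machine.

```python
import marshal
import time

# Synthetic stand-in for a large, data-heavy module.
source = "topics = {\n" + "".join(
    f"    'key{i}': 'some documentation text number {i}',\n" for i in range(20000)
) + "}\n"


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# What happens on import when no .pyc file exists: parse and compile the source.
code, compile_seconds = timed(compile, source, "<synthetic>", "exec")

# What happens when a .pyc file exists: unmarshal the stored code object.
cached = marshal.dumps(code)
loaded, load_seconds = timed(marshal.loads, cached)

namespace = {}
exec(loaded, namespace)
print(f"compile from source:  {compile_seconds * 1000:.1f} ms")
print(f"load cached bytecode: {load_seconds * 1000:.1f} ms")
```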
Re: RFC: Python minimization in Fedora
On 1/15/20 12:56 PM, Chris wrote:
> That's an amazing amount of work!
>
> My only criticism would be:
> - the quest for reducing disk space is getting a bit over the top. I
> mean to make comparisons to 3.5" floppy disks which haven't been around
> for 20 years? Why is ~100MB so much? If you scale up from floppy disks
> and even reference a 8GB USB stick (which you can barely find any more),
> you'll fit just fine. Most Raspberry Pi's (out of the box solutions)
> even ship with Python, so the size has never been their concern either
> (where otherwise space would be).
>
> I am bias, because I absolutely adore Python and it's added bloat to
> basically be the swiss army knife that can solve any problem isn't worth
> the few MB you're trying to cut out of it.

Me too, but it's useful to have Python in super-small environments. For comparison, people squeezed Python onto ARM Arduino Nano-class platforms, using Cortex M0 chips on a 1"x2" board costing $5. The total memory is on the order of 256 kB; of course it doesn't run Linux, but you do get a Python REPL over a serial/USB link.

https://makezine.com/2017/08/11/circuitpython-snakes-way-adafruit-hardware/
Re: RFC: Python minimization in Fedora
That's an amazing amount of work!

My only criticism would be:
- the quest for reducing disk space is getting a bit over the top. I mean to make comparisons to 3.5" floppy disks which haven't been around for 20 years? Why is ~100MB so much? If you scale up from floppy disks and even reference a 8GB USB stick (which you can barely find any more), you'll fit just fine. Most Raspberry Pi's (out of the box solutions) even ship with Python, so the size has never been their concern either (where otherwise space would be).

I am biased, because I absolutely adore Python, and I don't think giving up the added bloat that makes it the Swiss army knife that can solve any problem is worth the few MB you're trying to cut out of it.

That all said: as a dev, I've got no problems with the solution that just involves removing dev-related packages from the main core build of Python unless you pull in python-devel. Solution 5 also seems good (stop shipping .pyc files)... Just pick one (.pyc, or .pyo) file to ship with the distribution; I'm not sure if both are really required.

Just my two cents; I don't comment too much here, I enjoy seeing you all debate though! :)

Chris

On Wed, Jan 15, 2020 at 12:15 PM Miro Hrončok wrote:
> Hello Fedora!
>
> In Python Maint, we sat down and we came up with several ideas how to
> minimize the filesystem footprint of Python. Unfortunately, the result
> is horribly long, sorry about that.
>
> Please, share your feedback, additional solutions, comments etc.
>
> [...]
RFC: Python minimization in Fedora
Hello Fedora!

In Python Maint, we sat down and we came up with several ideas how to minimize the filesystem footprint of Python. Unfortunately, the result is horribly long, sorry about that.

Please, share your feedback, additional solutions, comments etc.

Version with formatting and pictures is available at:

https://github.com/hroncok/python-minimization/blob/master/document.md

Enclosing here for better in-line responses:

# Python minimization in Fedora

> While Fedora is well suited for traditional physical/virtual workstations and servers, it is often overlooked for use cases beyond traditional installs.
>
> Some modern types of deployments -- such as IoT and containers -- are quite sensitive to size. For IoT that's usually slow data connections (for updates/management) and for cloud and containers it’s the massive scale.

-- the preamble of the [Fedora Minimization Objective](https://docs.fedoraproject.org/en-US/minimization/)

One of the biggest things in Fedora is Python. Because [Fedora loves Python](https://fedoralovespython.org/) and because the package manager for Fedora packages -- dnf -- happens to be written in Python, the Python interpreter and its standard library come pre-installed on many (if not all) Fedora systems and it is often not possible to remove them without destroying the system completely or making it unmanageable.

Python comes with [Batteries Included](https://en.wikipedia.org/wiki/Batteries_Included) -- the standard library is quite big. While pleasant for the programmers, this comes with a large filesystem footprint not entirely desired in Fedora. In this document, we will analyze the footprint and offer several minimization solutions/ideas with their challenges, pros (MiB saved) and cons. It is a list of ideas; **we're not promising to do any of this**.

**Goal:**

1. Significantly lower the filesystem footprint of the mandatory Python installation in Fedora.

**Non-goals:**

1. We don't aim to lower the filesystem footprint of all Python installations in Fedora -- the default may remain big, if there is an opt-out mechanism.
2. We don't aim to lower the filesystem footprint of all Fedora Python RPM packages, just the `python3` package and its subpackages -- the interpreter and the standard library.

However, if any non-goal becomes a side effect of the solution of our goal, good.

**Constraints:**

1. Do not break Python users' expectations. As an example, we don't strip Python standard library to the bare minimum and still call it Python.
2. Do not break Fedora users' expectations. As an example, we don't break the ability to hot patch Python files on a live system by default.
3. Do not break Fedora packagers' expectations. As an example, we don't [require "system tools" to use a custom Python entrypoint](https://fedoraproject.org/wiki/Changes/System_Python), such as `/usr/libexec/platform-python` or `/usr/libexec/system-python`.
4. Do not significantly increase the filesystem footprint of the default Python installation. As an example, we don't package [two separate versions (and stacks) of Python](https://fedoraproject.org/wiki/Changes/Platform_Python_Stack) -- one minimal for dnf (or Ansible) and another "normal" for the users.
5. Do not diverge from upstream significantly (but we can drive upstream change). As an example, we don't reinvent the import machinery of Python downstream only, but we might do it in upstream and even [use Fedora to pioneer the change](https://fedoraproject.org/wiki/Changes/python3_c.utf-8_locale).

The listed constraints are not absolute. We will mention in each solution whether we feel that some constraints are violated, but that doesn't mean we shall outright discard the solution.

## How large is Python, actually

tl;dr Python 3.8.1 in Fedora has 111 MiB (approximately 77 3.5" floppy disks), but we only **install 37.5 MiB by default** (26 floppy disks).

![77 3.5" floppy disks](https://github.com/hroncok/python-minimization/raw/master/77-floppy-disks-gray.jpg)

*77 3.5" floppy disks, courtesy of Dana Walker. Imagine one of them is faulty.*

(All numbers are real installed disk sizes based on the `python38` package installed on Fedora 31, x86_64. The split into subpackages is based on the `python3` package from Fedora 32. Slight differences between Fedora 31 and 32 or between various architectures are irrelevant here, we aim for a long term minimization. See the [source of the numbers][source].)

In Fedora we split the Python interpreter into various RPM subpackages, some of them are optional. This is what you get all the time:

- `python3` contains `/usr/bin/python3` and friends; has 21 KiB.
- `python3-libs` contains `/usr/lib64/libpython3.8.so.1.0` and the majority of the standard library, is required by `python3`; has 37.5 MiB.

And this is what you get optionally:

- `python3-devel` contains the "development files" and makes it possible to compile extension
RFC: Python minimization in Fedora
Hello Fedora!

In Python Maint, we sat down and came up with several ideas on how to minimize the filesystem footprint of Python. Unfortunately, the result is horribly long, sorry about that. Please share your feedback, additional solutions, comments etc.

A version with formatting and pictures is available at:
https://github.com/hroncok/python-minimization/blob/master/document.md

Enclosing here for better in-line responses:

# Python minimization in Fedora

> While Fedora is well suited for traditional physical/virtual workstations and servers, it is often overlooked for use cases beyond traditional installs.
>
> Some modern types of deployments -- such as IoT and containers -- are quite sensitive to size. For IoT that's usually slow data connections (for updates/management) and for cloud and containers it’s the massive scale.

-- the preamble of the [Fedora Minimization Objective](https://docs.fedoraproject.org/en-US/minimization/)

One of the biggest things in Fedora is Python. Because [Fedora loves Python](https://fedoralovespython.org/) and because the package manager for Fedora packages -- dnf -- happens to be written in Python, the Python interpreter and its standard library come pre-installed on many (if not all) Fedora systems, and it is often not possible to remove them without destroying the system completely or making it unmanageable.

Python comes with [Batteries Included](https://en.wikipedia.org/wiki/Batteries_Included) -- the standard library is quite big. While pleasant for programmers, this comes with a large filesystem footprint not entirely desired in Fedora.

In this document, we will analyze the footprint and offer several minimization solutions/ideas with their challenges, pros (MiB saved) and cons. It is a list of ideas; **we're not promising to do any of this**.

**Goal:**

1. Significantly lower the filesystem footprint of the mandatory Python installation in Fedora.

**Non-goals:**

1. We don't aim to lower the filesystem footprint of all Python installations in Fedora -- the default may remain big, if there is an opt-out mechanism.
2. We don't aim to lower the filesystem footprint of all Fedora Python RPM packages, just the `python3` package and its subpackages -- the interpreter and the standard library.

However, if any non-goal becomes a side effect of the solution of our goal, good.

**Constraints:**

1. Do not break Python users' expectations. As an example, we don't strip the Python standard library to the bare minimum and still call it Python.
2. Do not break Fedora users' expectations. As an example, we don't break the ability to hot patch Python files on a live system by default.
3. Do not break Fedora packagers' expectations. As an example, we don't [require "system tools" to use a custom Python entrypoint](https://fedoraproject.org/wiki/Changes/System_Python), such as `/usr/libexec/platform-python` or `/usr/libexec/system-python`.
4. Do not significantly increase the filesystem footprint of the default Python installation. As an example, we don't package [two separate versions (and stacks) of Python](https://fedoraproject.org/wiki/Changes/Platform_Python_Stack) -- one minimal for dnf (or Ansible) and another "normal" for the users.
5. Do not diverge from upstream significantly (but we can drive upstream change). As an example, we don't reinvent the import machinery of Python downstream only, but we might do it in upstream and even [use Fedora to pioneer the change](https://fedoraproject.org/wiki/Changes/python3_c.utf-8_locale).

The listed constraints are not absolute. We will mention in each solution whether we feel that some constraints are violated, but that doesn't mean we shall outright discard the solution.

## How large is Python, actually

tl;dr Python 3.8.1 in Fedora has 111 MiB (approximately 77 3.5" floppy disks), but we only **install 37.5 MiB by default** (26 floppy disks).
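For anyone who wants to check the floppy arithmetic, here is a minimal sketch. It assumes one 3.5" floppy is counted as 1.44 MiB, which is the divisor that reproduces the figures quoted above; the name `floppies` and the constant `FLOPPY_MIB` are made up for this illustration and do not come from the document's sources.

```python
# Assumption: a 3.5" floppy is counted as 1.44 MiB here, because that
# divisor reproduces the figures in the text. The real formatted capacity
# is 1440 KiB (about 1.41 MiB), which gives slightly higher counts.
FLOPPY_MIB = 1.44


def floppies(size_mib: float) -> int:
    """Express a size in MiB as a floppy-disk count, rounded to nearest."""
    return round(size_mib / FLOPPY_MIB)


total = floppies(111)     # all python3 subpackages together
default = floppies(37.5)  # what is installed by default
print(total, default)     # prints: 77 26
```

Rounding to the nearest disk matches the "approximately 77" wording; rounding up instead would give the number of disks actually needed to hold the files.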
![77 3.5" floppy disks](https://github.com/hroncok/python-minimization/raw/master/77-floppy-disks-gray.jpg)

*77 3.5" floppy disks, courtesy of Dana Walker. Imagine one of them is faulty.*

(All numbers are real installed disk sizes based on the `python38` package installed on Fedora 31, x86_64. The split into subpackages is based on the `python3` package from Fedora 32. Slight differences between Fedora 31 and 32 or between various architectures are irrelevant here; we aim for long-term minimization. See the [source of the numbers][source].)

In Fedora we split the Python interpreter into various RPM subpackages; some of them are optional.

This is what you get all the time:

- `python3` contains `/usr/bin/python3` and friends; has 21 KiB.
- `python3-libs` contains `/usr/lib64/libpython3.8.so.1.0` and the majority of the standard library, is required by `python3`; has 37.5 MiB.

And this is what you get optionally:

- `python3-devel` contains the "development files" and makes it possible to compile extension