Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Fri Apr 18 2014 at 5:03:33 PM, Ezio Melotti ezio.melo...@gmail.com wrote: Hi, On Thu, Apr 17, 2014 at 9:09 PM, Brett Cannon bcan...@gmail.com wrote: On Thu Apr 17 2014 at 1:34:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. Will the tracebacks only contain the line number or the line of code too? Yes. 
I've seen tracebacks involving importlib._bootstrap that didn't include the code, and I'm wondering if we will get something similar for all the other modules you are freezing:

Traceback (most recent call last):
  File "/tmp/foo.py", line 7, in <module>
    import email
  File "<frozen importlib._bootstrap>", line 1561, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1519, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1473, in _find_module
  File "<frozen importlib._bootstrap>", line 1308, in find_module
  File "<frozen importlib._bootstrap>", line 1284, in _get_loader
  File "<frozen importlib._bootstrap>", line 1273, in _path_importer_cache
  File "<frozen importlib._bootstrap>", line 1254, in _path_hooks
TypeError: 'NoneType' object is not iterable

Best Regards, Ezio Melotti

That's because the frozen importer doesn't define get_source(). But since we have the source in this instance the __loader__ can be updated to be SourceFileLoader so that get_source() is available: http://bugs.python.org/issue21335 .
___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
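To make the get_source() point concrete, here is a minimal sketch of the difference between the two loaders. It assumes a current CPython 3, where FrozenImporter.get_source() exists but returns None, which has the same effect on tracebacks as not having the method at all. The module name demo_mod and its temporary file are made up for the illustration and are not part of the proposal in issue21335.

```python
import os
import tempfile
from importlib.machinery import FrozenImporter, SourceFileLoader

# '_frozen_importlib' is the frozen copy of importlib._bootstrap.
# The frozen importer has no source text to hand out.
frozen_source = FrozenImporter.get_source("_frozen_importlib")
print(frozen_source)  # None

# A file-backed loader can return the text, which is what lets a
# traceback show the offending line of code.
# 'demo_mod' is a hypothetical module written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_mod.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write("ANSWER = 42\n")
    loader = SourceFileLoader("demo_mod", path)
    file_source = loader.get_source("demo_mod")

print(repr(file_source))
```

Swapping a frozen module's __loader__ for a SourceFileLoader pointing at the real file, as suggested above, would presumably give the traceback machinery the second behaviour.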
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 18.04.2014 23:03, Ezio Melotti wrote: Hi, On Thu, Apr 17, 2014 at 9:09 PM, Brett Cannon bcan...@gmail.com wrote: On Thu Apr 17 2014 at 1:34:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. Will the tracebacks only contain the line number or the line of code too? 
I've seen tracebacks involving importlib._bootstrap that didn't include the code, and I'm wondering if we will get something similar for all the other modules you are freezing:

Traceback (most recent call last):
  File "/tmp/foo.py", line 7, in <module>
    import email
  File "<frozen importlib._bootstrap>", line 1561, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1519, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1473, in _find_module
  File "<frozen importlib._bootstrap>", line 1308, in find_module
  File "<frozen importlib._bootstrap>", line 1284, in _get_loader
  File "<frozen importlib._bootstrap>", line 1273, in _path_importer_cache
  File "<frozen importlib._bootstrap>", line 1254, in _path_hooks
TypeError: 'NoneType' object is not iterable

Yes, this is what you get for frozen modules. If you'd like to play around with a frozen stdlib like this, you can have a look at PyRun (http://pyrun.org), which does this for Python 2 and will hopefully work for Python 3.4 soonish too.

-- Marc-Andre Lemburg, eGenix.com
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Hi, On Thu, Apr 17, 2014 at 9:09 PM, Brett Cannon bcan...@gmail.com wrote: On Thu Apr 17 2014 at 1:34:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. Will the tracebacks only contain the line number or the line of code too? 
I've seen tracebacks involving importlib._bootstrap that didn't include the code, and I'm wondering if we will get something similar for all the other modules you are freezing:

Traceback (most recent call last):
  File "/tmp/foo.py", line 7, in <module>
    import email
  File "<frozen importlib._bootstrap>", line 1561, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1519, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1473, in _find_module
  File "<frozen importlib._bootstrap>", line 1308, in find_module
  File "<frozen importlib._bootstrap>", line 1284, in _get_loader
  File "<frozen importlib._bootstrap>", line 1273, in _path_importer_cache
  File "<frozen importlib._bootstrap>", line 1254, in _path_hooks
TypeError: 'NoneType' object is not iterable

Best Regards, Ezio Melotti

-Brett
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Wed Apr 16 2014 at 4:53:25 PM, Terry Reedy tjre...@udel.edu wrote: On Wednesday, April 16, 2014 2:57:35 PM, Terry Reedy tjre...@udel.edu wrote: PS. In the user process sys.modules, there are numerous null entries like these: sys.modules['idlelib.os'] sys.modules['idlelib.tokenize'] sys.modules['idlelib.io'] etcetera On 4/16/2014 3:10 PM, Dr. Brett Cannon wrote: Is this Python 2 or 3? Py 2. I should have said so. The entries do not appear in py3. In Python 2 it means an attempt to perform a relative import failed but an absolute import succeeded, e.g. from idlelib you imported os, so import tried idlelib.os and then os. *I* have not done anything. For tokenize, for instance, the existing code just does what I thought were absolute imports, in 2 files. import tokenize That's not an absolute import if it's within a package and you didn't declare `from __future__ import absolute_import`. Perhaps the extra entries have something to do with the fact that these startup imports are invisible to user code, just like those done by the interpreter itself on startup. 2.7 uses spawnv (and 3.4 uses subprocess) to run something like one of the following. python -c __import__('idlelib.run').run.main(False) python -c __import__('run').main(False) Nope, it has to simply do with how Python 2 does implicit relative imports. Add the __future__ statement and they will go away. -Brett run.py has several normal lines with import stdlib module from idlelib import idlelib module and ditto for some of the imported idlelib modules. You should definitely consider using a future import to guarantee absolute imports.
-- Terry Jan Reedy
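The null entries discussed above can be listed with a few lines of code. This is only a sketch: the helper name null_module_entries is made up for the illustration, and on Python 3 the list is normally empty because implicit relative imports no longer exist. On Python 2, the fix given in the thread is to put `from __future__ import absolute_import` at the top of each module inside the package.

```python
import sys


def null_module_entries(prefix=""):
    """Return the sys.modules keys whose value is None.

    In Python 2 such an entry records that an implicit relative import
    (e.g. 'idlelib.os') was tried and failed before the absolute
    import ('os') succeeded.
    """
    return sorted(
        name
        for name, module in list(sys.modules.items())
        if module is None and name.startswith(prefix)
    )


# Hypothetical usage: look for leftovers under the idlelib package.
print(null_module_entries("idlelib."))
```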
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Not considering the zipped stdlib technique mentioned in other posts, would it perhaps be better to support two different CPython builds: - one with all the needed stdlib parts frozen - to be used in production - one with only the minimal needed number of stdlib parts frozen - to have as much of the stdlib sources readily accessible to application developers as possible The installer could then perhaps install both executables, or the frozen stdlib parts could perhaps be built as a separate DLL to be loaded at runtime instead of its content being used from their Python sources. OK... just my 2 cents worth... 
:-) Best regards, Jurko Gospodnetić
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu, Apr 17, 2014 at 10:33 AM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: I would really love to have better startup times in production, What's your use case? I understand why startup time is important for Hg, but I'd like to understand what other situations occur frequently enough to worry about it. -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu Apr 17 2014 at 1:34:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. -Brett ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
I think he meant modifying the source files themselves for debugging purposes (e.g. putting print statements in itertools.py). 2014-04-17 14:09 GMT-04:00 Brett Cannon bcan...@gmail.com: On Thu Apr 17 2014 at 1:34:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 14.4.2014. 23:51, Brett Cannon wrote: Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. 
-Brett
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu, 17 Apr 2014 18:09:22 + Brett Cannon bcan...@gmail.com wrote: I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. I sympathize with Jurko's opinion. Being able to poke inside stdlib source files makes Python more approachable. I'm sure several of us got into Python that way. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
I still do that! On Thu, Apr 17, 2014 at 11:17 AM, Antoine Pitrou solip...@pitrou.net wrote: On Thu, 17 Apr 2014 18:09:22 + Brett Cannon bcan...@gmail.com wrote: I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. I sympathize with Jurko's opinion. Being able to poke inside stdlib source files makes Python more approachable. I'm sure several of us got into Python that way. Regards Antoine. -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Hi. On 17.4.2014. 20:15, Mark Young wrote: I think he meant modifying the source files themselves for debugging purposes (e.g. putting print statements in itertools.py). Exactly! :-) Best regards, Jurko Gospodnetić
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 04/17/2014 10:33 AM, Jurko Gospodnetić wrote: In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. +1 -- ~Ethan~
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu, Apr 17, 2014 at 8:17 PM, Antoine Pitrou solip...@pitrou.net wrote: On Thu, 17 Apr 2014 18:09:22 + Brett Cannon bcan...@gmail.com wrote: I would really love to have better startup times in production, but I would also really hate to lose the ability to hack around in stdlib sources during development just to get better startup performance. In general, what I really like about using Python for software development is the ability to open any stdlib file and easily go poking around using stuff like 'import pdb;pdb.set_trace()' or simple print statements. Researching mysterious behaviour is generally much much MUCH! easier (read: takes less hours/days/weeks) if it ends up leading into a stdlib Python module than if it takes you down into the bowels of some C module (think zipimport.c *grin*). Not to mention the effect that being able to quickly resolve a mystery by hacking on some Python internals leaves you feeling very satisfied, while having to entrench yourself in those internals for a long time just to find out you've made something foolish on your end leaves you feeling exhausted at best. Freezing modules does not affect the ability to use gdb. And as long as you set the appropriate __file__ values then tracebacks will contain even the file line and location. I sympathize with Jurko's opinion. Being able to poke inside stdlib source files makes Python more approachable. I'm sure several of us got into Python that way. Regards Antoine. I also wouldn't want that to be the default but Martin also suggested a -Z cmdline option which sounds like an interesting idea to me. ...Or maybe simply use the existing -O option, which doesn't really optimize much AFAIK. -- Giampaolo - http://grodola.blogspot.com
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Hi. On 17.4.2014. 19:57, Guido van Rossum wrote: On Thu, Apr 17, 2014 at 10:33 AM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: I would really love to have better startup times in production, What's your use case? I understand why startup time is important for Hg, but I'd like to understand what other situations occur frequently enough to worry about it.

The first one that pops to mind is scripting when automating different system administration tasks. When you automate something that ends up calling lots of different Python scripts - the startup times add up. Yes, I know you can update the system so that the scripts get called inside a single Python process, but that often requires major refactoring, e.g.:
- you have to refactor those scripts to be importable while they were originally prepared to be used as 'stand-alone executables'
- you either have to use Python as your external automation tool or you need to implement some sort of a Python based tool runner daemon process

Another example is the speed at which some automated test suites run that need to call external Python scripts. Such suites often call thousands of such scripts so their startup times add up to such numbers that Python gets a bad rep. And shaving off unnecessarily wasted seconds or minutes in a test suite is always good, as it speeds up the whole develop/test cycle. :-)

I've been in situations where I got a request to 'convert those Python scripts to batch files so they would run faster'. :-) And, while I really love Python as a development language, simple scripts implemented in it often do make the system feel kind of sluggish. :-( And with that in mind, the effect of systems becoming 'even more sluggish' when upgrading them to use the new 'Python 3' version, even if that slowdown is not all startup related, often comes as an additional slap in the face.
:-( Best regards, Jurko Gospodnetić
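For anyone who wants to see how the per-launch cost adds up on their own machine, the following is a rough sketch and not the benchmark used elsewhere in this thread. The helper name time_startup, the number of runs, and the 1000-launch figure are all arbitrary choices for the illustration. The -S flag (skip importing site) roughly corresponds to the startup_nosite case.

```python
import subprocess
import sys
import time


def time_startup(extra_args=(), runs=5):
    """Return the best wall-clock time of launching the interpreter to run 'pass'."""
    cmd = [sys.executable, *extra_args, "-c", "pass"]
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


normal = time_startup()
no_site = time_startup(["-S"])

print(f"normal startup: {normal:.4f} s per launch")
print(f"without site:   {no_site:.4f} s per launch")
# A test suite that spawns 1000 scripts pays the startup cost 1000 times.
print(f"1000 launches:  {normal * 1000:.1f} s spent on startup alone")
```

The measured time includes process creation overhead as well as interpreter startup, so it overstates what freezing modules could save.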
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
I'm sorry to keep asking dumb questions, but your description didn't jog my understanding of what you are comparing here. What is slower than what?

On Thu, Apr 17, 2014 at 11:47 AM, Brett Cannon bcan...@gmail.com wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced).

### normal_startup ###
Min: 0.510211 -> 2.667958: 5.23x slower
Avg: 0.521073 -> 2.694876: 5.17x slower
Significant (t=-1129.54)
Stddev: 0.00478 -> 0.01274: 2.6681x larger

### startup_nosite ###
Min: 0.304090 -> 0.908059: 2.99x slower
Avg: 0.312374 -> 0.921807: 2.95x slower
Significant (t=-797.79)
Stddev: 0.00372 -> 0.00667: 1.7956x larger

-Brett

On Mon Apr 14 2014 at 5:51:23 PM, Brett Cannon bcan...@gmail.com wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. So for day 1 of the sprints I decided to hack up a proof-of-concept to see what kind of performance gain it would get. Freezing everything except encodings.__init__, os, and _sysconfigdata, it speeds up startup by 11% compared to default. Compared to 2.7 it shaves 14% from the slowdown (27% slower vs. 41% slower). The full results are at the end of the email. Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. My guess is the best approach would be adding a Lib/_frozen directory where any modules that we treat like this would be kept to act as a reminder that you need to rebuild for them (I would probably move importlib/_bootstrap.py as well to make this consistent). Thoughts?
default vs the freezing:

### normal_startup ###
Min: 0.524812 -> 0.473339: 1.11x faster
Avg: 0.534403 -> 0.481245: 1.11x faster
Significant (t=61.80)
Stddev: 0.00466 -> 0.00391: 1.1909x smaller

### startup_nosite ###
Min: 0.307359 -> 0.291939: 1.05x faster
Avg: 0.317667 -> 0.300156: 1.06x faster
Significant (t=26.29)
Stddev: 0.00543 -> 0.00385: 1.4099x smaller

2.7 vs the freezing:

### normal_startup ###
Min: 0.367571 -> 0.465264: 1.27x slower
Avg: 0.374404 -> 0.476662: 1.27x slower
Significant (t=-90.26)
Stddev: 0.00313 -> 0.00738: 2.3603x larger

### startup_nosite ###
Min: 0.164510 -> 0.290544: 1.77x slower
Avg: 0.169833 -> 0.301109: 1.77x slower
Significant (t=-286.30)
Stddev: 0.00211 -> 0.00407: 1.9310x larger

As a baseline, 2.7 vs default:

### normal_startup ###
Min: 0.368916 -> 0.521758: 1.41x slower
Avg: 0.376784 -> 0.531883: 1.41x slower
Significant (t=-172.82)
Stddev: 0.00423 -> 0.00474: 1.1207x larger

### startup_nosite ###
Min: 0.165156 -> 0.309090: 1.87x slower
Avg: 0.171516 -> 0.319004: 1.86x slower
Significant (t=-283.45)
Stddev: 0.00334 -> 0.00399: 1.1948x larger

-- --Guido van Rossum (python.org/~guido)
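As a cross-check on how the "x faster" and "x slower" figures relate to the raw timings, here is a small sketch that recomputes them from the normal_startup Min values quoted above. The describe helper is made up for the illustration, and the ratio convention (larger time divided by smaller time) is an assumption about how the benchmark tool reports results, although it does reproduce the quoted numbers.

```python
def describe(before, after):
    """Express a timing change as 'N.NNx faster' or 'N.NNx slower'."""
    if after < before:
        return f"{before / after:.2f}x faster"
    return f"{after / before:.2f}x slower"


# normal_startup Min timings in seconds, copied from the results above.
comparisons = {
    "default vs the freezing": (0.524812, 0.473339),
    "2.7 vs the freezing": (0.367571, 0.465264),
    "2.7 vs default": (0.368916, 0.521758),
}

for label, (before, after) in comparisons.items():
    print(f"{label}: {describe(before, after)}")
```

The "shaves 14% from the slowdown" remark is then the difference between the two rounded slowdowns relative to 2.7: 41% for default minus 27% with freezing.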
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Apr 17, 2014, at 2:23 PM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: Hi. On 17.4.2014. 19:57, Guido van Rossum wrote: On Thu, Apr 17, 2014 at 10:33 AM, Jurko Gospodnetić jurko.gospodne...@pke.hr wrote: I would really love to have better startup times in production, What's your use case? I understand why startup time is important for Hg, but I'd like to understand what other situations occur frequently enough to worry about it. The first one that pops to mind is scripting when automating different system administration tasks. When you automate something that ends up calling lots of different Python scripts - the startup times add up. Yes, I know you can update the system so that the scripts get called inside a single Python process, but that often requires major refactoring, e.g.: - you have to refactor those scripts to be importable while they were originally prepared to be used as 'stand-alone executables' - you either have to use Python as your external automation tool or you need to implement some sort of a Python based tool runner daemon process Another example is the speed at which some automated test suites run that need to call external Python scripts. Such suites often call thousands of such scripts so their startup times add up to such numbers that Python gets a bad rep. And shaving off unnecessarily wasted seconds or minutes in a test suite is always good, as it speeds up the whole develop/test cycle. :-)

pip invokes a ton of Pythons in a subprocess in its test suite, and the "install from sdist" stuff tends to invoke 1-3 Pythons per thing you install too. So any speed up there would make installing stuff faster.

I've been in situations where I got a request to 'convert those Python scripts to batch files so they would run faster'. :-) And, while I really love Python as a development language, simple scripts implemented in it often do make the system feel kind of sluggish.
:-( And with that in mind, the effect of systems becoming 'even more sluggish' when upgrading them to use the new 'Python 3' version, even if that slowdown is not all startup related, often comes as an additional slap in the face. :-( Best regards, Jurko Gospodnetić - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
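To put a rough number on the "startup times add up" point, here is a minimal sketch that times an empty interpreter launch, with and without the site module. The helper name time_startup and the run count are assumptions made for illustration, and the absolute figures will differ from machine to machine.

```python
import subprocess
import sys
import time


def time_startup(extra_args=(), runs=5):
    """Return the fastest wall-clock time (seconds) to start and exit Python."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" runs an empty program, so almost all the time is startup.
        subprocess.run([sys.executable, *extra_args, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


normal = time_startup()
nosite = time_startup(["-S"])  # -S skips importing the site module
print(f"normal startup: {normal * 1000:.1f} ms")
print(f"without site:   {nosite * 1000:.1f} ms")
# Seconds per call times 1000 calls gives the total startup overhead.
print(f"1000 script calls cost roughly {normal * 1000:.1f} s of startup alone")
```

Taking the minimum over several runs reduces noise from other processes; it measures process launch plus interpreter startup, not import cost alone.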
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu Apr 17 2014 at 3:21:49 PM, Guido van Rossum gu...@python.org wrote: I'm sorry to keep asking dumb questions, but your description didn't jog my understanding of what you are comparing here. What is slower than what? Startup where the stdlib is entirely in a zip file is slower than the status quo of reading from files. IOW it looks like speeding up startup from an import perspective requires either freezing modules -- for about a 10% boost -- or some fundamental change in import that no one has thought of yet. -Brett On Thu, Apr 17, 2014 at 11:47 AM, Brett Cannon bcan...@gmail.com wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced).

### normal_startup ###
Min: 0.510211 -> 2.667958: 5.23x slower
Avg: 0.521073 -> 2.694876: 5.17x slower
Significant (t=-1129.54)
Stddev: 0.00478 -> 0.01274: 2.6681x larger

### startup_nosite ###
Min: 0.304090 -> 0.908059: 2.99x slower
Avg: 0.312374 -> 0.921807: 2.95x slower
Significant (t=-797.79)
Stddev: 0.00372 -> 0.00667: 1.7956x larger

-Brett

On Mon Apr 14 2014 at 5:51:23 PM, Brett Cannon bcan...@gmail.com wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. So for day 1 of the sprints I decided to hack up a proof-of-concept to see what kind of performance gain it would get. Freezing everything except encodings.__init__, os, and _sysconfigdata, it speeds up startup by 11% compared to default. Compared to 2.7 it shaves 14% from the slowdown (27% slower vs. 41% slower). The full results are at the end of the email. Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. My guess is the best approach would be adding a Lib/_frozen directory where any modules that we treat like this would be kept to act as a reminder that you need to rebuild for them (I would probably move importlib/_bootstrap.py as well to make this consistent). Thoughts?

default vs the freezing:

### normal_startup ###
Min: 0.524812 -> 0.473339: 1.11x faster
Avg: 0.534403 -> 0.481245: 1.11x faster
Significant (t=61.80)
Stddev: 0.00466 -> 0.00391: 1.1909x smaller

### startup_nosite ###
Min: 0.307359 -> 0.291939: 1.05x faster
Avg: 0.317667 -> 0.300156: 1.06x faster
Significant (t=26.29)
Stddev: 0.00543 -> 0.00385: 1.4099x smaller

2.7 vs the freezing:

### normal_startup ###
Min: 0.367571 -> 0.465264: 1.27x slower
Avg: 0.374404 -> 0.476662: 1.27x slower
Significant (t=-90.26)
Stddev: 0.00313 -> 0.00738: 2.3603x larger

### startup_nosite ###
Min: 0.164510 -> 0.290544: 1.77x slower
Avg: 0.169833 -> 0.301109: 1.77x slower
Significant (t=-286.30)
Stddev: 0.00211 -> 0.00407: 1.9310x larger

As a baseline, 2.7 vs default:

### normal_startup ###
Min: 0.368916 -> 0.521758: 1.41x slower
Avg: 0.376784 -> 0.531883: 1.41x slower
Significant (t=-172.82)
Stddev: 0.00423 -> 0.00474: 1.1207x larger

### startup_nosite ###
Min: 0.165156 -> 0.309090: 1.87x slower
Avg: 0.171516 -> 0.319004: 1.86x slower
Significant (t=-283.45)
Stddev: 0.00334 -> 0.00399: 1.1948x larger

--Guido van Rossum (python.org/~guido)
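As a sanity check on how the summary figures relate to the raw timings, the following sketch recomputes the "x faster / x slower" ratios and the "14%" claim from the minimum normal_startup timings quoted above. The dictionary labels and the helper name describe are illustrative assumptions; only the numbers come from the message.

```python
# Minimum normal_startup timings (seconds) copied from the benchmark output.
timings = {
    "default vs frozen": (0.524812, 0.473339),
    "2.7 vs frozen": (0.367571, 0.465264),
    "2.7 vs default": (0.368916, 0.521758),
}


def describe(before, after):
    """Format a before/after pair the way the benchmark output does."""
    if after < before:
        return f"{before / after:.2f}x faster"
    return f"{after / before:.2f}x slower"


for label, (before, after) in timings.items():
    print(f"{label}: {describe(before, after)}")

# "Shaves 14% from the slowdown" is the gap between the two slowdowns
# relative to 2.7, in percentage points: 41% - 27%.
slowdown_default = 0.521758 / 0.368916 - 1
slowdown_frozen = 0.465264 / 0.367571 - 1
saved_points = round(slowdown_default * 100) - round(slowdown_frozen * 100)
print(f"slowdown reduced by {saved_points} percentage points")
```

Note that the two comparisons against 2.7 come from separate benchmark runs, so their 2.7 baselines differ slightly (0.367571 vs 0.368916 seconds).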
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu, Apr 17, 2014 at 1:31 PM, Brett Cannon bcan...@gmail.com wrote: On Thu Apr 17 2014 at 3:21:49 PM, Guido van Rossum gu...@python.org wrote: I'm sorry to keep asking dumb questions, but your description didn't jog my understanding of what you are comparing here. What is slower than what? Startup where the stdlib is entirely in a zip file is slower than the status quo of reading from files. That deserves more research. I'm not sure I believe we understand exactly what goes on in each case -- perhaps our zip reading code isn't as efficient as it could be? It would also be interesting to compare different platforms. IOW it looks like speeding up startup from an import perspective requires either freezing modules -- for about a 10% boost -- or some fundamental change in import that no one has thought of yet. And it's probably premature. (Unless you already have a prototype and it shows a solid speedup.) -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 17.04.14 20:47, Brett Cannon wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced). ### normal_startup ### Min: 0.510211 -> 2.667958: 5.23x slower Not sure how to interpret this: what is 5.23x slower than what? Regards, Martin
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 17.04.14 20:47, Brett Cannon wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced). [I found the answer on what is being compared in replies] So how did you create the zip file? Any chance that you may have compressed the pyc files? Regards, Martin
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Thu Apr 17 2014 at 5:21:14 PM, Martin v. Löwis mar...@v.loewis.de wrote: On 17.04.14 20:47, Brett Cannon wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced). [I found the answer on what is being compared in replies] Yeah, I did it in under 5 minutes on a whim so I wasn't entirely thinking when I posted the numbers. So how did you create the zip file? zip ../python35.zip -r . Any chance that you may have compressed the pyc files? Yes. -Brett
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 04/17/2014 06:06 PM, Brett Cannon wrote: On Thu Apr 17 2014 at 5:21:14 PM, Martin v. Löwis mar...@v.loewis.de wrote: On 17.04.14 20:47, Brett Cannon wrote: Because people keep bringing it up, below are the results of hacking up the interpreter to include a sys.path entry for ./python35.zip instead of hard-coding to /usr/lib/python35.zip and simply zipped up Lib/ recursively. TL;DR, zipimport performance no longer measures up (probably because of stat caching and such that importlib introduced). [I found the answer on what is being compared in replies] Yeah, I did it in under 5 minutes on a whim so I wasn't entirely thinking when I posted the numbers. So how did you create the zip file? zip ../python35.zip -r . Any chance that you may have compressed the pyc files? I think you want 'zip -0' to avoid the compression. Tres. -- Tres Seaver +1 540-429-0999 tsea...@palladion.com Palladion Software Excellence by Design http://palladion.com
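For anyone wanting to repeat the experiment without compression, the equivalent of 'zip -0 -r' can be sketched with the standard zipfile module. The helper name zip_stored and the throwaway directory standing in for Lib/ are assumptions made for illustration; this is not how the archive in the thread was built.

```python
import pathlib
import tempfile
import zipfile


def zip_stored(source_dir, archive_path):
    """Zip source_dir recursively without compression (like `zip -0 -r`)."""
    source_dir = pathlib.Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the root so imports resolve.
                zf.write(path, path.relative_to(source_dir).as_posix())


# Demo on a temporary directory standing in for Lib/.
with tempfile.TemporaryDirectory() as tmp:
    lib = pathlib.Path(tmp, "Lib")
    lib.mkdir()
    (lib / "example.py").write_text("VALUE = 1\n" * 100)
    archive = pathlib.Path(tmp, "python35.zip")
    zip_stored(lib, archive)
    with zipfile.ZipFile(archive) as zf:
        info = zf.getinfo("example.py")
        stored_uncompressed = info.compress_type == zipfile.ZIP_STORED
        sizes_match = info.compress_size == info.file_size
```

ZIP_STORED is already the default for zipfile.ZipFile; it is spelled out here to make the contrast with the compressed archive explicit.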
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 14.04.14 23:51, Brett Cannon wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. [...] Thoughts? They still get read from disk, except that it is the operating system that does the reading. So what you really save is the access to many tiny files; something that can also be achieved with the zipfile import. So I wonder how your all-frozen binary compares to a standard binary with a python35.zip. If it is comparable, I'd rather extend on that route, i.e. promote putting the standard library into a zip file in the default installation, and also find a way where (say) /usr/bin/hg could conveniently specify a zip file that will contain the Mercurial byte code. For example, we could support a -Z option for the interpreter which would allow appending a zip file to a script that gets put on sys.path. Regards, Martin
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 4/16/2014 12:25 PM, Martin v. Löwis wrote: On 14.04.14 23:51, Brett Cannon wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. [...] Thoughts? They still get read from disk, except that it is the operating system that does the reading. So what you really save is the access to many tiny files; something that can also be achieved with the zipfile import. So I wonder how your all-frozen binary compares to a standard binary with a python35.zip. If it is comparable, I'd rather extend on that route, i.e. promote putting the standard library into a zip file in the default installation, and also find a way where (say) /usr/bin/hg could conveniently specify a zip file that will contain the Mercurial byte code. For example, we could support a -Z option for the interpreter which would allow appending a zip file to a script that gets put on sys.path. This could be useful for Idle also, as its startup is noticeably sluggish and could definitely stand to be zippier. About 50 Idle modules are imported in the user process and, I presume, at least as many in the Idle process. PS. In the user process sys.modules, there are numerous null entries like these: sys.modules['idlelib.os'] sys.modules['idlelib.tokenize'] sys.modules['idlelib.io'] etcetera Does anyone know the most likely reason? -- Terry Jan Reedy
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Is this Python 2 or 3? In Python 2 it means an attempt to perform a relative import failed but an absolute one succeeded, e.g. from idlelib you imported os, so import tried idlelib.os and then os. You should definitely consider using a future import to guarantee absolute imports. On Wednesday, April 16, 2014 2:57:35 PM, Terry Reedy tjre...@udel.edu wrote: On 4/16/2014 12:25 PM, Martin v. Löwis wrote: On 14.04.14 23:51, Brett Cannon wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. [...] Thoughts? They still get read from disk, except that it is the operating system that does the reading. So what you really save is the access to many tiny files; something that can also be achieved with the zipfile import. So I wonder how your all-frozen binary compares to a standard binary with a python35.zip. If it is comparable, I'd rather extend on that route, i.e. promote putting the standard library into a zip file in the default installation, and also find a way where (say) /usr/bin/hg could conveniently specify a zip file that will contain the Mercurial byte code. For example, we could support a -Z option for the interpreter which would allow appending a zip file to a script that gets put on sys.path. This could be useful for Idle also, as its startup is noticeably sluggish and could definitely stand to be zippier. About 50 Idle modules are imported in the user process and, I presume, at least as many in the Idle process. PS. In the user process sys.modules, there are numerous null entries like these: sys.modules['idlelib.os'] sys.modules['idlelib.tokenize'] sys.modules['idlelib.io'] etcetera Does anyone know the most likely reason?
-- Terry Jan Reedy
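A quick way to look for the placeholder entries described in this message is to scan sys.modules for keys whose value is None. This is only a sketch: the helper name null_module_entries is made up for illustration, and on Python 3 the result is expected to be empty, since implicit relative imports no longer exist there.

```python
import sys


def null_module_entries(prefix=""):
    """Return sorted sys.modules keys whose value is None."""
    return sorted(
        name
        for name, module in list(sys.modules.items())
        if module is None and name.startswith(prefix)
    )


# On Python 2 these None values record failed implicit relative imports,
# e.g. "idlelib.os" when a module inside idlelib runs "import os".
placeholders = null_module_entries("idlelib.")
print(placeholders)
```

On Python 2, putting `from __future__ import absolute_import` at the top of each module in the package skips the package-relative lookup, so no placeholder is recorded.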
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 16 April 2014 12:25, Martin v. Löwis mar...@v.loewis.de wrote: On 14.04.14 23:51, Brett Cannon wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. [...] Thoughts? They still get read from disk, except that it is the operating system that does the reading. So what you really save is the access to many tiny files; something that can also be achieved with the zipfile import. So I wonder how your all-frozen binary compares to a standard binary with a python35.zip. If it is comparable, I'd rather extend on that route, i.e. promote putting the standard library into a zip file in the default installation, and also find a way where (say) /usr/bin/hg could conveniently specify a zip file that will contain the Mercurial byte code. For example, we could support a -Z option for the interpreter which would allow appending a zip file to a script that gets put on sys.path. Has anyone tried running mercurial as a zipfile with __main__.py and a prepended shebang line rather than as a collection of independent files? Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
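The "zipfile with __main__.py and a prepended shebang line" layout can be sketched with the zipapp module. Note the assumption: zipapp was only added in Python 3.5, after this thread, so it is used here purely as a convenient way to build such an archive; the application directory and its contents are made up for the example.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

with tempfile.TemporaryDirectory() as tmp:
    # A stand-in application directory; __main__.py is the entry point.
    app_dir = pathlib.Path(tmp, "app")
    app_dir.mkdir()
    (app_dir / "__main__.py").write_text("print('hello from the zip')\n")

    # create_archive writes "#!<interpreter>" on the first line, then the zip data.
    target = pathlib.Path(tmp, "app.pyz")
    zipapp.create_archive(str(app_dir), str(target), interpreter="/usr/bin/env python3")

    first_line = target.read_bytes().splitlines()[0]
    # The interpreter accepts the archive directly as its script argument.
    result = subprocess.run(
        [sys.executable, str(target)],
        capture_output=True,
        text=True,
        check=True,
    )
    output = result.stdout.strip()

print(first_line, output)
```

This works because a zip file's central directory sits at the end of the file, so leading bytes such as a shebang line do not prevent the archive from being read.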
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Wednesday, April 16, 2014 2:57:35 PM, Terry Reedy tjre...@udel.edu wrote: PS. In the user process sys.modules, there are numerous null entries like these: sys.modules['idlelib.os'] sys.modules['idlelib.tokenize'] sys.modules['idlelib.io'] etcetera On 4/16/2014 3:10 PM, Dr. Brett Cannon wrote: Is this Python 2 or 3? Py 2. I should have said so. The entries do not appear in py3. In Python 2 it means an attempt to perform a relative import failed but an absolute one succeeded, e.g. from idlelib you imported os, so import tried idlelib.os and then os. *I* have not done anything. For tokenize, for instance, the existing code just does what I thought were absolute imports, in 2 files. import tokenize Perhaps the extra entries have something to do with the fact that these startup imports are invisible to user code, just like those done by the interpreter itself on startup. 2.7 uses spawnv (and 3.4 uses subprocess) to run something like one of the following. python -c __import__('idlelib.run').run.main(False) python -c __import__('run').main(False) run.py has several normal lines with import stdlib module from idlelib import idlelib module and ditto for some of the imported idlelib modules. You should definitely consider using a future import to guarantee absolute imports. -- Terry Jan Reedy
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Tue, Apr 15, 2014 at 8:21 AM, Brett Cannon bcan...@gmail.com wrote: In my work environment (Python 2.7.2, all the heavy lifting done in C++), startup costs are dominated by dynamic linking of all our C++ libraries and their Boost wrappers: Sure, but not everyone uses Boost or has long running processes where startup time is minuscule compared to the total execution time. Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; migrating hg to Py3 will (if I understand the above figures correctly) widen that gap, so any improvement done to startup performance will give a very real advantage. ChrisA
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 14 Apr 2014 18:37, Glenn Linderman v+pyt...@g.nevcal.com wrote: On 4/14/2014 2:51 PM, Brett Cannon wrote: Freezing everything except encodings.__init__, os, and _sysconfigdata, I suppose these are omitted because they can vary in different environments? But isn't Python built for a particular environment... seems like os could be included? Seems like it would be helpful to have the utf8 encoding preloaded both to encourage people to use it rather than something else for the load-time performance gain (although likely minuscule for one encoding), and because they might as well, since they are spending the memory on it anyway! :) Via some moderately arcane hackery, UTF-8 support is already built in to the Py3 interpreter :) Cheers, Nick.
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 15.04.2014 09:45, Chris Angelico wrote: On Tue, Apr 15, 2014 at 8:21 AM, Brett Cannon bcan...@gmail.com wrote: In my work environment (Python 2.7.2, all the heavy lifting done in C++), startup costs are dominated by dynamic linking of all our C++ libraries and their Boost wrappers: Sure, but not everyone uses Boost or has long running processes where startup time is minuscule compared to the total execution time. Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; migrating hg to Py3 will (if I understand the above figures correctly) widen that gap, so any improvement done to startup performance will give a very real advantage. You might want to have a look at this project: http://pyrun.org/ It's currently Python 2 only, but we'll try to get it to work with Python 3.4 as well, now that freeze.py and some other bits have been fixed to make it work again. -- Marc-Andre Lemburg eGenix.com
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Mon, Apr 14, 2014 at 3:51 PM, Brett Cannon bcan...@gmail.com wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. So for day 1 of the sprints I decided to hack up a proof-of-concept to see what kind of performance gain it would get. Freezing everything except encodings.__init__, os, and _sysconfigdata, it speeds up startup by 11% compared to default. Compared to 2.7 it shaves 14% from the slowdown (27% slower vs. 41% slower). The full results are at the end of the email. Nice. I was hoping it would be even bigger (given the hyper-focus people put on the impact of FS-access on startup time imports), but this is definitely a significant improvement. I wonder then where the remaining slowdown lies; are there any remaining low hanging fruit elsewhere? Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. Yeah. Definitely the big question. Who cares the most about startup time? Would this improvement please them? Does that alone make it worth the increased maintenance burden? Is that group big enough or important enough to justify it? At the very least it may be good for the PR value alone, but the maintenance cost will long outlive the PR benefit. :) My guess is the best approach would be adding a Lib/_frozen directory where any modules that we treat like this would be kept to act as a reminder that you need to rebuild for them (I would probably move importlib/_bootstrap.py as well to make this consistent). That makes sense. I also wonder if we could accomplish the same thing with a marker (e.g. a comment) in each related module (and leave them where they are). A marker would allow for easily finding the freezable modules. Personally, I think the speedup would be worth it if it doesn't add significantly to the maintenance burden.
-eric
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Tue, Apr 15, 2014 at 1:45 AM, Chris Angelico ros...@gmail.com wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; migrating hg to Py3 will (if I understand the above figures correctly) widen that gap, so any improvement done to startup performance will give a very real advantage. Perhaps not so much a very real advantage as less of a distraction. It's still significantly slower than 2.7. :) -eric
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
IIRC it is no longer the case that ZIP imports (involving only one file for a lot of modules) are much faster than regular FS imports? On Tue, Apr 15, 2014 at 10:34 AM, Eric Snow ericsnowcurren...@gmail.com wrote: On Tue, Apr 15, 2014 at 1:45 AM, Chris Angelico ros...@gmail.com wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; migrating hg to Py3 will (if I understand the above figures correctly) widen that gap, so any improvement done to startup performance will give a very real advantage. Perhaps not so much a very real advantage as less of a distraction. It's still significantly slower than 2.7. :) -eric
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Tue, Apr 15, 2014 at 11:19 AM, Daniel Holth dho...@gmail.com wrote: IIRC it is no longer the case that ZIP imports (involving only one file for a lot of modules) are much faster than regular FS imports? It's definitely minimized since Python 3.3 and the caching of stat results at the directory level for a small amount of time. -Brett On Tue, Apr 15, 2014 at 10:34 AM, Eric Snow ericsnowcurren...@gmail.com wrote: On Tue, Apr 15, 2014 at 1:45 AM, Chris Angelico ros...@gmail.com wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; migrating hg to Py3 will (if I understand the above figures correctly) widen that gap, so any improvement done to startup performance will give a very real advantage. Perhaps not so much a very real advantage as less of a distraction. It's still significantly slower than 2.7. :) -eric
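One visible consequence of the directory-level caching mentioned here is that a module created after the interpreter has started may not be seen until the finder caches are discarded. The sketch below shows the documented remedy, importlib.invalidate_caches(); the module name late_module_demo and the temporary directory are hypothetical, chosen only for the example.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        # Hypothetical module, written after the interpreter started.
        pathlib.Path(tmp, "late_module_demo.py").write_text("VALUE = 42\n")
        # Finders may hold a stale directory listing; discard it.
        importlib.invalidate_caches()
        module = importlib.import_module("late_module_demo")
        value = module.VALUE
    finally:
        # Undo the changes to the import state made for this demo.
        sys.path.remove(tmp)
        sys.modules.pop("late_module_demo", None)

print(value)
```

The cache is what lets the import system avoid repeated stat calls for every candidate file name, which narrows the gap between filesystem and zip imports.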
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 15/04/2014 09:45, Chris Angelico wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; Keep in mind those shoot-outs usually rely on large repositories and/or non-trivial operations, so startup time is not necessarily a significant contributor in Mercurial being slower (when it actually is slower than git, which may not be all the time). Regards Antoine.
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 14/04/2014 23:51, Brett Cannon wrote: It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. So for day 1 of the sprints I decided to hack up a proof-of-concept to see what kind of performance gain it would get. Freezing everything except encodings.__init__, os, and _sysconfigdata, it speeds up startup by 11% compared to default. That sounds like a rather small number for the amount of complication and opacity it adds into the build and startup process. Regards Antoine.
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Wed, Apr 16, 2014 at 2:40 AM, Antoine Pitrou solip...@pitrou.net wrote: On 15/04/2014 09:45, Chris Angelico wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; Keep in mind those shoot-outs usually rely on large repositories and/or non-trivial operations, so startup time is not necessarily a significant contributor in Mercurial being slower (when it actually is slower than git, which may not be all the time). I'm talking also about the feel of actual daily use, partly on big repos like git (git), CPython (hg), and Pike (git), and partly on some smaller ones. Whether it's startup cost or operational cost I don't know, but if I want it consistently fast, I generally go for git. ChrisA
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Can we please stop the argument about Hg vs. Git? On Tue, Apr 15, 2014 at 12:54 PM, Chris Angelico ros...@gmail.com wrote: On Wed, Apr 16, 2014 at 2:40 AM, Antoine Pitrou solip...@pitrou.net wrote: On 15/04/2014 09:45, Chris Angelico wrote: Specific use-case that I can see: Mercurial. In a git vs hg shoot-out, git will usually win on performance, and hg is using Py2; Keep in mind those shoot-outs usually rely on large repositories and/or non-trivial operations, so startup time is not necessarily a significant contributor in Mercurial being slower (when it actually is slower than git, which may not be all the time). I'm talking also about the feel of actual daily use, partly on big repos like git (git), CPython (hg), and Pike (git), and partly on some smaller ones. Whether it's startup cost or operational cost I don't know, but if I want it consistently fast, I generally go for git. ChrisA -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Wed, Apr 16, 2014 at 4:54 AM, Guido van Rossum gu...@python.org wrote:
> Can we please stop the argument about Hg vs. Git?

My apologies. All I was saying was that hg is a use case where startup performance really does matter, as opposed to the ones presented earlier in the thread where a process stays in memory longer. It wasn't meant to devolve like that.

ChrisA
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
Brett Cannon, 14.04.2014 23:51:
> It was realized during PyCon that since we are freezing importlib we could now consider freezing all the modules to cut out having to stat or read them from disk. So for day 1 of the sprints I decided to hack up a proof-of-concept to see what kind of performance gain it would get.
>
> Freezing everything except encodings.__init__, os, and _sysconfigdata, it speeds up startup by 11% compared to default. Compared to 2.7 it shaves 14% from the slowdown (27% slower vs. 41% slower). The full results are at the end of the email.
>
> Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules is enough to warrant putting in the effort to make this work. My guess is the best approach would be adding a Lib/_frozen directory where any modules that we treat like this would be kept to act as a reminder that you need to rebuild for them (I would probably move importlib/_bootstrap.py as well to make this consistent).
>
> Thoughts?

Alternatively, the modules could be compiled with Cython. That would not only speed up the loading time but also the initialisation time and runtime. And since you'd keep the original .py files next to the .so files, you'd still get proper tracebacks.

Compiling the modules natively would also enable linking them right into the interpreter core, BTW. But that would substantially increase its size. Maybe some of them could still be worth being linked in.

Stefan
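The two percentages in Brett's message fit together under one reading. The sketch below assumes startup times normalised so that 2.7 is 1.00, which gives 1.41 for default and 1.27 for the frozen build. These normalised values and the variable names are illustrative assumptions; the message does not say how the 11% was computed, and the full results are not quoted here.

```python
# Assumed normalised startup times (2.7 = 1.00), taken from
# "27% slower vs. 41% slower" in the message.
py27 = 1.00
default = 1.41   # default build: 41% slower than 2.7
frozen = 1.27    # frozen build: 27% slower than 2.7

# Percentage points of the slowdown relative to 2.7 that freezing recovers.
points_recovered = (default - frozen) / py27 * 100

# How much slower the default build is than the frozen one, in percent.
speedup_vs_default = (default / frozen - 1) * 100

print(f"recovered: {points_recovered:.0f} percentage points")
print(f"default is {speedup_vs_default:.1f}% slower than frozen")
```

Under these assumptions the "14%" is a difference in percentage points against 2.7, while the "11%" is a ratio between the two Python 3 builds.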
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 4/14/2014 2:51 PM, Brett Cannon wrote:
> consider freezing all the modules
> ...
> Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules

"all" versus "select number". So I'm guessing the proposal is to freeze all the modules that Python imports just to get itself running, which would consume no additional memory when frozen and would save time per your performance numbers, rather than the whole stdlib, which is what is sort of implied by "all".
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Mon, Apr 14, 2014 at 4:51 PM, Brett Cannon bcan...@gmail.com wrote:
> Thoughts?

Interesting idea, but YAGNI?

In my work environment (Python 2.7.2, all the heavy lifting done in C++), startup costs are dominated by dynamic linking of all our C++ libraries and their Boost wrappers:

    % time python -c 'import tradelink.snake.v11_2 ; raise SystemExit'
    real 0m0.671s
    user 0m0.405s
    sys 0m0.044s

    % time python -c 'raise SystemExit'
    real 0m0.022s
    user 0m0.011s
    sys 0m0.009s

Skip
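The same comparison can be made from Python instead of the shell's `time`. This is a minimal sketch, not what Skip ran: `startup_seconds` is a hypothetical helper that launches a fresh interpreter a few times and keeps the best wall-clock time, and `json` is only a stand-in for a heavier import such as the one in the message.

```python
import subprocess
import sys
import time


def startup_seconds(args=("-c", "raise SystemExit"), runs=5, python=sys.executable):
    """Return the best wall-clock time (seconds) to start and exit an interpreter."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # A bare "raise SystemExit" exits with status 0, so no error handling is needed.
        subprocess.run([python, *args])
        best = min(best, time.perf_counter() - start)
    return best


bare = startup_seconds()
with_import = startup_seconds(("-c", "import json; raise SystemExit"))
print(f"bare interpreter: {bare:.3f}s")
print(f"with one import:  {with_import:.3f}s")
```

Taking the minimum over several runs reduces noise from disk caching and other processes. It measures wall-clock time only, so it corresponds to the `real` figure rather than `user` or `sys`.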
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Mon, Apr 14, 2014 at 6:15 PM, Skip Montanaro s...@pobox.com wrote:
> On Mon, Apr 14, 2014 at 4:51 PM, Brett Cannon bcan...@gmail.com wrote:
>> Thoughts?
>
> Interesting idea, but YAGNI?

Not at all. Think of every script you execute that's written in Python. One of the things the Mercurial folks say is hindering any motivation to switch to Python 3 is the startup performance.

> In my work environment (Python 2.7.2, all the heavy lifting done in C++), startup costs are dominated by dynamic linking of all our C++ libraries and their Boost wrappers:

Sure, but not everyone uses Boost or has long-running processes where startup time is minuscule compared to the total execution time.

-Brett

>     % time python -c 'import tradelink.snake.v11_2 ; raise SystemExit'
>     real 0m0.671s
>     user 0m0.405s
>     sys 0m0.044s
>
>     % time python -c 'raise SystemExit'
>     real 0m0.022s
>     user 0m0.011s
>     sys 0m0.009s
>
> Skip
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On Mon, Apr 14, 2014 at 6:07 PM, Glenn Linderman v+pyt...@g.nevcal.com wrote:
> On 4/14/2014 2:51 PM, Brett Cannon wrote:
>> consider freezing all the modules
>> ...
>> Now the question is whether the maintenance cost of having to rebuild Python for a select number of stdlib modules
>
> "all" versus "select number". So I'm guessing the proposal is to freeze all the modules that Python imports just to get itself running, which would consume no additional memory when frozen and would save time per your performance numbers, rather than the whole stdlib, which is what is sort of implied by "all".

Yes, exactly.
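To see which modules fall under "imports just to get itself running" on a given build, one can ask a fresh interpreter what is already in `sys.modules` before any user code runs. This is a rough sketch, not the tooling used for the proof-of-concept: `startup_module_origins` is a hypothetical helper, and the result varies with Python version, platform, and site configuration.

```python
import subprocess
import sys

# Code run in the child interpreter: print each loaded module and where it came from.
_CHILD_CODE = (
    "import sys\n"
    "for name, mod in sorted(sys.modules.items()):\n"
    "    spec = getattr(mod, '__spec__', None)\n"
    "    print(name, getattr(spec, 'origin', None))\n"
)


def startup_module_origins(python=sys.executable):
    """Map each module loaded at startup to its spec origin (as a string)."""
    result = subprocess.run(
        [python, "-I", "-c", _CHILD_CODE],  # -I: isolated mode, ignores user env
        capture_output=True, text=True, check=True,
    )
    origins = {}
    for line in result.stdout.splitlines():
        # Module names contain no spaces, so split only once; paths may contain spaces.
        name, _, origin = line.partition(" ")
        origins[name] = origin
    return origins


origins = startup_module_origins()
for name, origin in origins.items():
    print(f"{name:30} {origin}")
```

An origin that is a file path means the module was read from disk at startup, which is the stat-and-read cost the proposal aims to remove. Modules served by the frozen importer typically report `frozen`, and compiled-in modules report `built-in`.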
Re: [Python-Dev] this is what happens if you freeze all the modules required for startup
On 4/14/2014 2:51 PM, Brett Cannon wrote:
> Freezing everything except encodings.__init__, os, and _sysconfigdata,

I suppose these are omitted because they can vary in different environments? But isn't Python built for a particular environment... seems like os could be included?

Seems like it would be helpful to have the utf8 encoding preloaded both to encourage people to use it rather than something else for the load-time performance gain (although likely minuscule for one encoding), and because they might as well, since they are spending the memory on it anyway! :)