Re: [Python-Dev] More optimisation ideas
On Fri, Feb 5, 2016, at 10:33 AM, Emile van Sebille wrote:
> On 2/5/2016 9:37 AM, Alexander Walters wrote:
> > On 2/5/2016 12:27, Emile van Sebille wrote:
> >> On 2/1/2016 9:20 AM, Ethan Furman wrote:
> >>> On 02/01/2016 08:40 AM, R. David Murray wrote:
> >>>> On the other hand, if the distros go the way Nick has (I think) been
> >>>> advocating, and have a separate 'system python for system scripts' that
> >>>> is independent of the one installed for user use, having the system-only
> >>>> python be frozen and sourceless would actually make sense on a couple of
> >>>> levels.
> >>>
> >>> Agreed.
> >>
> >> Except for that nasty licensing issue requiring source code.
> >>
> >> Emile
> > Licensing requires, in the GPL at least, that the *modified* sources be
> > made *available*, not that they be shipped with the product. Looking at
> > the Python license, and what tools already do, there is zero need to
> > ship the source to stay compliant.
>
> Hmm, the annotated Open Source Definition explicitly states "The program
> must include source code" -- how did I misinterpret that?

Couple things.

First, the OSD is not authoritative. Python's license establishes the rules of its distribution: that Python's license is considered compatible with the OSD doesn't actually give your reading of anything on the OSD page any binding meaning.

Second, OSD's Rule 2 means that those who are distributing Python -- the PSF, originally -- must provide source code if they're distributing it under Python's license, but it doesn't actually mean it must be packaged with it in every download. In fact, it's not today. The standard library source is included in normal downloads, but the C source of Python isn't. You can download it readily, though, so that's fine. It's fully compliant with the OSD.

But! If Debian (pulling them out of a hat randomly) is distributing Python, they aren't the PSF, and notably are not bound by the OSD rules, only by Python's license terms. The PSF satisfied their requirements to the licensing terms when releasing Python, but now Debian has Python, and they are distributing it -- that's an entirely separate act, and you must look at them as a separate actor in terms of the license.

They don't have to distribute it under the same license. They must be ABLE to (as OSD's Rule 3 says), but they don't HAVE to. Some random person can take Python, rename it Snakey, and release it under almost any license they want and give no one the source code at all. Python has from the beginning allowed this: it's actually in quite a few closed source / proprietary products that never advertise it and provide no source, entirely legally and ethically -- Python's gone out of its way to support this sort of use-case.

As it happens, Debian usually distributes something very close to the official release (sometimes they backport patches and such), and always does so under the same license as Python (AFAICT), but they don't *have* to.

GPL is copyleft and requires its derivative works to be GPL'd (or at least, no more restrictive than GPL) -- so in GPL, to distribute it you MUST distribute it under GPL-compatible terms. Python's license is permissive and allows anyone to do basically anything, INCLUDING produce closed source releases if someone wanted to, or just release modifications or modules that are available under different licenses.

The OSD encompasses both ends of the spectrum: the GPL's mandate of source access and the OSD's mandate of the receiver to be able to distribute under the same terms they received (notably, NOT the same terms it was originally released under).

--
Stephen Hansen
m e @ i x o k a i . i o
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 2/5/2016 12:27, Emile van Sebille wrote: On 2/1/2016 9:20 AM, Ethan Furman wrote: On 02/01/2016 08:40 AM, R. David Murray wrote: On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels. Agreed. Except for that nasty licensing issue requiring source code. Emile Licensing requires, in the GPL at least, that the *modified* sources be made *available*, not that they be shipped with the product. Looking at the Python license, and what tools already do, there is zero need to ship the source to stay compliant. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On Fri, 5 Feb 2016 at 10:34 Emile van Sebillewrote: > On 2/5/2016 9:37 AM, Alexander Walters wrote: > > > > > > On 2/5/2016 12:27, Emile van Sebille wrote: > >> On 2/1/2016 9:20 AM, Ethan Furman wrote: > >>> On 02/01/2016 08:40 AM, R. David Murray wrote: > >> > On the other hand, if the distros go the way Nick has (I think) been > advocating, and have a separate 'system python for system scripts' > that > is independent of the one installed for user use, having the > system-only > python be frozen and sourceless would actually make sense on a > couple of > levels. > >>> > >>> Agreed. > >> > >> Except for that nasty licensing issue requiring source code. > >> > >> Emile > > Licensing requires, in the GPL at least, that the *modified* sources be > > made *available*, not that they be shipped with the product. Looking at > > the Python license, and what tools already do, there is zero need to > > ship the source to stay compliant. > > Hmm, the annotated Open Source Definition explicitly states "The program > must include source code" -- how did I misinterpret that? > Because you left off the part following: "... and must allow distribution in source code as well as compiled form". This is entirely a discussion of distribution in a compiled form. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 2/1/2016 9:20 AM, Ethan Furman wrote: On 02/01/2016 08:40 AM, R. David Murray wrote: On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels. Agreed. Except for that nasty licensing issue requiring source code. Emile ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 2/5/2016 9:37 AM, Alexander Walters wrote: On 2/5/2016 12:27, Emile van Sebille wrote: On 2/1/2016 9:20 AM, Ethan Furman wrote: On 02/01/2016 08:40 AM, R. David Murray wrote: On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels. Agreed. Except for that nasty licensing issue requiring source code. Emile Licensing requires, in the GPL at least, that the *modified* sources be made *available*, not that they be shipped with the product. Looking at the Python license, and what tools already do, there is zero need to ship the source to stay compliant. Hmm, the annotated Open Source Definition explicitly states "The program must include source code" -- how did I misinterpret that? Emile http://opensource.org/osd-annotated ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On Friday, February 5, 2016 11:57 AM, Emile van Sebille wrote:
> Aah, 'must' is less restrictive in this context than I expected. When
> you combine the two halves the first part might be more accurately
> phrased as 'The program must make source code available' rather than
> 'must include' which I understood to mean 'ship with'.

First, step back and think of this in common sense terms: If being open source required any Python installation to have the .py source to the .pyc or .zip files in the stdlib, surely it would also require any Python installation to have the .c source to the interpreter too. But lots of people have Python without having the .c source.

Also, the GPL isn't typical of all open source licenses, it's only typical of _copyleft_ licenses. Permissive licenses, like Python's, are very different. Copyleft licenses are designed to make sure that all derived works are also copylefted; permissive licenses are designed to permit derived works as widely as possible. As the Python license specifically says, "All Python licenses, unlike the GPL, let you distribute a modified version without making your changes open source."

Meanwhile, the fact that someone has decided that the Python license qualifies under the Open Source Definition doesn't mean the OSD is the right way to understand it. Read the license itself, or one of the summaries at opensource.org or fsf.org. (And if you still can't figure something out, and it's important to your work, you almost certainly need to ask a lawyer.) So, if you think the first sentence of section 2 of the OSD contradicts the explanation in the rest of the paragraph--well, even if you're right, that doesn't affect Python's license at all.

Finally, if you want to see what it takes to actually make all the terms unambiguous both to ordinary human beings and to legal codes, see the GPL FAQ sections on their definitions of "propagate" and "convey". It may take you lots of careful reading to understand it, but when you finally do, it's definitely unambiguous.
Re: [Python-Dev] More optimisation ideas
On 2/5/2016 10:38 AM, Brett Cannon wrote: On Fri, 5 Feb 2016 at 10:34 Emile van Sebille> wrote: >> Except for that nasty licensing issue requiring source code. >> >> Emile > Licensing requires, in the GPL at least, that the *modified* sources be > made *available*, not that they be shipped with the product. Looking at > the Python license, and what tools already do, there is zero need to > ship the source to stay compliant. Hmm, the annotated Open Source Definition explicitly states "The program must include source code" -- how did I misinterpret that? Because you left off the part following: "... and must allow distribution in source code as well as compiled form". This is entirely a discussion of distribution in a compiled form. Aah, 'must' is less restrictive in this context than I expected. When you combine the two halves the first part might be more accurately phrased as 'The program must make source code available' rather than 'must include' which I understood to mean 'ship with'. Emile ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 5 February 2016 at 15:05, Steven D'Aprano wrote:
> (I'm not even sure if this suggestion makes sense, since I'm not really
> sure what "freezing" the stdlib entails. Is it documented anywhere?)

It's not particularly well documented - most of the docs you'll find are about freeze utilities that don't explain how they work, or the FrozenImporter, which doesn't explain how to *create* a frozen module and link it into your Python executable.

Your approach of thinking of a frozen module as a generated .pyc file that has been converted to a builtin module is a pretty good working model, though. (It isn't *entirely* accurate, but the discrepancies are sufficiently arcane that they aren't going to matter in any case that doesn't involve specifically poking around at the import-related attributes).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
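The working model described above (a frozen module is roughly a .pyc compiled into the interpreter) can be inspected from Python itself. The following is only a minimal sketch, assuming CPython, where `_frozen_importlib` is the frozen copy of the import bootstrap; `describe_frozen` is a hypothetical helper name, not a stdlib function.

```python
import importlib.machinery


def describe_frozen(name):
    """Report how the import system sees a frozen module (CPython-specific)."""
    spec = importlib.machinery.FrozenImporter.find_spec(name)
    if spec is None:
        return None  # not a frozen module in this interpreter
    return {
        "name": spec.name,
        # For frozen modules the origin is the string "frozen", not a file path.
        "origin": spec.origin,
        "loader": spec.loader.__name__,
        # Frozen modules carry code objects only, so no source text is available.
        "source": importlib.machinery.FrozenImporter.get_source(name),
    }


info = describe_frozen("_frozen_importlib")
print(info)
```

The missing source is what the rest of the thread is worried about: nothing on disk to open in an editor, and nothing for tools to display.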
Re: [Python-Dev] More optimisation ideas
On 2/4/2016 12:18 PM, Sven R. Kunze wrote:
> On 04.02.2016 14:09, Nick Coghlan wrote:
>> On 2 February 2016 at 06:39, Andrew Barnert via Python-Dev wrote:
>>> On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote:
>>>> If the stdlib were to use implicit namespace packages
>>>> (https://www.python.org/dev/peps/pep-0420/ ) and the various
>>>> loaders/importers as well, then python could do what I've done with an
>>>> embedded python application for years. Freeze the stdlib (or put it
>>>> in a zipfile or whatever is fast). Then arrange PYTHONPATH to first
>>>> look on the filesystem and then look in the frozen/zipped storage.
>>>
>>> This is a great solution for experienced developers, but I think it
>>> would be pretty bad for novices or transplants from other languages
>>> (maybe even including Python 2).
>>>
>>> There are already multiple duplicate questions every month on
>>> StackOverflow from people asking "how do I find the source to stdlib
>>> module X". The canonical answer starts off by explaining how to import
>>> the module and use its __file__, which everyone is able to handle. If
>>> we have to instead explain how to work out the .py name from the
>>> qualified module name, how to work out the stdlib path from sys.path,
>>> and then how to find the source from those two things, with the caveat
>>> that it may not be installed at all on some platforms, and how to make
>>> sure what they're asking about really is a stdlib module, and how to
>>> make sure they aren't shadowing it with a module elsewhere on sys.path,
>>> that's a lot more complicated. Especially when you consider that some
>>> people on Windows and Mac are writing Python scripts without ever
>>> learning how to use the terminal or find their Python packages via
>>> Explorer/Finder.
>>
>> For folks that *do* know how to use the terminal:
>>
>> $ python3 -m inspect --details inspect
>> Target: inspect
>> Origin: /usr/lib64/python3.4/inspect.py
>> Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc
>> Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0>
>>
>> (And if they just want to *read* the source code, then leaving out
>> "--details" prints the full module source, and would work even if the
>> standard library were in a zip archive)

This is completely inadequate as a replacement for loading source into an editor, even if just for reading.

First, on Windows, the console defaults to 300 lines. Print more and only the last 300 lines remain. The max buffer size is . But setting the buffer to that is obnoxious because the buffer is then padded with blank lines to make lines. The little rectangle that one grabs in the scrollbar is then scaled down to almost nothing, becoming hard to grab.

Second is navigation. No Find, Find-next, or Find-all. Because of padding, moving to the unpadded 'bottom of file' is difficult.

Third, for a repository version, I would have to type, without error, instead of 'python3', some version of, for instance, some suffix of 'F:/python/dev/35/PcBuild//python_d.exe'. "" depends, I believe, on the build options.

> I want to see and debug also core Python in PyCharm and this is not
> acceptable.
>
> If you want to make it opt-in, fine. But opt-out is a no-go. I have a
> side-by-side comparison as we use Java and Python in production. It's
> the *ease of access* that makes Python great compared to Java.
>
> @Andrew
> Even for experienced developers it just sucks and there are more
> important things to do.

I agree that removing stdlib python source files by default is a poor idea. The disk space saved is trivial. So, for me, is nearly all of the time saved.

Over recent versions, more and more source files have been linked to in the docs. Guido recently approved of linking the rest. Removing source contradicts this trend.

Easily loading modules, including stdlib modules, into an IDLE Editor Window is a documented feature that goes back to the original commit in Aug 2000. We do not usually break stdlib features without acknowledgement, some discussion, and a positive decision to do so.

Someone has already mentioned the degradation of tracebacks.

So why not just leave the source files alone in /Lib? As far as I can see, they would not hurt anything. At least on Windows, zip files are treated as directories and python35.zip comes before /Lib on sys.path. The Windows installer currently has an option, selected by default I believe, to run compileall. Add to compileall an option to compile everything to python35.zip rather than __pycache__, and use that in the installer. Even if the zip is included in the installer, compileall-zip + source files would let adventurous people patch their stdlib files.

Editing a stdlib file, to see if a confirmed bug disappeared (it did), was how I made my first code contribution. If I had had to download and set up svn and maybe Visual C to try a one line change, I would not have done it.

--
Terry Jan Reedy
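The compileall-to-zip option proposed above does not exist as far as this thread says, but the general idea (bytecode in a zip, source left alone on disk) can be tried with the stdlib's `zipfile.PyZipFile`. This is only an illustrative sketch on a throwaway module; the names `zip_demo_mod`, `demo_bytecode.zip`, `build_bytecode_zip` and `demo` are made up for the example, and it says nothing about how an installer would actually do it.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_bytecode_zip(source_file, archive_path):
    """Compile one .py file and store only its .pyc in a zip archive."""
    with zipfile.PyZipFile(archive_path, "w") as archive:
        archive.writepy(str(source_file))  # compiles, then adds the .pyc
        return archive.namelist()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "zip_demo_mod.py"  # hypothetical module
        source.write_text("ANSWER = 42\n")
        archive_path = str(Path(tmp) / "demo_bytecode.zip")
        names = build_bytecode_zip(source, archive_path)

        # Only the archive goes on sys.path, so the import is served
        # from the bytecode inside the zip, not from the .py file.
        sys.path.insert(0, archive_path)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("zip_demo_mod")
            return names, module.ANSWER
        finally:
            sys.path.remove(archive_path)
            sys.modules.pop("zip_demo_mod", None)


names, answer = demo()
print(names, answer)
```

The .py file stays on disk for reading and patching, which is the point of the suggestion.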
Re: [Python-Dev] More optimisation ideas
On 2 February 2016 at 02:40, R. David Murraywrote: > On the other hand, if the distros go the way Nick has (I think) been > advocating, and have a separate 'system python for system scripts' that > is independent of the one installed for user use, having the system-only > python be frozen and sourceless would actually make sense on a couple of > levels. While omitting Python source files does let us reduce base image sizes (quite significantly), the current perspective in Fedora and Project Atomic is that going bytecode-only (whether frozen or not) breaks too many things to be worthwhile. As one simple example, it means tracebacks no longer include source code lines, dramatically increasing the difficulty of debugging failures. As such, we're more likely to pursue minimisation efforts by splitting the standard library up into "stuff essential distro components use" and "the rest of the standard library that upstream defines" than by figuring out how to avoid shipping source files (I believe Debian already makes this distinction with the python-minimal vs python split). Zipping up the standard library doesn't break tracebacks though, so it's potentially worth exploring that option further. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
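The traceback point can be reproduced on a small scale. The sketch below is an illustration only, not how any distro builds its images: it imports a throwaway module once with its .py file present and once as a bare .pyc, then compares the formatted tracebacks. The module name `tb_demo_mod` and the helper `traceback_for` are invented for the example.

```python
import importlib
import py_compile
import sys
import tempfile
import traceback
from pathlib import Path

SOURCE = "def boom():\n    raise ValueError('sentinel failure')\n"


def traceback_for(sourceless):
    """Import a throwaway module, make it fail, return the traceback text."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "tb_demo_mod.py"
        src.write_text(SOURCE)
        if sourceless:
            # Keep only a legacy-style .pyc next to where the source was.
            py_compile.compile(str(src), cfile=str(Path(tmp) / "tb_demo_mod.pyc"))
            src.unlink()
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("tb_demo_mod")
            try:
                module.boom()
            except ValueError:
                return traceback.format_exc()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("tb_demo_mod", None)


with_source = traceback_for(sourceless=False)
without_source = traceback_for(sourceless=True)
print(without_source)
```

With the source present, the traceback shows the `raise ValueError(...)` line; in the bytecode-only case only the file name, line number and function name remain.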
Re: [Python-Dev] More optimisation ideas
On 2 February 2016 at 06:39, Andrew Barnert via Python-Devwrote: > On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote: >> >> If the stdlib were to use implicit namespace packages >> ( https://www.python.org/dev/peps/pep-0420/ ) and the various >> loaders/importers as well, then python could do what I've done with an >> embedded python application for years. Freeze the stdlib (or put it >> in a zipfile or whatever is fast). Then arrange PYTHONPATH to first >> look on the filesystem and then look in the frozen/ziped storage. > > This is a great solution for experienced developers, but I think it would be > pretty bad for novices or transplants from other languages (maybe even > including Python 2). > > There are already multiple duplicate questions every month on StackOverflow > from people asking "how do I find the source to stdlib module X". The > canonical answer starts off by explaining how to import the module and use > its __file__, which everyone is able to handle. If we have to instead explain > how to work out the .py name from the qualified module name, how to work out > the stdlib path from sys.path, and then how to find the source from those two > things, with the caveat that it may not be installed at all on some > platforms, and how to make sure what they're asking about really is a stdlib > module, and how to make sure they aren't shadowing it with a module elsewhere > on sys.path, that's a lot more complicated. Especially when you consider that > some people on Windows and Mac are writing Python scripts without ever > learning how to use the terminal or find their Python packages via > Explorer/Finder. 
For folks that *do* know how to use the terminal: $ python3 -m inspect --details inspect Target: inspect Origin: /usr/lib64/python3.4/inspect.py Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0> (And if they just want to *read* the source code, then leaving out "--details" prints the full module source, and would work even if the standard library were in a zip archive) Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On Thu, Feb 04, 2016 at 07:58:30PM -0500, Terry Reedy wrote: > >>For folks that *do* know how to use the terminal: > >> > >>$ python3 -m inspect --details inspect > >>Target: inspect > >>Origin: /usr/lib64/python3.4/inspect.py > >>Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc > >>Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0> > >> > >>(And if they just want to *read* the source code, then leaving out > >>"--details" prints the full module source, and would work even if the > >>standard library were in a zip archive) > > This is completely inadequate as a replacement for loading source into > an editor, even if just for reading. [...] I agree with Terry. The inspect trick Nick describes above is a great feature to have, but it's not a substitute for opening the source in an editor, not even on OSes where the command line tools are more powerful than Windows' default tools. [...] > I agree that removing stdlib python source files by default is an poor > idea. The disk space saved is trivial. So, for me, would be nearly all > of the time saving. I too would be very reluctant to remove the source files from Python by default, but I have an alternative. I don't know if this is a ridiculous idea or not, but now that the .pyc bytecode files are kept in a separate __pycache__ directory, could we freeze that directory and leave the source files available for reading? (I'm not even sure if this suggestion makes sense, since I'm not really sure what "freezing" the stdlib entails. Is it documented anywhere?) -- Steve ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 04.02.2016 14:09, Nick Coghlan wrote: On 2 February 2016 at 06:39, Andrew Barnert via Python-Devwrote: On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote: If the stdlib were to use implicit namespace packages ( https://www.python.org/dev/peps/pep-0420/ ) and the various loaders/importers as well, then python could do what I've done with an embedded python application for years. Freeze the stdlib (or put it in a zipfile or whatever is fast). Then arrange PYTHONPATH to first look on the filesystem and then look in the frozen/ziped storage. This is a great solution for experienced developers, but I think it would be pretty bad for novices or transplants from other languages (maybe even including Python 2). There are already multiple duplicate questions every month on StackOverflow from people asking "how do I find the source to stdlib module X". The canonical answer starts off by explaining how to import the module and use its __file__, which everyone is able to handle. If we have to instead explain how to work out the .py name from the qualified module name, how to work out the stdlib path from sys.path, and then how to find the source from those two things, with the caveat that it may not be installed at all on some platforms, and how to make sure what they're asking about really is a stdlib module, and how to make sure they aren't shadowing it with a module elsewhere on sys.path, that's a lot more complicated. Especially when you consider that some people on Windows and Mac are writing Python scripts without ever learning how to use the terminal or find their Python packages via Explorer/Finder. 
For folks that *do* know how to use the terminal: $ python3 -m inspect --details inspect Target: inspect Origin: /usr/lib64/python3.4/inspect.py Cached: /usr/lib64/python3.4/__pycache__/inspect.cpython-34.pyc Loader: <_frozen_importlib.SourceFileLoader object at 0x7f0d8d23d9b0> (And if they just want to *read* the source code, then leaving out "--details" prints the full module source, and would work even if the standard library were in a zip archive) I want to see and debug also core Python in PyCharm and this is not acceptable. If you want to make it opt-in, fine. But opt-out is a no-go. I have a side-by-side comparison as we use Java and Python in production. It's the *ease of access* that makes Python great compared to Java. @Andrew Even for experienced developers it just sucks and there are more important things to do. Best, Sven ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 02/01/2016 08:40 AM, R. David Murray wrote: On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano wrote: I find that being able to easily open stdlib .py files in a text editor to read the source is extremely valuable. I've learned much more from reading the source than from (e.g.) StackOverflow. Likewise, it's often handy to do a grep over the stdlib. When you talk about freezing the stdlib, what exactly does that mean? - will the source files still be there? Well, Brett said it would be optional, though perhaps the above paragraph is asking about doing it in our Windows build. But the linux distros might make also use the option if it exists, so the question is very meaningful. However, you'd have to ask the distro if the source would be shipped in the linux case, and I'd guess not in most cases. I don't know about anyone else, but on my own development systems it is not that unusual for me to *edit* the stdlib files (to add debug prints) while debugging my own programs. Freeze would definitely interfere with that. I could, of course, install a separate source build on my dev system, but I thought it worth mentioning as a factor. Yup, so do I. On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels. Agreed. -- ~Ethan~ ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano wrote:
> On Sun, Jan 31, 2016 at 08:23:00PM +, Brett Cannon wrote:
> > So freezing the stdlib helps on UNIX and not on OS X (if my old testing is
> > still accurate). I guess the next question is what it does on Windows and
> > if we would want to ever consider freezing the stdlib as part of the build
> > process (and if we would want to change the order of importers on
> > sys.meta_path so frozen modules came after file-based ones).
>
> I find that being able to easily open stdlib .py files in a text editor
> to read the source is extremely valuable. I've learned much more from
> reading the source than from (e.g.) StackOverflow. Likewise, it's often
> handy to do a grep over the stdlib. When you talk about freezing the
> stdlib, what exactly does that mean?
>
> - will the source files still be there?

Well, Brett said it would be optional, though perhaps the above paragraph is asking about doing it in our Windows build. But the Linux distros might also make use of the option if it exists, so the question is very meaningful. However, you'd have to ask the distro if the source would be shipped in the Linux case, and I'd guess not in most cases.

I don't know about anyone else, but on my own development systems it is not that unusual for me to *edit* the stdlib files (to add debug prints) while debugging my own programs. Freeze would definitely interfere with that. I could, of course, install a separate source build on my dev system, but I thought it worth mentioning as a factor.

On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels.

--David
Re: [Python-Dev] More optimisation ideas
Thanks, Brett. I wasn't aware of lazy imports either. I think that approach is even better at reducing startup time than freezing the stdlib.

On 31.01.2016 18:57, Brett Cannon wrote:
> I have opened http://bugs.python.org/issue26252 to track writing the
> example (and before ppl go playing with the lazy loader, be aware of
> http://bugs.python.org/issue26186).
>
> On Sun, 31 Jan 2016 at 09:26 Brett Cannon wrote:
>> There are no example docs for it yet, but enough people have asked
>> this week about how to set up a custom importer that I will write up
>> a generic example case which will make sense for a lazy loader (need
>> to file the issue before I forget).
>>
>> On Sun, 31 Jan 2016, 09:11 Donald Stufft wrote:
>>> On Jan 31, 2016, at 12:02 PM, Brett Cannon wrote:
>>>> A lazy importer was added in Python 3.5
>>>
>>> Is there any docs on how to actually use the LazyLoader in 3.5? I
>>> can’t seem to find any but I don’t really know the import system
>>> that well.
>>>
>>> -
>>> Donald Stufft
>>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
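Since the question of how to use `LazyLoader` comes up here, a short sketch may help. It follows the pattern later shown in the importlib documentation, and assumes Python 3.5 or newer; `lazy_import` is a helper name chosen for the example, not a stdlib function, and `colorsys` is just a small module picked for the demonstration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so that execution is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module, does not run it yet
    return module


colorsys = lazy_import("colorsys")
# The module body is executed here, on the first attribute lookup.
print(colorsys.rgb_to_hsv(1, 0, 0))
```

The saving is only in startup time: the module still has to be found on disk up front, and it is loaded in full the first time it is touched.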
Re: [Python-Dev] More optimisation ideas
On Feb 1, 2016, at 09:59, mike.romb...@comcast.net wrote: > > If the stdlib were to use implicit namespace packages > ( https://www.python.org/dev/peps/pep-0420/ ) and the various > loaders/importers as well, then python could do what I've done with an > embedded python application for years. Freeze the stdlib (or put it > in a zipfile or whatever is fast). Then arrange PYTHONPATH to first > look on the filesystem and then look in the frozen/ziped storage. This is a great solution for experienced developers, but I think it would be pretty bad for novices or transplants from other languages (maybe even including Python 2). There are already multiple duplicate questions every month on StackOverflow from people asking "how do I find the source to stdlib module X". The canonical answer starts off by explaining how to import the module and use its __file__, which everyone is able to handle. If we have to instead explain how to work out the .py name from the qualified module name, how to work out the stdlib path from sys.path, and then how to find the source from those two things, with the caveat that it may not be installed at all on some platforms, and how to make sure what they're asking about really is a stdlib module, and how to make sure they aren't shadowing it with a module elsewhere on sys.path, that's a lot more complicated. Especially when you consider that some people on Windows and Mac are writing Python scripts without ever learning how to use the terminal or find their Python packages via Explorer/Finder. And meanwhile, other people would be asking why their app runs slower on one machine than another, because they didn't expect that installing python-dev on top of python would slow down startup. Finally, on Linux and Mac, the stdlib will usually be somewhere that's not user-writable--and we shouldn't expect users to have to mess with stuff in /usr/lib or /System/Library even if they do have sudo access. 
Of course we could put a "stdlib shadow" location on the sys.path and configure it for /usr/local/lib and /Library and/or for somewhere in -, but that just makes the lookup procedure even more complicated--not to mention that we've just added three stat calls to remove one open, at which point the optimization has probably become a pessimization.
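To make the "where did this module come from" question concrete, here is a minimal sketch that asks the import system directly instead of relying on `__file__`. The helper name `describe_module` is made up for illustration; `importlib.util.find_spec` and the `origin` / `has_location` attributes of a module spec are standard. For a frozen or built-in module there is no file path to show, which is exactly the situation described above.

```python
import importlib.util


def describe_module(name):
    """Report where the import system would load `name` from (hypothetical helper)."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing on sys.meta_path can find a module with this name.
        return None
    return {
        # A filesystem path for file-based modules, or a marker string
        # such as "built-in" or "frozen" when there is no file.
        "origin": spec.origin,
        # True only when `origin` is a real, loadable location.
        "has_location": spec.has_location,
    }


for module_name in ("json", "sys", "no_such_module_xyz"):
    print(module_name, describe_module(module_name))
```

If `has_location` is false, the usual "import it and look at `__file__`" answer does not apply, and a shadowing module elsewhere on sys.path would show up as an unexpected `origin`.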
Re: [Python-Dev] More optimisation ideas
Barry Warsaw writes:

> On Feb 01, 2016, at 11:40 AM, R. David Murray wrote:
>> I don't know about anyone else, but on my own development systems it is not that unusual for me to *edit* the stdlib files (to add debug prints) while debugging my own programs. Freeze would definitely interfere with that. I could, of course, install a separate source build on my dev system, but I thought it worth mentioning as a factor.
>
> [snip]
>
> But even with system scripts, I do need to step through them occasionally. If it were a matter of changing a shebang or invoking the script with a different Python (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the full unpacked source, that would be fine.

If the stdlib were to use implicit namespace packages ( https://www.python.org/dev/peps/pep-0420/ ) and the various loaders/importers as well, then Python could do what I've done with an embedded Python application for years. Freeze the stdlib (or put it in a zipfile or whatever is fast). Then arrange PYTHONPATH to look first on the filesystem and then in the frozen/zipped storage. Normally the filesystem part is empty, so modules are loaded from the frozen/zip area. But if you want to override one of the frozen modules, simply copy one or more .py files onto the filesystem.

I've been doing this only with modules in the global scope (that is, top-level modules). But implicit namespace packages seem to open the door for this with packages.

Mike
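The arrangement described here (an override directory searched before an archive) can be sketched with nothing but sys.path and the built-in zipimport support. This is an illustration under assumptions, not the embedded application's actual setup: the names `override_dir`, `stdlib_like.zip` and `demo_mod` are invented, and a throwaway module stands in for a stdlib module.

```python
import importlib
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
override_dir = os.path.join(workdir, "override")  # normally left empty
os.mkdir(override_dir)
archive = os.path.join(workdir, "stdlib_like.zip")

# Put a stand-in module into the archive.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_mod.py", "ORIGIN = 'zip'\n")

# Filesystem directory first, archive second.
sys.path[:0] = [override_dir, archive]
importlib.invalidate_caches()

# The override directory is empty, so the archive copy is found.
first_origin = importlib.import_module("demo_mod").ORIGIN

# Drop a .py file into the override directory and import again.
with open(os.path.join(override_dir, "demo_mod.py"), "w") as f:
    f.write("ORIGIN = 'filesystem'\n")
del sys.modules["demo_mod"]
importlib.invalidate_caches()
second_origin = importlib.import_module("demo_mod").ORIGIN

print(first_origin, second_origin)
```

Note that every import still probes the override directory before reaching the archive, which is the cost raised elsewhere in the thread.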
Re: [Python-Dev] More optimisation ideas
On Feb 01, 2016, at 11:40 AM, R. David Murray wrote: >Well, Brett said it would be optional, though perhaps the above >paragraph is asking about doing it in our Windows build. But the linux >distros might make also use the option if it exists, so the question is >very meaningful. However, you'd have to ask the distro if the source >would be shipped in the linux case, and I'd guess not in most cases. It's very likely the .py files would still be shipped, but perhaps in a -dev package that isn't normally installed. >I don't know about anyone else, but on my own development systems it is >not that unusual for me to *edit* the stdlib files (to add debug prints) >while debugging my own programs. Freeze would definitely interfere with >that. I could, of course, install a separate source build on my dev >system, but I thought it worth mentioning as a factor. I do this too, though usually in a VM or chroot and not in my live system. A very common situation for me though is pdb stepping through my own code and landing in -or passing through- stdlib. >On the other hand, if the distros go the way Nick has (I think) been >advocating, and have a separate 'system python for system scripts' that >is independent of the one installed for user use, having the system-only >python be frozen and sourceless would actually make sense on a couple of >levels. Yep, we've talked about it in Debian-land too, but never quite gotten around to doing anything. Certainly I'd like to see some consistency among Linux distros there (i.e. discussed on linux-sig@). But even with system scripts, I do need to step through them occasionally. If it were a matter of changing a shebang or invoking the script with a different Python (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the full unpacked source, that would be fine. 
Cheers, -Barry
Re: [Python-Dev] More optimisation ideas
On Feb 01 2016, mike.romb...@comcast.net wrote:
> Barry Warsaw writes:
>
>> On Feb 01, 2016, at 11:40 AM, R. David Murray wrote:
>>> I don't know about anyone else, but on my own development systems it is not that unusual for me to *edit* the stdlib files (to add debug prints) while debugging my own programs. Freeze would definitely interfere with that. I could, of course, install a separate source build on my dev system, but I thought it worth mentioning as a factor.
>>
>> [snip]
>>
>> But even with system scripts, I do need to step through them occasionally. If it were a matter of changing a shebang or invoking the script with a different Python (e.g. /usr/bin/python3s vs. /usr/bin/python3) to get the full unpacked source, that would be fine.
>
> If the stdlib were to use implicit namespace packages ( https://www.python.org/dev/peps/pep-0420/ ) and the various loaders/importers as well, then python could do what I've done with an embedded python application for years. Freeze the stdlib (or put it in a zipfile or whatever is fast). Then arrange PYTHONPATH to first look on the filesystem and then look in the frozen/zipped storage.

Presumably that would eliminate the performance advantages of the frozen/zipped storage, because Python would still have to issue all the stat calls to check for the existence of a .py file first.

Best, -Nikolaus

(No Cc on replies please, I'm reading the list)
--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] More optimisation ideas
On Feb 1, 2016, at 19:44, Terry Reedywrote: > >> On 2/1/2016 3:39 PM, Andrew Barnert via Python-Dev wrote: >> >> There are already multiple duplicate questions every month on >> StackOverflow from people asking "how do I find the source to stdlib >> module X". The canonical answer starts off by explaining how to >> import the module and use its __file__, which everyone is able to >> handle. > > Perhaps even easier: start IDLE, hit Alt-M, type in module name as one would > import it, click OK. If Python source is available, IDLE will open in an > editor window. with the path on the title bar. > >> If we have to instead explain how to work out the .py name >> from the qualified module name, how to work out the stdlib path from >> sys.path, and then how to find the source from those two things, with >> the caveat that it may not be installed at all on some platforms, and >> how to make sure what they're asking about really is a stdlib module, >> and how to make sure they aren't shadowing it with a module elsewhere >> on sys.path, that's a lot more complicated. > > The windows has the path on the title bar, so one can tell what was loaded. The point of this thread is the suggestion that the stdlib modules be frozen or stored in a zipfile, unless a user modifies things in some way to make the source accessible. So, if a user hasn't done that (which no novice will know how to do), there won't be a path to show in the title bar, so IDLE won't be any more help than the command line. (I suppose IDLE could grow a new feature to look up "associated source files" for a zipped stdlib or something, but that seems like a pretty big new feature.) > IDLE currently uses imp.find_module (this could be updated), with a backup of > __import__(...).__file__, so it will load non-stdlib files that can be > imported. > > > Finally, on Linux and Mac, the stdlib will usually be somewhere > > that's not user-writable > > On Windows, this depends on the install location. 
> Perhaps there should be an option for edit-save or view only to avoid accidental changes.

The problem is that, if the standard way for users to see stdlib sources is to copy them from somewhere else (like $install/src/Lib) into a stdlib directory (like $install/Lib), then that stdlib directory has to be writable--and on Mac and Linux, it's not.
Re: [Python-Dev] More optimisation ideas
On 2/1/2016 3:39 PM, Andrew Barnert via Python-Dev wrote:
> There are already multiple duplicate questions every month on StackOverflow from people asking "how do I find the source to stdlib module X". The canonical answer starts off by explaining how to import the module and use its __file__, which everyone is able to handle.

Perhaps even easier: start IDLE, hit Alt-M, type in the module name as one would import it, click OK. If Python source is available, IDLE will open it in an editor window, with the path on the title bar.

> If we have to instead explain how to work out the .py name from the qualified module name, how to work out the stdlib path from sys.path, and then how to find the source from those two things, with the caveat that it may not be installed at all on some platforms, and how to make sure what they're asking about really is a stdlib module, and how to make sure they aren't shadowing it with a module elsewhere on sys.path, that's a lot more complicated.

The window has the path on the title bar, so one can tell what was loaded.

IDLE currently uses imp.find_module (this could be updated), with a backup of __import__(...).__file__, so it will load non-stdlib files that can be imported.

> Finally, on Linux and Mac, the stdlib will usually be somewhere that's not user-writable

On Windows, this depends on the install location. Perhaps there should be an option for edit-save or view only to avoid accidental changes.

--
Terry Jan Reedy
Re: [Python-Dev] More optimisation ideas
On Mon, 1 Feb 2016 at 08:48 R. David Murraywrote: > On Mon, 01 Feb 2016 14:12:27 +1100, Steven D'Aprano > wrote: > > On Sun, Jan 31, 2016 at 08:23:00PM +, Brett Cannon wrote: > > > So freezing the stdlib helps on UNIX and not on OS X (if my old > testing is > > > still accurate). I guess the next question is what it does on Windows > and > > > if we would want to ever consider freezing the stdlib as part of the > build > > > process (and if we would want to change the order of importers on > > > sys.meta_path so frozen modules came after file-based ones). > > > > I find that being able to easily open stdlib .py files in a text editor > > to read the source is extremely valuable. I've learned much more from > > reading the source than from (e.g.) StackOverflow. Likewise, it's often > > handy to do a grep over the stdlib. When you talk about freezing the > > stdlib, what exactly does that mean? > > > > - will the source files still be there? > > Well, Brett said it would be optional, though perhaps the above > paragraph is asking about doing it in our Windows build. Nope, it would probably need to be across all OSs to have consistent semantics. > But the linux > distros might make also use the option if it exists, so the question is > very meaningful. However, you'd have to ask the distro if the source > would be shipped in the linux case, and I'd guess not in most cases. > > I don't know about anyone else, but on my own development systems it is > not that unusual for me to *edit* the stdlib files (to add debug prints) > while debugging my own programs. Freeze would definitely interfere with > that. I could, of course, install a separate source build on my dev > system, but I thought it worth mentioning as a factor. > This is what would need to be discussed in terms of how to handle this. 
For instance, we already do stuff in (I believe) site.py when we detect the build is in a checkout, so we could in that instance make sure the stdlib file directory takes precedence over any frozen code (hence why I wondered if the frozen importer on sys.meta_path should come after the sys.path importer). If we did that then we could make installing the stdlib files optional but still have them take precedence. It's all workable, it's just a question of if we want to. This is why I think we should get concrete benchmark numbers on Windows, Linux, and OS X to see if this is even worth considering as something we provide in our own binaries.

> On the other hand, if the distros go the way Nick has (I think) been advocating, and have a separate 'system python for system scripts' that is independent of the one installed for user use, having the system-only python be frozen and sourceless would actually make sense on a couple of levels.

It at least wouldn't hurt anything.
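For readers unfamiliar with the importer ordering being discussed, the following sketch only inspects the current order on sys.meta_path; it does not change it. It assumes a stock CPython where the default finders are still registered. The variable names are illustrative.

```python
import sys
from importlib.machinery import FrozenImporter, PathFinder

# Names of the finders in the order the import system consults them.
finder_names = [getattr(f, "__name__", type(f).__name__) for f in sys.meta_path]

# Position of the frozen-module finder versus the sys.path-based finder.
frozen_index = sys.meta_path.index(FrozenImporter)
path_index = sys.meta_path.index(PathFinder)
frozen_first = frozen_index < path_index

print(finder_names)
print("frozen modules are tried before sys.path:", frozen_first)
```

With the default order, a frozen copy of a module wins over a .py file of the same name on sys.path, which is why swapping the two finders comes up as a way to let on-disk stdlib files take precedence.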
Re: [Python-Dev] More optimisation ideas
So freezing the stdlib helps on UNIX and not on OS X (if my old testing is still accurate). I guess the next question is what it does on Windows and if we would want to ever consider freezing the stdlib as part of the build process (and if we would want to change the order of importers on sys.meta_path so frozen modules came after file-based ones). On Sun, 31 Jan 2016, 10:43 M.-A. Lemburg <m...@egenix.com> wrote: > On 30.01.2016 20:15, Steve Dower wrote: > > Brett tried freezing the entire stdlib at one point (as we do for parts > of importlib) and reported no significant improvement. Since that rules out > code compilation as well as the OS calls, it'd seem the priority is to > execute less code on startup. > > > > Details of that work were posted to python-dev about twelve months ago, > IIRC. Maybe a little longer. > > Freezing the entire stdlib does improve the startup time, > simply because it removes stat calls, which dominate the startup > time at least on Unix. > > It also allows sharing the stdlib byte code in memory, since it gets > stored in static C structs which the OS will happily mmap into > multiple processes for you without any additional effort. > > Our eGenix PyRun does exactly that. Even though the original > motivation is a different one, the gained improvement in > startup time is a nice side effect: > > http://www.egenix.com/products/python/PyRun/ > > Aside: The encodings don't really make much difference here. The > dictionaries aren't all that big, so generating them on the fly doesn't > really create much overhead. The trade off in terms of > maintainability/speed > definitely leans toward maintainability. For the larger encoding > tables we already have C implementations with appropriate data > structures to make lookup speed vs. storage needs efficient. > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Experts (#1, Jan 31 2016) > >>> Python Projects, Coaching and Consulting ... 
http://www.egenix.com/ > >>> Python Database Interfaces ... http://products.egenix.com/ > >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ > > > ::: We implement business ideas - efficiently in both time and costs ::: > >eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg >Registered at Amtsgericht Duesseldorf: HRB 46611 >http://www.egenix.com/company/contact/ > http://www.malemburg.com/ > > > > Top-posted from my Windows Phone > > > > -Original Message- > > From: "Serhiy Storchaka" <storch...@gmail.com> > > Sent: 1/30/2016 10:22 > > To: "python-dev@python.org" <python-dev@python.org> > > Subject: Re: [Python-Dev] More optimisation ideas > > > > On 30.01.16 18:31, Steve Dower wrote: > >> On 30Jan2016 0645, Serhiy Storchaka wrote: > >>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import > >>> decoding_table" -- "codecs.charmap_build(decoding_table)" > >>> 10 loops, best of 3: 4.36 usec per loop > >>> > >>> Getting rid from charmap_build() would save you at most 4.4 > microseconds > >>> per encoding. 0.0005 seconds if you have imported *all* standard > >>> encodings! > >> > >> Just as happy to be proven wrong. Perhaps I misinterpreted my original > >> profiling and then, embarrassingly, ran with the result for a long time > >> without retesting. > > > > AFAIK the most time is spent in system calls like stat or open. > > Archiving the stdlib into the ZIP file and using zipimport can decrease > > Python startup time (perhaps there is an open issue about this). 
Re: [Python-Dev] More optimisation ideas
On 1/31/2016 12:09 PM, Antoine Pitrou wrote:
> The following documentation leaves me absolutely clueless:
>
> """This class only works with loaders that define exec_module() as control over what module type is used for the module is required.

No wonder. I cannot parse it as an English sentence. It needs rewriting.

> For those same reasons, the loader’s create_module() method will be ignored (i.e., the loader’s method should only return None). Finally, modules which substitute the object placed into sys.modules will not work as there is no way to properly replace the module references throughout the interpreter safely; ValueError is raised if such a substitution is detected."""
>
> (reference: https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)

--
Terry Jan Reedy
Re: [Python-Dev] More optimisation ideas
I have opened http://bugs.python.org/issue26252 to track writing the example (and before ppl go playing with the lazy loader, be aware of http://bugs.python.org/issue26186).

On Sun, 31 Jan 2016 at 09:26 Brett Cannon wrote:
> There are no example docs for it yet, but enough people have asked this week about how to set up a custom importer that I will write up a generic example case which will make sense for a lazy loader (need to file the issue before I forget).
>
> On Sun, 31 Jan 2016, 09:11 Donald Stufft wrote:
>> On Jan 31, 2016, at 12:02 PM, Brett Cannon wrote:
>>> A lazy importer was added in Python 3.5
>>
>> Is there any docs on how to actually use the LazyLoader in 3.5? I can’t seem to find any but I don’t really know the import system that well.
>>
>> - Donald Stufft
>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Python-Dev] More optimisation ideas
> On Jan 31, 2016, at 12:02 PM, Brett Cannon wrote:
>
> A lazy importer was added in Python 3.5

Is there any docs on how to actually use the LazyLoader in 3.5? I can’t seem to find any but I don’t really know the import system that well.

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Python-Dev] More optimisation ideas
Brett Cannon writes:
> A lazy importer was added in Python 3.5 and it was not possible without the module spec refactoring.

Wow... Thank you, I didn't know about that. Now for the next question: how am I supposed to use it? The following documentation leaves me absolutely clueless:

"""This class only works with loaders that define exec_module() as control over what module type is used for the module is required. For those same reasons, the loader’s create_module() method will be ignored (i.e., the loader’s method should only return None). Finally, modules which substitute the object placed into sys.modules will not work as there is no way to properly replace the module references throughout the interpreter safely; ValueError is raised if such a substitution is detected."""

(reference: https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)

I want to import lazily the modules from package "foobar.*", but not other modules, as other libraries may depend on import side effects. How do I do that? The quoted snippet doesn't really help.

Regards

Antoine.
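One way to approach this question, sketched here along the lines of the lazy-import recipe found in the importlib documentation, is to opt individual modules in rather than changing the import system globally. The helper name `lazy_import` is illustrative. `importlib.util.find_spec`, `LazyLoader` and `module_from_spec` are real APIs available from Python 3.5. Modules imported normally elsewhere are unaffected, so their import side effects still run eagerly.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so execution is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arranges for deferred execution.
    loader.exec_module(module)
    return module


# Usage: the module body is executed only when an attribute is first touched.
colorsys = lazy_import("colorsys")
print(colorsys.rgb_to_yiq(0, 0, 0))
```

Applying this to everything under a package such as "foobar.*" would need a custom finder on sys.meta_path that wraps loaders only for matching names. That is not shown here.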
Re: [Python-Dev] More optimisation ideas
There are no example docs for it yet, but enough people have asked this week about how to set up a custom importer that I will write up a generic example case which will make sense for a lazy loader (need to file the issue before I forget).

On Sun, 31 Jan 2016, 09:11 Donald Stufft wrote:
> On Jan 31, 2016, at 12:02 PM, Brett Cannon wrote:
>> A lazy importer was added in Python 3.5
>
> Is there any docs on how to actually use the LazyLoader in 3.5? I can’t seem to find any but I don’t really know the import system that well.
>
> - Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Python-Dev] More optimisation ideas
On 30.01.2016 20:15, Steve Dower wrote: > Brett tried freezing the entire stdlib at one point (as we do for parts of > importlib) and reported no significant improvement. Since that rules out code > compilation as well as the OS calls, it'd seem the priority is to execute > less code on startup. > > Details of that work were posted to python-dev about twelve months ago, IIRC. > Maybe a little longer. Freezing the entire stdlib does improve the startup time, simply because it removes stat calls, which dominate the startup time at least on Unix. It also allows sharing the stdlib byte code in memory, since it gets stored in static C structs which the OS will happily mmap into multiple processes for you without any additional effort. Our eGenix PyRun does exactly that. Even though the original motivation is a different one, the gained improvement in startup time is a nice side effect: http://www.egenix.com/products/python/PyRun/ Aside: The encodings don't really make much difference here. The dictionaries aren't all that big, so generating them on the fly doesn't really create much overhead. The trade off in terms of maintainability/speed definitely leans toward maintainability. For the larger encoding tables we already have C implementations with appropriate data structures to make lookup speed vs. storage needs efficient. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 31 2016) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> Python Database Interfaces ... http://products.egenix.com/ >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ ::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/ > Top-posted from my Windows Phone > > -Original Message- > From: "Serhiy Storchaka" <storch...@gmail.com> > Sent: 1/30/2016 10:22 > To: "python-dev@python.org" <python-dev@python.org> > Subject: Re: [Python-Dev] More optimisation ideas > > On 30.01.16 18:31, Steve Dower wrote: >> On 30Jan2016 0645, Serhiy Storchaka wrote: >>> $ ./python -m timeit -s "import codecs; from encodings.cp437 import >>> decoding_table" -- "codecs.charmap_build(decoding_table)" >>> 10 loops, best of 3: 4.36 usec per loop >>> >>> Getting rid from charmap_build() would save you at most 4.4 microseconds >>> per encoding. 0.0005 seconds if you have imported *all* standard >>> encodings! >> >> Just as happy to be proven wrong. Perhaps I misinterpreted my original >> profiling and then, embarrassingly, ran with the result for a long time >> without retesting. > > AFAIK the most time is spent in system calls like stat or open. > Archiving the stdlib into the ZIP file and using zipimport can decrease > Python startup time (perhaps there is an open issue about this). > > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org > > > > ___ > Python-Dev mailing list > Python-Dev@python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/mal%40egenix.com > ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
Hi,

If you want to make startup time faster for a broad range of applications, please consider adding a lazy import facility in the stdlib. I recently tried to write a lazy import mechanism using import hooks (to make it portable from 2.6 to 3.5), and it seems nearly impossible to do so (or, at least, for an average Python programmer like me).

This would be much more useful (for actual users, not for architecture astronauts) than refactoring the importlib APIs in each feature version...

Thanks in advance

Antoine.
Re: [Python-Dev] More optimisation ideas
A lazy importer was added in Python 3.5 and it was not possible without the module spec refactoring.

On Sun, 31 Jan 2016, 08:57 Antoine Pitrou wrote:
> Hi,
>
> If you want to make startup time faster for a broad range of applications, please consider adding a lazy import facility in the stdlib. I recently tried to write a lazy import mechanism using import hooks (to make it portable from 2.6 to 3.5), it seems nearly impossible to do so (or, at least, for an average Python programmer like me).
>
> This would be much more useful (for actual users, not for architecture astronauts) than refactoring the importlib APIs in each feature version...
>
> Thanks in advance
>
> Antoine.
Re: [Python-Dev] More optimisation ideas
On Sun, 31 Jan 2016, 15:36 Terry Reedy wrote:
> On 1/31/2016 12:09 PM, Antoine Pitrou wrote:
>> The following documentation leaves me absolutely clueless:
>>
>> """This class only works with loaders that define exec_module() as control over what module type is used for the module is required.
>
> No wonder. I cannot parse it as an English sentence. It needs rewriting.

Feel free to open an issue to clarify the wording.

-Brett

>> For those same reasons, the loader’s create_module() method will be ignored (i.e., the loader’s method should only return None). Finally, modules which substitute the object placed into sys.modules will not work as there is no way to properly replace the module references throughout the interpreter safely; ValueError is raised if such a substitution is detected."""
>>
>> (reference: https://docs.python.org/3/library/importlib.html#importlib.util.LazyLoader)
>
> --
> Terry Jan Reedy
Re: [Python-Dev] More optimisation ideas
On Sun, Jan 31, 2016 at 08:23:00PM, Brett Cannon wrote:
> So freezing the stdlib helps on UNIX and not on OS X (if my old testing is still accurate). I guess the next question is what it does on Windows and if we would want to ever consider freezing the stdlib as part of the build process (and if we would want to change the order of importers on sys.meta_path so frozen modules came after file-based ones).

I find that being able to easily open stdlib .py files in a text editor to read the source is extremely valuable. I've learned much more from reading the source than from (e.g.) StackOverflow. Likewise, it's often handy to do a grep over the stdlib. When you talk about freezing the stdlib, what exactly does that mean?

- will the source files still be there?
- how will this affect people writing patches for bugs?

--
Steve
Re: [Python-Dev] More optimisation ideas
On Sat, Jan 30, 2016, 12:30 Sven R. Kunze wrote:
> On 30.01.2016 19:20, Serhiy Storchaka wrote:
>> AFAIK the most time is spent in system calls like stat or open. Archiving the stdlib into the ZIP file and using zipimport can decrease Python startup time (perhaps there is an open issue about this).
>
> Oh, please don't. One thing I love about Python is the ease of access.

It wouldn't be a requirement, just an option.

> I personally think that startup time is not really a big issue; even when it comes to microbenchmarks.

You might not, but just about every command-line app does.

-brett

> Best,
> Sven
Re: [Python-Dev] More optimisation ideas
Brett tried freezing the entire stdlib at one point (as we do for parts of importlib) and reported no significant improvement. Since that rules out code compilation as well as the OS calls, it'd seem the priority is to execute less code on startup. Details of that work were posted to python-dev about twelve months ago, IIRC. Maybe a little longer. Top-posted from my Windows Phone -Original Message- From: "Serhiy Storchaka" <storch...@gmail.com> Sent: 1/30/2016 10:22 To: "python-dev@python.org" <python-dev@python.org> Subject: Re: [Python-Dev] More optimisation ideas On 30.01.16 18:31, Steve Dower wrote: > On 30Jan2016 0645, Serhiy Storchaka wrote: >> $ ./python -m timeit -s "import codecs; from encodings.cp437 import >> decoding_table" -- "codecs.charmap_build(decoding_table)" >> 10 loops, best of 3: 4.36 usec per loop >> >> Getting rid from charmap_build() would save you at most 4.4 microseconds >> per encoding. 0.0005 seconds if you have imported *all* standard >> encodings! > > Just as happy to be proven wrong. Perhaps I misinterpreted my original > profiling and then, embarrassingly, ran with the result for a long time > without retesting. AFAIK the most time is spent in system calls like stat or open. Archiving the stdlib into the ZIP file and using zipimport can decrease Python startup time (perhaps there is an open issue about this). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40python.org ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] More optimisation ideas
On 30Jan2016 0645, Serhiy Storchaka wrote:
> $ ./python -m timeit -s "import codecs; from encodings.cp437 import decoding_table" -- "codecs.charmap_build(decoding_table)"
> 10 loops, best of 3: 4.36 usec per loop
>
> Getting rid of charmap_build() would save you at most 4.4 microseconds per encoding. 0.0005 seconds if you have imported *all* standard encodings!

Just as happy to be proven wrong. Perhaps I misinterpreted my original profiling and then, embarrassingly, ran with the result for a long time without retesting.

> And how did you expect to store encoding_table in a more efficient way?

There's nothing inefficient about its storage, but as it does not change it would be trivial to store it statically. Then "building" the map is simply obtaining a pointer into an already loaded memory page. Much faster than building it on load, but both are clearly insignificant compared to other factors.

Cheers,
Steve
Re: [Python-Dev] More optimisation ideas
On 30.01.16 18:31, Steve Dower wrote:
> On 30Jan2016 0645, Serhiy Storchaka wrote:
> > $ ./python -m timeit -s "import codecs; from encodings.cp437 import decoding_table" -- "codecs.charmap_build(decoding_table)"
> > 10 loops, best of 3: 4.36 usec per loop
> >
> > Getting rid of charmap_build() would save you at most 4.4 microseconds per encoding. 0.0005 seconds if you have imported *all* standard encodings!
>
> Just as happy to be proven wrong. Perhaps I misinterpreted my original profiling and then, embarrassingly, ran with the result for a long time without retesting.

AFAIK the most time is spent in system calls like stat or open. Archiving the stdlib into the ZIP file and using zipimport can decrease Python startup time (perhaps there is an open issue about this).
Re: [Python-Dev] More optimisation ideas
On Sat, 30 Jan 2016 at 10:21 Serhiy Storchaka wrote:

> On 30.01.16 18:31, Steve Dower wrote:
> > On 30Jan2016 0645, Serhiy Storchaka wrote:
> >> $ ./python -m timeit -s "import codecs; from encodings.cp437 import
> >> decoding_table" -- "codecs.charmap_build(decoding_table)"
> >> 10 loops, best of 3: 4.36 usec per loop
> >>
> >> Getting rid of charmap_build() would save you at most 4.4 microseconds
> >> per encoding. 0.0005 seconds if you have imported *all* standard
> >> encodings!
> >
> > Just as happy to be proven wrong. Perhaps I misinterpreted my original
> > profiling and then, embarrassingly, ran with the result for a long time
> > without retesting.
>
> AFAIK the most time is spent in system calls like stat or open.
> Archiving the stdlib into the ZIP file and using zipimport can decrease
> Python startup time (perhaps there is an open issue about this).

Check the archives, but I did try freezing the entire stdlib and it didn't really make a difference in startup, so I don't know if this still holds true.

At this point I think all of our knowledge of what takes the most time during startup is outdated, and someone should really profile the whole thing to see where the hotspots are (e.g., is it stat calls from imports, is it actually some specific function, is it just so many little things adding up to a big thing, etc.).
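As a very coarse first step towards the profiling asked for above, one could time a bare interpreter launch with and without the site module. This is only a sketch, not what anyone in the thread ran: the helper name startup_seconds and the choice of five runs are assumptions made for illustration, while -S is the standard CPython flag that skips importing site.

```python
import subprocess
import sys
import time


def startup_seconds(extra_flags=(), runs=5):
    """Best wall-clock time of `python <flags> -c pass` over several runs.

    Hypothetical helper for illustration; it measures process launch plus
    interpreter startup and shutdown, not any single import.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, *extra_flags, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


with_site = startup_seconds()
without_site = startup_seconds(["-S"])  # -S: do not import the site module
print(f"default startup: {with_site * 1000:.1f} ms")
print(f"with -S:         {without_site * 1000:.1f} ms")
```

The difference between the two numbers gives only an upper bound on what site and its dependencies cost; finding the hotspots inside that would still need a real profiler.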
Re: [Python-Dev] More optimisation ideas
On 30 January 2016 at 03:48, Steve Dower wrote:
>
> It doesn't currently end up on disk. Some tables are partially or completely
> stored on disk as Python source code (some are partially generated from
> simple rules), but others are generated by inverting those. That process
> takes time that could be avoided by storing the generated tables, and
> storing all of it in a format that doesn't require parsing, compiling and
> executing (such as a native array).
>
> Potentially it could be a win all around if we stopped including the
> (larger) source files, but that doesn't seem like a good idea for
> maintaining portability to other implementations. The main thought is making
> the compiler binary bigger to avoid generating encoding tables at startup.

When I last tried to profile startup on Windows (I haven't used Windows for some time now) it seemed that the time was totally dominated by file system access. Essentially the limiting factor was the inordinate number of stat calls and small file accesses. Although this was probably Python 2.x, which may not import those particular modules, and maybe it depends on virus scanner software etc.

Things may have changed now but I concluded that substantive gains could only come from improving FS access. Perhaps something like zipping up the standard library would see a big improvement.

--
Oscar
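For readers unfamiliar with the mechanism behind "zipping up the standard library": Python can import modules straight out of a ZIP archive placed on sys.path. The sketch below demonstrates that with a throwaway module; the names zipped_demo_module and demo_lib.zip are hypothetical, and it says nothing about whether doing this for the real stdlib would actually be faster.

```python
import importlib
import os
import sys
import tempfile
import zipfile

MODULE_NAME = "zipped_demo_module"  # hypothetical module, created only for this demo


def import_from_zip():
    """Build a one-module ZIP archive, import from it, and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "demo_lib.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr(
                MODULE_NAME + ".py",
                "def greet():\n    return 'hello from a zip'\n",
            )
        # A ZIP file on sys.path is handled by the built-in zipimport machinery.
        sys.path.insert(0, archive)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(MODULE_NAME)
            return module.greet(), module.__file__.startswith(archive)
        finally:
            sys.path.remove(archive)
            sys.modules.pop(MODULE_NAME, None)


greeting, loaded_from_archive = import_from_zip()
print(greeting, loaded_from_archive)
```

The point of the archive is that the importer reads one file's directory listing instead of issuing a stat call per candidate path, which is the cost Oscar describes.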
Re: [Python-Dev] More optimisation ideas
On 29.01.16 19:05, Steve Dower wrote:
> This is probably the code snippet that bothered me the most:
>
>     ### Encoding table
>     encoding_table=codecs.charmap_build(decoding_table)
>
> It shows up in many of the encodings modules, and while it is not a bad
> function in itself, we are obviously generating a known data structure
> on every startup. Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

$ ./python -m timeit -s "import codecs; from encodings.cp437 import decoding_table" -- "codecs.charmap_build(decoding_table)"
10 loops, best of 3: 4.36 usec per loop

Getting rid of charmap_build() would save you at most 4.4 microseconds per encoding. 0.0005 seconds if you have imported *all* standard encodings!

And how did you expect to store encoding_table in a more efficient way?
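The command-line measurement above can also be reproduced from inside Python with the timeit module. This is a minimal sketch; the helper name time_charmap_build and the default of 10000 iterations are assumptions for illustration, and the absolute figure will differ from machine to machine.

```python
import codecs
import timeit
from encodings.cp437 import decoding_table


def time_charmap_build(number=10000):
    """Return the average time in seconds of one charmap_build() call."""
    total = timeit.timeit(
        lambda: codecs.charmap_build(decoding_table), number=number
    )
    return total / number


per_call = time_charmap_build()
print(f"{per_call * 1e6:.2f} usec per charmap_build() call")
```

Multiplying the per-call figure by the number of charmap-based encodings actually imported gives the kind of upper bound quoted in the message.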
Re: [Python-Dev] More optimisation ideas
On 30.01.2016 19:20, Serhiy Storchaka wrote:
> AFAIK the most time is spent in system calls like stat or open.
> Archiving the stdlib into the ZIP file and using zipimport can
> decrease Python startup time (perhaps there is an open issue about this).

Oh, please don't. One thing I love about Python is the ease of access.

I personally think that startup time is not really a big issue; even when it comes to microbenchmarks.

Best,
Sven
Re: [Python-Dev] More optimisation ideas
On 30.01.2016 21:32, Brett Cannon wrote:
> On Sat, Jan 30, 2016, 12:30 Sven R. Kunze wrote:
> > On 30.01.2016 19:20, Serhiy Storchaka wrote:
> > > AFAIK the most time is spent in system calls like stat or open.
> > > Archiving the stdlib into the ZIP file and using zipimport can
> > > decrease Python startup time (perhaps there is an open issue about this).
> >
> > Oh, please don't. One thing I love about Python is the ease of access.
>
> It wouldn't be a requirement, just a notion

That's good. :)
Re: [Python-Dev] More optimisation ideas
Hi,

> Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

Is it really an important trade-off? As far as I understand from your email, those modules are always being loaded and the final data created. Won't the space be used anyway (in memory or on disk)?

Thanks in advance!
francis
Re: [Python-Dev] More optimisation ideas
It doesn't currently end up on disk. Some tables are partially or completely stored on disk as Python source code (some are partially generated from simple rules), but others are generated by inverting those. That process takes time that could be avoided by storing the generated tables, and storing all of it in a format that doesn't require parsing, compiling and executing (such as a native array).

Potentially it could be a win all around if we stopped including the (larger) source files, but that doesn't seem like a good idea for maintaining portability to other implementations. The main thought is making the compiler binary bigger to avoid generating encoding tables at startup.

Top-posted from my Windows Phone

-----Original Message-----
From: "francismb" <franci...@email.de>
Sent: 1/29/2016 13:56
To: "python-dev@python.org" <python-dev@python.org>
Subject: Re: [Python-Dev] More optimisation ideas

Hi,

> Storing these in static data is a tradeoff between
> disk space and startup performance, and one I think is likely to be
> worthwhile.

Is it really an important trade-off? As far as I understand from your email, those modules are always being loaded and the final data created. Won't the space be used anyway (in memory or on disk)?

Thanks in advance!
francis
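To make "generated by inverting those" concrete: in a charmap codec module such as encodings.cp437, the decoding table is stored as source and the encoding table is derived from it at import time. The sketch below rebuilds that derived table and round-trips one character through it. The variable names are chosen for illustration; codecs.charmap_build, codecs.charmap_encode and codecs.charmap_decode are the existing helpers the encodings modules themselves use.

```python
import codecs
from encodings.cp437 import decoding_table

# decoding_table maps byte value -> character and is stored as Python source.
# charmap_build() inverts it into a character -> byte lookup at import time.
encoding_table = codecs.charmap_build(decoding_table)

# Round-trip one non-ASCII character through the two tables.
encoded, consumed = codecs.charmap_encode("é", "strict", encoding_table)
decoded, _ = codecs.charmap_decode(encoded, "strict", decoding_table)

print(encoded, consumed, decoded)
```

The inverted table carries no information that is not already in decoding_table, which is why it could in principle be precomputed and shipped instead of being rebuilt on every import.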
[Python-Dev] More optimisation ideas
Since we're all talking about making Python faster, I thought I'd drop some previous ideas I've had here in case (1) someone wants to actually do them, and (2) they really are new ideas that haven't failed in the past. Mostly I was thinking about startup time.

Here is the list of modules imported on clean startup on my Windows, US-English machine (from -v and cleaned up a bit):

    import _frozen_importlib
    import _imp
    import sys
    import '_warnings'
    import '_thread'
    import '_weakref'
    import '_frozen_importlib_external'
    import '_io'
    import 'marshal'
    import 'nt'
    import '_thread'
    import '_weakref'
    import 'winreg'
    import 'zipimport'
    import '_codecs'
    import 'codecs'
    import 'encodings.aliases'
    import 'encodings'
    import 'encodings.mbcs'
    import '_signal'
    import 'encodings.utf_8'
    import 'encodings.latin_1'
    import '_weakrefset'
    import 'abc'
    import 'io'
    import 'encodings.cp437'
    import 'errno'
    import '_stat'
    import 'stat'
    import 'genericpath'
    import 'ntpath'
    import '_collections_abc'
    import 'os'
    import '_sitebuiltins'
    import 'sysconfig'
    import '_locale'
    import '_bootlocale'
    import 'encodings.cp1252'
    import 'site'

Obviously the easiest first thing is to remove or delay unnecessary imports. But a while ago I used a native profiler to trace through this and the most impactful modules were the encodings:

    import 'encodings.mbcs'
    import 'encodings.utf_8'
    import 'encodings.latin_1'
    import 'encodings.cp437'
    import 'encodings.cp1252'

While I don't doubt that we need all of these for *some* reason, aliases, cp437 and cp1252 are relatively expensive modules to import, mostly due to having large static dictionaries or data structures generated on startup.

Given this is static and mostly read-only information[1], I see no reason why we couldn't either generate completely static versions of them, or better yet compile the resulting data structures into the core binary.
([1]: If being able to write to some of the encoding data is used by some people, I vote for breaking that for 3.6 and making it read-only.)

This is probably the code snippet that bothered me the most:

    ### Encoding table
    encoding_table=codecs.charmap_build(decoding_table)

It shows up in many of the encodings modules, and while it is not a bad function in itself, we are obviously generating a known data structure on every startup. Storing these in static data is a tradeoff between disk space and startup performance, and one I think is likely to be worthwhile.

Anyway, just an idea if someone wants to try it and see what improvements we can get. I'd love to do it myself, but when it actually comes to finding time I keep coming up short.

Cheers,
Steve

P.S. If you just want to discuss optimisation techniques or benchmarking in general, without specific application to CPython 3.6, there's a whole internet out there. Please don't make me the cause of a pointless centithread. :)
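The module list in this message came from running the interpreter with -v. A rough way to get a comparable list on another machine, sketched here under the assumption that inspecting sys.modules in a freshly started child interpreter is close enough, is shown below; the helper name modules_loaded_at_startup is hypothetical, and the result will vary by platform, locale and Python version.

```python
import subprocess
import sys


def modules_loaded_at_startup():
    """Return the sorted module names present in a fresh interpreter."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sorted(sys.modules)))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


startup_modules = modules_loaded_at_startup()
encoding_modules = [name for name in startup_modules if name.startswith("encodings")]

print(f"{len(startup_modules)} modules loaded at startup")
print("encodings modules:", ", ".join(encoding_modules))
```

Unlike -v, this shows only which modules ended up loaded, not the order in which they were imported.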