Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sun, Jul 15, 2012 at 1:36 PM, Antoine Pitrou solip...@pitrou.net wrote: On Sun, 15 Jul 2012 18:47:38 +1000 Nick Coghlan ncogh...@gmail.com wrote: I'm not seeing the value in returning None over 0 for the "don't know" case - it just makes the API harder to use. The point is that 0 is a legitimate value for a length hint. Simple implementations of __length_hint__ will start returning 0 as a legitimate value and you will wrongly interpret that as "don't know", which kind of defeats the purpose of __length_hint__ ;) I agree with this: giving special meaning to what's already a valid length value seems wrong. Mark ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
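To make the ambiguity concrete, here is a consumer-side sketch (the helper and class names are mine; PEP 424 as eventually accepted resolves this by letting __length_hint__ return NotImplemented for the "don't know" case, so 0 keeps its normal meaning):

```python
# Hypothetical consumer of __length_hint__. With NotImplemented as the
# "don't know" signal, 0 stays available as a real, legitimate hint.
def length_hint(obj, default=0):
    """Return an estimated length for obj, or `default` if unknown."""
    try:
        return len(obj)
    except TypeError:
        pass
    try:
        hint = obj.__length_hint__()
    except AttributeError:
        return default
    if hint is NotImplemented:
        return default
    if not isinstance(hint, int) or hint < 0:
        raise TypeError("__length_hint__ must return a non-negative int")
    return hint

class Empty:
    """An iterator that knows it is exhausted: 0 is a *real* answer."""
    def __length_hint__(self):
        return 0

class Unsized:
    """An iterator with no idea of its length."""
    def __length_hint__(self):
        return NotImplemented

print(length_hint(Empty()))       # 0 -- a legitimate hint
print(length_hint(Unsized(), 8))  # 8 -- the caller-supplied default
```

If 0 itself were the sentinel, the two cases above would be indistinguishable to the caller, which is exactly Antoine's objection.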
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Tue, 17 Jul 2012 13:19:55 +1000 Nick Coghlan ncogh...@gmail.com wrote: There are no provisions for infinite iterators, that is not within the scope of this proposal. I'll repeat my observation that remaining silent on this point is effectively identical to blessing the practice of raising an exception in __length_hint__ to force fast failure of attempts to convert an infinite iterator to a concrete container. And I'll repeat that it is false ;) Being silent is certainly not the same thing as blessing a non-existent practice. Regards Antoine. -- Software development and contracting: http://pro.pitrou.net
Re: [Python-Dev] io.BytesIO slower than monkey-patching io.RawIOBase
On Tue, 17 Jul 2012 06:34:14 +0300 Eli Bendersky eli...@gmail.com wrote: Is there any reason for this to be so? What does BytesIO give us that the second approach does not (I tried adding more methods to the patched RawIOBase to make it more functional, like seekable() and tell(), and it doesn't affect performance)? Well, try implementing non-trivial methods such as readline() or seek(), and writing in the middle rather than at the end. As Nick said, we could implement the same optimization as in StringIO, i.e. only materialize the buffer when necessary. Regards Antoine. -- Software development and contracting: http://pro.pitrou.net
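For readers without the start of the thread, the two approaches being compared can be sketched roughly as follows (class and variable names are mine, not from Eli's benchmark):

```python
import io

def build_with_bytesio(chunks):
    # Approach 1: write into a BytesIO, which maintains one
    # contiguous, resizable buffer.
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()

class ListWriter(io.RawIOBase):
    # Approach 2: monkey-patch a RawIOBase so that write() is just
    # list.append; concatenate once at the end.
    def __init__(self):
        super().__init__()
        self.data = []
        self.write = self.data.append  # instance attr shadows the method

def build_with_list(chunks):
    w = ListWriter()
    for chunk in chunks:
        w.write(chunk)
    return b"".join(w.data)

chunks = [b"x" * 100] * 1000
assert build_with_bytesio(chunks) == build_with_list(chunks)
```

Note that the list-backed version only supports append-at-end writes; as Antoine says, seek(), readline() or writing in the middle are where BytesIO earns its keep.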
Re: [Python-Dev] io.BytesIO slower than monkey-patching io.RawIOBase
On 17.07.12 06:34, Eli Bendersky wrote: The second approach is consistently 10-20% faster than the first one (depending on input) for trunk Python 3.3 Is there any reason for this to be so? What does BytesIO give us that the second approach does not (I tried adding more methods to the patched RawIOBase to make it more functional, like seekable() and tell(), and it doesn't affect performance)? BytesIO resizes the underlying buffer when it overflows (overallocating by 1/8 of the size and copying the old content to the new buffer). In total it makes log[9/8](N) allocations and copies about 8*N bytes (for large N). A list uses the same strategy, but the number of chunks is usually significantly smaller than the number of bytes. At the end all these chunks are concatenated by join, which calculates the sum of the chunk lengths and allocates the resulting array with the desired size. That is why append/join is faster than BytesIO in this case. There is another note, about ElementTree.tostringlist(). Creating the DataStream class in every function call is too expensive, and that is why the monkeypatched version is several times faster than the DataStream version for small data. But for large data it is faster too, because data.append() is one lookup slower than the monkeypatched write=data.append. This also raises a moral question - should I be using the second approach deep inside the stdlib (ET.tostring) just because it's faster? Please note that the previous version of Python used monkeypatching.
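Serhiy's estimate can be checked with a quick back-of-the-envelope simulation (a sketch; `simulate_growth` and its parameters are mine, not BytesIO's actual implementation):

```python
def simulate_growth(target, factor=9/8, start=1):
    """Count reallocations and copied bytes for a growth-factor buffer."""
    size, allocations, copied = start, 0, 0
    while size < target:
        copied += size  # the old contents are copied on every resize
        size = max(int(size * factor), size + 1)
        allocations += 1
    return allocations, copied

allocs, copied = simulate_growth(10**6)
# allocations grow like log_{9/8}(N) and copied bytes like ~8*N,
# matching the estimate above (up to integer-truncation effects).
print(allocs, copied / 10**6)
```

With a larger growth factor (say 2x, as lists effectively use for their pointer array) the total copy cost drops to ~N, which is why the append/join strategy wins here.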
Re: [Python-Dev] io.BytesIO slower than monkey-patching io.RawIOBase
On Tue, Jul 17, 2012 at 2:59 PM, Serhiy Storchaka storch...@gmail.comwrote: On 17.07.12 06:34, Eli Bendersky wrote: The second approach is consistently 10-20% faster than the first one (depending on input) for trunk Python 3.3 Is there any reason for this to be so? What does BytesIO give us that the second approach does not (I tried adding more methods to the patched RawIOBase to make it more functional, like seekable() and tell(), and it doesn't affect performance)? BytesIO resizes underlying buffer if it overflowed (overallocating 1/8 of size and copying old content to new buffer). Total it makes log[9/8](N) allocations and copy 8*N bytes (for large N). List uses the same strategy, but number of chunks usually significantly less than number of bytes. At the end all this chunks concatenated by join, which calculates sum of chunk lengths and allocate the resulting array with the desired size. That is why append/join is faster than BytesIO in this case. I've created http://bugs.python.org/issue15381 to track this (optimizing BytesIO). There are other note, about ElementTree.tostringlist(). Creating DataStream class in every function call is too expensive, and that is why monkeypatched version several times is faster than DataStream version for small data. But for long data it is faster too, because data.append() is on one lookup slower than monkeypatched write=data.append. I updated tostringlist() to use an outside class. This brings performance back to that of the old code. Eli
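The "materialize only when necessary" StringIO optimization mentioned earlier in the thread might look roughly like this pure-Python model (names are mine; the real work tracked in issue 15381 was done in C):

```python
# Sketch: keep appended chunks in a list (cheap, amortized O(1)) and
# only build the contiguous buffer on the first read of the value.
class LazyBytesBuffer:
    def __init__(self):
        self._chunks = []      # fast append-only write path
        self._buffer = None    # materialized contiguous bytes

    def write(self, data):
        if self._buffer is not None:
            raise NotImplementedError("write-after-read not modeled here")
        self._chunks.append(data)
        return len(data)

    def getvalue(self):
        if self._buffer is None:
            # join() pre-computes the total length: one allocation.
            self._buffer = b"".join(self._chunks)
            self._chunks.clear()
        return self._buffer

buf = LazyBytesBuffer()
buf.write(b"hello ")
buf.write(b"world")
print(buf.getvalue())  # b'hello world'
```

The real BytesIO must of course also handle seek(), truncate() and writes in the middle, which is where the simple list model stops applying.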
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sun, Jul 15, 2012 at 1:28 AM, Alexandre Zani alexandre.z...@gmail.com wrote: I'm +1 on not having a public API for this. Ultimately the contract for a length hint will depend heavily upon what you need it for. Some applications would require a length hint to be an "at least", others an "at most", and others something else entirely. Given that the contract here appears to be >= 0, I don't think the length hint is particularly useful to the public at large. Other possible related uses could be to get an approximate number of results for a query without having to actually go through the whole query, useful for databases and search engines. But then you *do* want __len__ as well, so that also doesn't fit with the current PEP. But maybe that's a completely different use case, even though it seems related to me? //Lennart
[Python-Dev] Use function names instead of functions for os.supports_dir_fd?
Hi, Python 3.3 introduced os.supports_dir_fd to check if some os functions accept a file descriptor instead of a path. The problem is that os.supports_dir_fd is a list of functions, not a list of function names. If os functions are monkey-patched, you cannot test anymore whether a function supports file descriptors. Monkey patching is a common practice in Python: test_os.py replaces os.exec*() functions temporarily, for example. It's also inconsistent with the new time.get_clock_info() function, which expects the name of a time function, not the function directly. Victor
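A short demonstration of the problem (using the real os.supports_dir_fd API; the patched wrapper is a stand-in for whatever a test suite might install):

```python
import os

# Whether the real os.open supports dir_fd depends on the platform
# (generally True on Linux, False on Windows).
print(os.open in os.supports_dir_fd)

original_open = os.open

def patched_open(*args, **kwargs):
    return original_open(*args, **kwargs)

os.open = patched_open
try:
    # The wrapper is a different object, so the membership test now
    # fails even though the underlying capability is unchanged.
    print(os.open in os.supports_dir_fd)  # False
finally:
    os.open = original_open
```

A name-based check (as with time.get_clock_info("monotonic")) would keep working across monkey-patching, which is Victor's point.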
[Python-Dev] A new JIT compiler for a faster CPython?
Hi, I would like to write yet another JIT compiler for CPython. Before writing anything, I would like your opinion because I don't know the other Python compilers well. I also want to prepare a possible integration into CPython from the beginning of the project, or at least stay very close to the CPython project (and CPython developers!). I did not understand exactly why the Unladen Swallow and psyco projects failed, so please tell me if you think that my project is going to fail too!

== Why? ==

CPython is still the reference implementation, and new features are first added to this implementation (ex: PyPy does not support Python 3 yet, but there is a project to support Python 3). Some projects still rely on low-level properties of CPython, especially its C API (ex: numpy; PyPy has a cpyext module to emulate the CPython C API). A JIT is the most promising solution to speed up the main evaluation loop: using a JIT, it is possible to compile a function for a specific type on the fly and so enable deeper optimizations.

psyco is no longer maintained. It had its own JIT which is complex to maintain. For example, it is hard to port it to new hardware. LLVM is fast and the next version will be faster. LLVM has a community, documentation, a lot of tools, and is active.

There are many Python compilers which are very fast, but most of them only support a subset of Python or require modifying the code (ex: specifying the type of all parameters and variables). For example, you cannot run Django with Shedskin. IMO PyPy is complex and hard to maintain. PyPy has a design completely different from CPython and is much faster and has a better memory footprint. I don't expect to be as fast as PyPy, just faster than CPython.

== General idea ==

I don't want to replace CPython. This is an important point. All other Python compilers try to write something completely new, which is a huge task and is a problem for staying compatible with CPython.
I would like to reuse as much code of CPython as possible and not try to fight against the GIL or reference counting, but try to cooperate instead. I would like to use a JIT to generate specialized functions for a combination of argument types. Specialization enables more optimizations. I would like to use LLVM because LLVM is an active project, has many developers and users, is fast, and the next version will be faster! LLVM already supports common optimizations like inlining. My idea is to emit the same code as ceval.c from the bytecode to be fully compatible with CPython, and then write a JIT to optimize functions for a specific type.

== Roadmap ==

-- Milestone 1: Proof of concept --

* Use the bytecode produced by the CPython parser and compiler
* Only compile a single function
* Emit the same code as ceval.c using LLVM, but without tracing, exceptions or signal handling (they will be added later)
* Support compiling and calling the following function:

def func(a, b):
    return a+b

The pymothoa project can be used as a base to quickly implement such a proof of concept.

-- Milestone 2: Specialized function for the int type --

* Use type annotations to generate specialized functions for the int type
* Use C int with a guard detecting integer overflow to fall back on Python int

-- Milestone 3: JIT --

* Depending on the types seen at runtime, recompile the function to generate specialized functions
* Use guards to fall back to a generic implementation if the type is not the expected type
* Maybe drop the code using function annotations

At this step, we can start to benchmark to check if the (JIT) compiler is faster than CPython.

-- Later (unsorted ideas) --

* Support exceptions
* Full support of Python
  - classes
  - list comprehensions
  - etc.
* Optimizations:
  - avoid reference counting when possible
  - avoid temporary objects when possible
  - release the GIL when possible
  - inlining: should be very interesting with list comprehensions
  - unroll loops?
  - lazy creation of the frame?
* Use registers instead of a stack in the evaluation loop?
* Add code to allow tracing and profiling
* Add code to handle signals (pending calls)
* Write a compiler using the AST, with a fallback to the bytecode? (would it be faster? easier or more complex to maintain?)
* Test LLVM optimizers
* Compile a whole module or even a whole program
* Reduce memory footprint
* Type annotations to help the optimizer? (with guards?)
* const annotation to help the optimizer? (with guards?)
* Support any build option of Python:
  - support Python 2 (2.5, 2.6, 2.7) and 3 (3.1, 3.2, 3.3, 3.4)
  - support narrow and wide modes: flag at runtime?
  - support debug and release modes: flag at runtime?
  - support 32 and 64 bit modes on Windows?

== Other Python VMs and compilers ==

-- Fully Python compliant --

* `PyPy http://pypy.org/`_
* `Jython http://www.jython.org/`_ based on the JVM
* `IronPython http://ironpython.net/`_ based on the .NET VM
* `Unladen Swallow
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Victor Stinner, 17.07.2012 20:38: -- Subset of Python -- * `pymothoa http://code.google.com/p/pymothoa/`_: use LLVM; don't support classes nor exceptions. * `unpython http://code.google.com/p/unpython/`_: Python to C * `Perthon http://perthon.sourceforge.net/`_: Python to Perl * `Copperhead http://copperhead.github.com/`_: Python to GPU (Nvidia) You might also want to add numexpr and numba to that list. Numba might actually be quite close to pymothoa (hadn't heard of it before). Personally, I like the idea of having a JIT compiler more or less as an extension module at hand. Sort of like a co-processor, just in software. Lets you run your code either interpreted or JITed, just as you need. Note that the Cython project is working on a protocol to efficiently call external C-implemented Python functions by effectively unboxing them. That explicitly includes JIT-compiled code, and a JIT compiler could obviously make good use of it from the other side as well. Stefan
Re: [Python-Dev] A new JIT compiler for a faster CPython?
I'll admit I didn't read through your email, but you should absolutely check out Numba which is ramping up just now to do this: https://github.com/numba (I'm CC-ing their mailing list, perhaps some of them will read this and respond.) It is probably much less ambitious, but that hopefully shouldn't stop you cooperating. It was started by Travis Oliphant (who started NumPy); here are his thoughts on PyPy and NumPy, which provide some of the background for this project: http://technicaldiscovery.blogspot.no/2011/10/thoughts-on-porting-numpy-to-pypy.html Dag On 07/17/2012 08:38 PM, Victor Stinner wrote: Hi, I would like to write yet another JIT compiler for CPython. Before writing anything, I would like your opinion because I don't know well other Python compilers. I also want to prepare a possible integration into CPython since the beginning of the project, or at least stay very close to the CPython project (and CPython developers!). I did not understand exactly why Unladen Swallow and psyco projects failed, so please tell me if you think that my project is going to fail too! == Why? == CPython is still the reference implementation, new features are first added to this implementation (ex: PyPy is not supporting Python 3 yet, but there is a project to support Python 3). Some projects still rely on low level properties of CPython, especially its C API (ex: numpy; PyPy has a cpyext module to emulate the CPython C API). A JIT is the most promising solution to speed up the main evaluation loop: using a JIT, it is possible to compile a function for a specific type on the fly and so enable deeper optimizations. psyco is no more maintained. It had its own JIT which is complex to maintain. For example, it is hard to port it to a new hardware. LLVM is fast and the next version will be faster. LLVM has a community, a documentation, a lot of tools and is active.
There are many Python compilers which are very fast, but most of them only support a subset of Python or require modifying the code (ex: specifying the type of all parameters and variables). For example, you cannot run Django with Shedskin. IMO PyPy is complex and hard to maintain. PyPy has a design completely different from CPython and is much faster and has a better memory footprint. I don't expect to be as fast as PyPy, just faster than CPython. == General idea == I don't want to replace CPython. This is an important point. All other Python compilers try to write something completely new, which is a huge task and is a problem for staying compatible with CPython. I would like to reuse as much code of CPython as possible and not try to fight against the GIL or reference counting, but try to cooperate instead. I would like to use a JIT to generate specialized functions for a combination of argument types. Specialization enables more optimizations. I would like to use LLVM because LLVM is an active project, has many developers and users, is fast and the next version will be faster! LLVM already supports common optimizations like inlining. My idea is to emit the same code as ceval.c from the bytecode to be fully compatible with CPython, and then write a JIT to optimize functions for a specific type. == Roadmap == -- Milestone 1: Proof of concept -- * Use the bytecode produced by CPython parser and compiler * Only compile a single function * Emit the same code as ceval.c using LLVM, but without tracing, exceptions nor signal handling (they will be added later) * Support compiling and calling the following function: def func(a, b): return a+b The pymothoa project can be used as a base to quickly implement such a proof of concept.
-- Milestone 2: Specialized function for the int type -- * Use type annotation to generate specialized functions for the int type * Use C int with a guard detecting integer overflow to fallback on Python int -- Milestone 3: JIT -- * Depending on the type seen at runtime, recompile the function to generate specialized functions * Use guard to fallback to a generic implementation if the type is not the expected type * Drop maybe the code using function annotations At this step, we can start to benchmark to check if the (JIT) compiler is faster than CPython. -- Later (unsorted ideas) -- * Support exceptions * Full support of Python - classes - list comprehension - etc. * Optimizations: - avoid reference counting when possible - avoid temporary objects when possible - release the GIL when possible - inlining: should be very interesting with list comprehension - unroll loops? - lazy creation of the frame? * Use registers instead of a stack in the evaluation loop? * Add code to allow tracing and profiling * Add code to handle signals (pending calls) * Write a compiler using the AST, with a fallback to the bytecode? (would it be faster? easier or more complex to maintain?) * Test LLVM optimizers * Compile a whole module or even a whole
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Hi, 2012/7/17 Victor Stinner victor.stin...@gmail.com: -- Milestone 3: JIT -- * Depending on the type seen at runtime, recompile the function to generate specialized functions * Use guard to fallback to a generic implementation if the type is not the expected type From my understanding, psyco did exactly this. -- Amaury Forgeot d'Arc
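For readers unfamiliar with psyco, the guard-and-fallback scheme being discussed can be modeled in pure Python like this (a sketch with names of my choosing; a real JIT emits specialized machine code rather than closures):

```python
import operator

def specialize_add(generic_add):
    """Cache a per-type-pair implementation, guarded by a type check."""
    cache = {}
    def dispatch(a, b):
        key = (type(a), type(b))   # the "guard": exact argument types
        fn = cache.get(key)
        if fn is None:
            if key == (int, int):
                # Specialized fast path, valid only under the guard.
                fn = lambda x, y: x + y
            else:
                # Unexpected types: fall back to the generic code.
                fn = generic_add
            cache[key] = fn
        return fn(a, b)
    return dispatch

add = specialize_add(operator.add)
print(add(2, 3))      # 5, via the int/int specialization
print(add("a", "b"))  # 'ab', via the generic fallback
```

The hard part, and much of psyco's complexity, was doing this for whole code regions and deoptimizing correctly when a guard fails mid-function.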
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Hi Victor. I'm willing to explain to you in detail why having LLVM does not solve almost any of the issues and why PyPy is complex, or why you think it's complex. Find me on IRC if you want (fijal, can be found on #pypy on freenode for example). In our opinion something like psyco that gets brought to the levels of speed of pypy would be massively more complex than PyPy; most importantly it would be incredibly fragile. It's possible, but it's lots and lots of work. I don't think it can possibly be done by one person. Speaking about being compatible with cpython and yet fast - I would strongly recommend talking to Mark Shannon (the author of HotPy). He's by far the best person who can answer some questions and has a rough plan how to go forward. It would be much better to concentrate efforts rather than write yet another half-finished JIT (because reading code is hard). Cheers, fijal
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Maciej Fijalkowski, 17.07.2012 21:16: It would be much better to concentrate efforts rather than write yet another half-finished JIT (because reading code is hard). +1 Stefan
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On Jul 17, 2012, at 11:38 AM, Victor Stinner victor.stin...@gmail.com wrote: IMO PyPy is complex and hard to maintain. PyPy has a design completly different than CPython and is much faster and has a better memory footprint. I don't expect to be as fast as PyPy, just faster than CPython. I think this criticism is misguided. Let's grant for the moment that you're right, and PyPy is complex and hard to maintain. If a high-level Python parser and JIT compiler written in Python came out as complex and unmaintainable, why do you believe that they'll be easy to write in C? You are correct that it has a different architecture than CPython: it has a different architecture because CPython's architecture is limiting because of its simplicity and makes it difficult to do things like write JIT compilers. The output of the Unladen Swallow project was illuminating in that regard. (Please note I said output and not failure; the Unladen Swallow folks did the community a great service and produced many useful artifacts, even if they didn't meet their original goal.) Polluting the straightforward, portable architecture of CPython with significant machine-specific optimizations to bolt on extra features that are already being worked on elsewhere seems like a waste of effort to me. You could, instead, go work on documenting PyPy's architecture so it seems less arcane to newcomers. Some of the things in there which look like hideous black magic are actually fairly straightforward when explained, as I have learned by being lucky enough to receive explanations in person from Maciej, Benjamin and Alex at various conferences. I mean, don't get me wrong, if this worked out, I'd love a faster CPython; I do still use many tools which don't support PyPy yet, so I can see the appeal of greater runtime compatibility with CPython than CPyExt offers. I just think that it will end up being a big expenditure of effort for relatively little return.
If you disagree, you should feel no need to convince me; just go do it and prove me wrong, which I will be quite happy to be. I would just like to think about whether this is the best use of your energy first. But definitely listen to Maciej's suggestion about concentrating efforts with other people engaged in similar efforts, regardless :). As your original message shows, there has already been enough duplication of effort in this area. -glyph
Re: [Python-Dev] A new JIT compiler for a faster CPython?
I would like to write yet another JIT compiler for CPython. FWIW, so do I. I did not understand exactly why Unladen Swallow and psyco projects failed, so please tell me if you think that my project is going to fail too! It may well happen that your project fails, or doesn't even start. Mine didn't start for the last two years (but now may eventually do start). I'm not sure psyco really failed; if it did, it was because of PyPy: PyPy was created to do the same stuff as psyco, just better. It was abandoned in favor of PyPy - whether that's a failure of psyco, I don't know. IMO, the psyco implementation itself failed because it was unmaintainable, containing very complicated code that nobody but its authors could understand. Also, I know for a fact that Unladen Swallow (the project) didn't fail; some interesting parts were contributed to Python and are now part of its code base. It's the JIT compiler of Unladen Swallow that failed; in my understanding because LLVM is crap (i.e. it is slow, memory-consuming, and buggy) - as a low-level virtual machine; it may be ok as a compiler backend (but I still think it is buggy there as well). psyco is no more maintained. I think this is factually incorrect: Christian Tismer maintains it (IIUC). I would like to use a JIT to generate specialized functions for a combinaison of arguments types. I think history has moved past specializing JITs. Tracing JITs are the status quo; they provide specialization as a side effect. Regards, Martin
Re: [Python-Dev] A new JIT compiler for a faster CPython?
If you disagree, you should feel no need to convince me; just go do it and prove me wrong, which I will be quite happy to be. I would just like to think about whether this is the best use of your energy first. While I follow most of your reasoning, I think this is a flaw in your logic. This is free software: the only person to decide where energy is best used is the person providing the energy. It may well be that Victor gives up after the first three steps, or it may be that he comes back with a working prototype in August. He may well find that his energy is *best* spent in this project, since it may get him a highly-paid job, a university diploma, or reputation. If not that, he'll learn a lot. But definitely listen to Maciej's suggestion about concentrating efforts with other people engaged in similar efforts, regardless :). Again, this thinking is flawed, IMO. It might be in the community's interest if people coordinate, but not in the interest of the individual contributor. As your original message shows, there has already been enough duplication of effort in this area. And that's not really a problem, IMO. Regards, Martin
Re: [Python-Dev] A new JIT compiler for a faster CPython?
2012/7/18 mar...@v.loewis.de: I would like to write yet another JIT compiler for CPython. FWIW, so do I. I don't know whether it's good news (that Martin wants to put his expertise in this area) or a bad sign (that he did not start after so many years of Python development - the problem becomes more and more difficult each time one thinks about it) -- Amaury Forgeot d'Arc
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Personally, I like the idea of having a JIT compiler more or less as an extension module at hand. Sort of like a co-processor, just in software. Lets you run your code either interpreted or JITed, just as you need. Me too, so something like psyco. LLVM is written in C++ and may have license issues, so I don't really want to add a dependency on LLVM to CPython. For an experimental project, a third-party module is also more convenient. Victor
Re: [Python-Dev] A new JIT compiler for a faster CPython?
It's the JIT compiler of Unladen Swallow that failed; in my understanding because LLVM is crap (i.e. it is slow, memory-consuming, and buggy) - as a low-level virtual machine; it may be ok as a compiler backend (but I still think it is buggy there as well). What is the status of LLVM nowadays? Is it not a good solution to write a portable JIT? I don't want to write my own library to generate machine code. psyco is no more maintained. I think this is factually incorrect: Christian Tismer maintains it (IIUC). http://psyco.sourceforge.net/ says: News, 12 March 2012 Psyco is unmaintained and dead. Please look at PyPy for the state-of-the-art in JIT compilers for Python. Victor
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On the cpyext front, it would be rather helpful if developers interested in a high speed Python interpreter with good C extension compatibility worked with Dave Malcolm on his static analyser for Python C extensions. One of the reasons cpyext has trouble is that many refcounting bugs in extensions aren't fatal on CPython due to additional internal references - a refcount of 1 when it should be 2 is survivable in a way that 0 vs 1 is not. Get rid of that drudgery from hacking on cpyext and it becomes significantly easier to expand the number of extensions that will work across multiple implementations of the API. Cheers, Nick. -- Sent from my phone, thus the relative brevity :)
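A small illustration of the masking effect Nick describes (CPython-specific; note that sys.getrefcount's result includes the reference held by its own argument):

```python
# On CPython, widely shared objects (cached small ints, interned
# strings, module globals) carry many internal references, so a single
# missing Py_INCREF in an extension rarely drives a refcount to zero --
# the bug hides. A freshly created object has no such safety margin.
import sys

cached = 100        # small ints are cached/shared by the interpreter
fresh = object()    # referenced only here

# The shared object has far more references keeping it alive.
print(sys.getrefcount(cached) > sys.getrefcount(fresh))  # True on CPython
```

An alternate implementation without those incidental extra references (or a static analyser, as suggested above) exposes the off-by-one immediately.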
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On Tue, Jul 17, 2012 at 6:20 PM, Victor Stinner victor.stin...@gmail.com wrote: What is the status of LLVM nowadays? Is it not a good solution to write a portable JIT? I don't want to write my own library to generate machine code. You don't have to, even if you don't want to use LLVM. There are plenty of lighter-weight approaches to that. For example, GNU Lightning [1] or sljit [2]. [1] http://www.gnu.org/software/lightning/ [2] http://sljit.sourceforge.net/ -- Devin
Re: [Python-Dev] A new JIT compiler for a faster CPython?
2012/7/18 Nick Coghlan ncogh...@gmail.com: On the cpyext front, it would be rather helpful if developers interested in a high speed Python interpreter with good C extension compatibility worked with Dave Malcolm on his static analyser for Python C extensions. One of the reasons cpyext has trouble is that many refcounting bugs in extensions aren't fatal on CPython due to additional internal references - a refcount of 1 when it should be 2 is survivable in a way that 0 vs 1 is not. It's not only about bugs. Even when reference counts are correctly managed, cpyext is slow: - each time an object crosses the C|pypy boundary, there is a dict lookup (!) - each time a new object is passed or returned to C, a PyObject structure must be allocated (and sometimes much more, especially for strings and types). Py_DECREF will of course free the PyObject, so the next call will allocate the object again. - borrowed references are a nightmare. Get rid of that drudgery from hacking on cpyext and it becomes significantly easier to expand the number of extensions that will work across multiple implementations of the API. There are also some extension modules that play tricky games with the API; PyQt for example uses metaclasses with a custom tp_alloc slot, to have access to the PyTypeObject structure during the construction of the type... The Python C API is quite complete, but some use cases are still poorly supported. -- Amaury Forgeot d'Arc
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On 2012-07-17, at 6:38 PM, Devin Jeanpierre wrote: On Tue, Jul 17, 2012 at 6:20 PM, Victor Stinner victor.stin...@gmail.com wrote: What is the status of LLVM nowadays? Is it not a good solution to write a portable JIT? I don't want to write my own library to generate machine code. You don't have to, even if you don't want to use LLVM. There are plenty of lighter-weight approaches to that. For example, GNU Lightning [1] or sljit [2]. [1] http://www.gnu.org/software/lightning/ [2] http://sljit.sourceforge.net/ And, there is also DynASM [1], [2]. This one was built for LuaJIT and is under MIT licence. [1] http://luajit.org/dynasm.html [2] https://github.com/LuaDist/luajit/tree/master/dynasm - Yury
Re: [Python-Dev] A new JIT compiler for a faster CPython?
As your original message shows, there has already been enough duplication of effort in this area. I didn't yet find a project reusing ceval.c: most projects implement their own eval loop and don't use CPython at all. My idea is not to write something new, but just to try to optimize the existing ceval.c code. Pseudo-code:

* read the bytecode of a function
* replace each bytecode by its C code
* optimize
* compile the C code to machine code

(I don't know if "C code" is the right expression here, it's just for the example.) Dummy example:

def mysum(a, b): return a+b

Python compiles it to bytecode as:

dis.dis(mysum)
  0 LOAD_FAST    0 (a)
  3 LOAD_FAST    1 (b)
  6 BINARY_ADD
  7 RETURN_VALUE

The bytecode can be compiled to something like:

x = GETLOCAL(0); # a
if (x == NULL) /* error */
Py_INCREF(x);
PUSH(x);
x = GETLOCAL(1); # b
if (x == NULL) /* error */
Py_INCREF(x);
PUSH(x);
w = POP();
v = TOP();
x = PyNumber_Add(v, w);
Py_DECREF(v);
Py_DECREF(w);
if (x == NULL) /* error */
SET_TOP(x);
retval = POP();
return retval;

The calls to Py_INCREF() and Py_DECREF() can be removed. The code is no longer based on a loop: CPUs prefer sequential code. The stack can be replaced with variables: the compiler (LLVM?) knows how to replace many variables with a few variables, or even use CPU registers instead. Example:

a = GETLOCAL(0); # a
if (a == NULL) /* error */
b = GETLOCAL(1); # b
if (b == NULL) /* error */
return PyNumber_Add(a, b);

I don't expect to run a program 10x faster, but I would be happy if I can run arbitrary Python code 25% faster. -- Specialization / tracing JIT can be seen as another project, or at least added later. Victor
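The unrolling step described above can be sketched with the stdlib dis module: walk a function's bytecode and emit one C snippet per opcode, in sequence, instead of dispatching in a loop. The TEMPLATES mapping is invented for illustration (it mirrors the pseudo-code in the message, not any real CPython API), and exact opcode names vary across CPython versions.

```python
import dis

def mysum(a, b):
    return a + b

# Illustrative opcode -> C-snippet templates; these names mirror the
# pseudo-code above and are NOT a real CPython API.
TEMPLATES = {
    "LOAD_FAST": "x = GETLOCAL({arg}); if (x == NULL) goto error; Py_INCREF(x); PUSH(x);",
    "BINARY_ADD": "w = POP(); v = TOP(); x = PyNumber_Add(v, w); Py_DECREF(v); Py_DECREF(w); SET_TOP(x);",
    "BINARY_OP": "/* CPython >= 3.11 fuses arithmetic into BINARY_OP */",
    "RETURN_VALUE": "retval = POP(); return retval;",
}

def emit_c(func):
    """Unroll a function's bytecode into straight-line C snippets."""
    out = []
    for ins in dis.get_instructions(func):
        tmpl = TEMPLATES.get(ins.opname)
        if tmpl is not None:
            out.append(tmpl.format(arg=ins.arg))
    return out

for line in emit_c(mysum):
    print(line)
```

Once the code is straight-line like this, the redundant PUSH/POP pairs and refcount operations become visible to a conventional optimizer, which is the whole point of the exercise.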
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On 17/07/2012 23:20, Victor Stinner wrote: http://psyco.sourceforge.net/ says: News, 12 March 2012 Psyco is unmaintained and dead. Please look at PyPy for the state-of-the-art in JIT compilers for Python. Victor A search on pypi for JIT compilers gives no matches. -- Cheers. Mark Lawrence.
Re: [Python-Dev] Use function names instead of functions for os.supports_dir_fd?
On 07/17/2012 10:34 AM, Victor Stinner wrote: Python 3.3 introduced os.supports_dir_fd to check if some os functions do accept a file descriptor instead of a path. The problem is that os.supports_dir_fd is a list of functions, not a list of function names. If os functions are monkey patched, you cannot test anymore whether a function supports file descriptors. If you're monkey-patching the function, you can monkey-patch os.supports_dir_fd too. Monkey patching is a common practice in Python. test_os.py replaces os.exec*() functions temporarily, for example. For testing, yes. It's not recommended for production code. //arry/
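The temporary replacement test_os does boils down to the save/patch/restore pattern, which can be sketched in a few lines (fake_getcwd is a hypothetical stand-in for whatever the test substitutes):

```python
import os

# Save/patch/restore: swap a function on the os module, and restore the
# original in a finally block so the patch cannot leak out of the test.
def fake_getcwd():
    return "/patched"

original = os.getcwd
os.getcwd = fake_getcwd
try:
    patched_result = os.getcwd()  # calls fake_getcwd while patched
finally:
    os.getcwd = original          # always restore the original

print(patched_result)  # /patched
```

Any code that captured a reference to the function before the patch (as os.supports_dir_fd does) still holds the original, which is exactly the mismatch Victor is pointing at.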
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Victor Stinner wrote: == Other Python VM and compilers ==

As far as I know, these are all still active, although possibly experimental:
Pynie (Python for the Parrot VM)
WPython (16-bit word-codes instead of byte-codes)
HotPy (high-performance optimizing VM for Python)
Skulpt (Javascript implementation)
HoPe (Python in Haskell)
Berp (another Python in Haskell)

WPython in particular seems to be very promising, and quite fast. I don't understand why it doesn't get more attention (although I admit I can't criticise, since I haven't installed or used it myself). http://www.pycon.it/media/stuff/slides/beyond-bytecode-a-wordcode-based-python.pdf In the Java world, there are byte-code optimizers such as Soot, BLOAT and ProGuard which apparently can speed up Java significantly. As far as I can tell, in the Python world byte-code optimization is a severely neglected area. For good reason? No idea. -- Steven
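For what it's worth, CPython does already do a little bytecode-level optimization of its own: the peephole optimizer folds constant expressions at compile time, so they never reach the eval loop at all.

```python
# CPython's peephole optimizer folds constant expressions before the
# bytecode ever runs: 2 * 3 + 4 is reduced to the single constant 10.
code = compile("x = 2 * 3 + 4", "<demo>", "exec")
print(code.co_consts)  # the folded constant 10 appears among the constants
```

The Java optimizers mentioned above go much further (dead-code elimination, inlining, whole-program analysis), which is the gap Steven is describing.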
Re: [Python-Dev] Use function names instead of functions for os.supports_dir_fd?
Victor Stinner wrote: Hi, Python 3.3 introduced os.supports_dir_fd to check if some os functions do accept a file descriptor instead of a path. The problem is that os.supports_dir_fd is a list of functions, not a list of function names. If os functions are monkey patched, you cannot test anymore if a function supports file descriptors. One of the dangers of monkey-patching. Monkey patching is a common practice in Python. test_os.py replaces os.exec*() functions temporarily, for example. Perhaps for testing, but I don't think monkey-patching is common in production code. Perhaps you are thinking of Ruby :) It's also inconsistent with the new time.get_clock_info() function which expects the name of a time function, not the function directly. Since functions are first-class objects in Python, and people should be used to passing functions around as parameters, perhaps it is better to say that get_clock_info is inconsistent with supports_dir_fd. Personally, I prefer passing function objects rather than names, since the *name* of the function shouldn't matter. But since I recognise that other people may think differently, I would probably support passing either the name or the function object itself. -- Steven
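The "accept either" compromise Steven suggests could look like this. os.supports_dir_fd is the real set added in Python 3.3; the wrapper function itself is a hypothetical sketch, not anything in the stdlib.

```python
import os

def supports_dir_fd(func_or_name):
    """Check os.supports_dir_fd by function object or by name (sketch)."""
    if isinstance(func_or_name, str):
        # Resolve the name against the os module, then fall through to
        # the normal membership test against the real set.
        func_or_name = getattr(os, func_or_name)
    return func_or_name in os.supports_dir_fd

# Both spellings answer the same question:
print(supports_dir_fd(os.stat) == supports_dir_fd("stat"))  # True
```

Note that the name-based form re-resolves the attribute at call time, so it sees a monkey-patched function, which is the behaviour Victor wants; the object-based form keeps whatever was captured.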
Re: [Python-Dev] A new JIT compiler for a faster CPython?
On 07/17/2012 04:34 PM, Steven D'Aprano wrote: As far as I know, these are all still active, although possibly experimental: Pynie (Python for the Parrot VM) WPython (16-bit word-codes instead of byte-codes) [...] WPython in particular seems to be very promising, and quite fast. I don't understand why it doesn't get more attention (although I admit I can't criticise, since I haven't installed or used it myself). Cesar (sp?) was at Mark's talk on HotPy at EuroPython. We asked him if WPython was still active, and he said, nope, no community interest. IIRC Pynie is basically dead too. I don't know about the others, //arry/
Re: [Python-Dev] Use function names instead of functions for os.supports_dir_fd?
Monkey patching is a common practice in Python. test_os.py replaces os.exec*() functions temporarily, for example. Perhaps for testing, but I don't think monkey-patching is common in production code. Perhaps you are thinking of Ruby :) The gevent library does monkey-patch os.fork (and time.sleep and many other functions), but gevent is maybe not ready for production? :-) Victor
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Victor Stinner victor.stinner at gmail.com writes: Example: a = GETLOCAL(0); # a if (a == NULL) /* error */ b = GETLOCAL(1); # b if (b == NULL) /* error */ return PyNumber_Add(a, b); I don't expect to run a program 10x faster, but I would be happy if I can run arbitrary Python code 25% faster. -- Specialization / tracing JIT can be seen as another project, or at least added later. Victor This is almost exactly what Unladen Swallow originally did. First, LLVM will not do all of the optimizations you are expecting it to do out of the box. It will still have all the stack accesses, and it will have all of the ref counting operations. You can get a small speed boost from removing the interpretation dispatch overhead, but you also explode your memory usage, and the speedups are tiny. Please, learn from Unladen Swallow and other's experiences, otherwise they're for naught, and frankly we (python-dev) waste a lot of time. Alex
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Victor Stinner, 18.07.2012 00:15: Personally, I like the idea of having a JIT compiler more or less as an extension module at hand. Sort-of like a co-processor, just in software. Lets you run your code either interpreted or JITed, just as you need. Me too, so something like psyco. In the sense that it's a third party module, yes. Not in the sense of how it hooks into the runtime. The intention would be that users explicitly run their code in a JIT compiled environment, e.g. their template processing or math code. The runtime wouldn't switch to a JIT compiler automatically for normal code. I mean, that could still become a feature at some point, but I find a decorator or an exec-like interface quite acceptable, as long as it fails loudly with can't do that if the JIT compiler doesn't support a specific language feature. Stefan
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Alex Gaynor, 18.07.2012 03:24: Victor Stinner writes: Example: a = GETLOCAL(0); # a if (a == NULL) /* error */ b = GETLOCAL(1); # b if (b == NULL) /* error */ return PyNumber_Add(a, b); I don't expect to run a program 10x faster, but I would be happy if I can run arbitrary Python code 25% faster. -- Specialization / tracing JIT can be seen as another project, or at least added later. This is almost exactly what Unladen Swallow originally did. First, LLVM will not do all of the optimizations you are expecting it to do out of the box. It will still have all the stack accesses, and it will have all of the ref counting operations. You can get a small speed boost from removing the interpretation dispatch overhead, but you also explode your memory usage, and the speedups are tiny. My experience with Cython tells me that even if you move the entire interpretation overhead out of the way, you'd only get some 5-20% speedup for real code, rarely more if you have some really tight loops. Adding a full-blown JIT compiler to the dependencies just for that is usually not worth it, and Unladen Swallow succeeded in showing that pretty clearly. It's when you start specialising and optimising code patterns that it becomes really interesting, but you can do that statically at build time or compile time in most cases (at least in the more interesting ones) and Cython is one way to do it. Again, no need to add a JIT compiler. The nice thing about JIT compilers is that you can give them your code and they'll try to optimise it for you without further interaction. That doesn't mean you get the fastest code ever, it just means that they do all the profiling for you and try to figure it out all by themselves. That may or may not work out, but it usually works quite ok (and you'll love JIT compilers for it) and only rarely gets seriously in the way (and that's when you'll hate JIT compilers). However, it requires that the JIT compiler knows about a lot of optimisations. 
PyPy's JIT is full of those. It's not the fact that it has a JIT compiler at all that makes it fast and not the fact that they compile Python to machine code, it's the fact that they came up with a huge bunch of specialisations that makes lots of code patterns fast once it detected them. LLVM (or any other low-level JIT compiler) won't help at all with that. Stefan
Re: [Python-Dev] A new JIT compiler for a faster CPython?
2012/7/18 Victor Stinner victor.stin...@gmail.com I don't expect to run a program 10x faster, but I would be happy if I can run arbitrary Python code 25% faster. If that's your target, you don't need to resort to a bytecode-to-binary-equivalent compiler. WPython already gave similar results with Python 2.6. The idea behind it is that, using a hybrid stack-register VM, you'll spend less time on the ceval loop's constant stuff (checking for events / GIL release, etc.). That's because superinstructions aggregate more bytecodes into a single wordcode, which requires only one decoding phase, avoids many pushes/pops, and removes some unnecessary incref/decref reference counting. A better peephole optimizer is provided, and some other optimizations as well. There's also room for more optimizations. I have many ideas to improve both WPython and the plain ceval loop. For example, at the last EuroPython sprint I was working on a ceval optimization that gave about a 10% speed improvement for the CPython 3.3 beta trunk (on my old MacBook Air, running the 32-bit Windows 8 preview), but it still needs to be checked for correctness (I'm spending much more time running and checking the standard tests than on its implementation ;-) In the end, I think that a lot can be done to improve the good old CPython VM, without resorting to a JIT compiler. Lack of time is the enemy... Regards, Cesare -- Specialization / tracing JIT can be seen as another project, or at least added later. Victor
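The superinstruction idea can be illustrated with a toy stack VM: fusing a frequent opcode sequence into one handler means one dispatch instead of four, and no intermediate pushes/pops. The VM and all opcode names here are invented for this sketch; they are not WPython's actual encoding.

```python
def run(program, local_vars):
    # A toy stack VM. LOAD/ADD/RET are the naive opcodes; the fused
    # LOAD_LOAD_ADD_RET handler does the same work with one dispatch
    # and no stack traffic at all.
    stack = []
    for op, arg in program:
        if op == "LOAD":
            stack.append(local_vars[arg])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "RET":
            return stack.pop()
        elif op == "LOAD_LOAD_ADD_RET":  # the superinstruction
            i, j = arg
            return local_vars[i] + local_vars[j]

naive = [("LOAD", 0), ("LOAD", 1), ("ADD", None), ("RET", None)]
fused = [("LOAD_LOAD_ADD_RET", (0, 1))]
print(run(naive, [2, 3]), run(fused, [2, 3]))  # 5 5
```

The fused form trades a larger opcode table for fewer trips around the dispatch loop, which is where the per-instruction constant costs (event checks, GIL release) that Cesare mentions are paid.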
Re: [Python-Dev] A new JIT compiler for a faster CPython?
What is the status of LLVM nowadays? Is it not a good solution to write a portable JIT? I don't think it is. It is still slow and memory hungry. The fact that the version that Apple ships with Xcode still miscompiles Python 3.3 tells me that it is still buggy. I don't want to write my own library to generate machine code. I plan to use nanojit. Regards, Martin
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Quoting Mark Lawrence breamore...@yahoo.co.uk: On 17/07/2012 23:20, Victor Stinner wrote: http://psyco.sourceforge.net/ says: News, 12 March 2012 Psyco is unmaintained and dead. Please look at PyPy for the state-of-the-art in JIT compilers for Python. Victor A search on pypi for JIT compilers gives no matches. I think you misread: PyPy, not pypi. Regards, Martin
Re: [Python-Dev] A new JIT compiler for a faster CPython?
Please, learn from Unladen Swallow and other's experiences, otherwise they're for naught, and frankly we (python-dev) waste a lot of time. Again: we (python-dev) won't waste much time (unless we choose to in discussions); Victor may lose time, but then he may not. Regards, Martin