Re: [Python-Dev] [ANN] superinstructions (VPython 0.1)
Antoine Pitrou <[EMAIL PROTECTED]> writes:

> Hi,
>
> J. Sievers gmail.com> writes:
>>
>> A sequence of code such as LOAD_CONST LOAD_FAST BINARY_ADD will, in
>> CPython, push some constant onto the stack, push some local onto the
>> stack, then pop both off the stack, add them and push the result back
>> onto the stack.
>> Turning this into a superinstruction means inlining LOAD_CONST and
>> LOAD_FAST, modifying them to store the values they'd otherwise push
>> onto the stack in local variables and adding a version of BINARY_ADD
>> which reads its arguments from those local variables rather than the
>> stack (this reduces dispatch time in addition to pops and pushes).
>
> The problem is that this only optimizes code like "x + 1" but not "1 + x" or
> "x + y". To make this generic a first step would be to try to fuse LOAD_CONST
> and LOAD_FAST into a single opcode (and check it doesn't slow down the VM).
> This could be possible by copying the constants table into the start of the
> frame's variables array when the frame is created, so that the LOAD_FAST code
> still does a single indexed array dereference. Since constants are constants,
> they don't need to be copied again when the frame is re-used by a subsequent
> call of the same function (but this would slow down recursive functions a bit,
> since those have to create new frames each time they are called).
>
> Then fusing e.g. LOAD_FAST LOAD_FAST BINARY_ADD into ADD_FAST_FAST would cover
> many more cases than the optimization you are writing about, without any
> explosion in the number of opcodes.
>
> Regards
>
> Antoine.
>

I don't know that I'd call it an explosion. Currently there are ~150
superinstructions in all (problematic when using bytecode but
inconsequential once one is committed to threaded code).

A superinstruction definition in Vmgen, btw, looks as follows:

fcbinary_add = load_fast load_const binary_add

-jakob
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
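The fusion described in this message can be sketched on a toy stack machine. This is only an illustration under stated assumptions: the tuple-based bytecode, the `run` and `fuse` helpers and the `FCBINARY_ADD` opcode name are hypothetical and are not taken from the VPython patch, and the peephole pass ignores jump targets, which a real implementation would have to respect.

```python
# Toy stack VM: each instruction is an (opcode, argument) tuple.
# All names here are hypothetical; this is not VPython's code.
def run(code, consts, fast):
    """Interpret toy bytecode; return (result, number of dispatches)."""
    stack, dispatches = [], 0
    for op, arg in code:
        dispatches += 1
        if op == "LOAD_FAST":
            stack.append(fast[arg])
        elif op == "LOAD_CONST":
            stack.append(consts[arg])
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "FCBINARY_ADD":
            # Superinstruction: both operands are read directly, so only
            # the result touches the value stack and only one dispatch
            # is paid instead of three.
            fast_index, const_index = arg
            stack.append(fast[fast_index] + consts[const_index])
        elif op == "RETURN_VALUE":
            return stack.pop(), dispatches
        else:
            raise ValueError(op)
    raise RuntimeError("fell off the end of the code")


def fuse(code):
    """Peephole pass: LOAD_FAST LOAD_CONST BINARY_ADD -> FCBINARY_ADD."""
    pattern = ["LOAD_FAST", "LOAD_CONST", "BINARY_ADD"]
    out, i = [], 0
    while i < len(code):
        if [op for op, _ in code[i:i + 3]] == pattern:
            out.append(("FCBINARY_ADD", (code[i][1], code[i + 1][1])))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out


code = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 0),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
plain = run(code, consts=[1], fast=[41])
fused = run(fuse(code), consts=[1], fast=[41])
```

Both runs compute the same value; the fused version simply goes through fewer dispatches and fewer pushes and pops, which is the saving the message describes.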
Re: [Python-Dev] [ANN] VPython 0.1
"M.-A. Lemburg" <[EMAIL PROTECTED]> writes:

[snip]
> BTW: I hope you did not use pybench to get profiles of the opcodes.
> That would most certainly result in good results for pybench, but
> less good ones for general applications such as Django or Zope/Plone.

Algorithm used for superinstruction selection:

1) idea: LOAD_CONST/LOAD_FAST + some suffix
2) potential suffixes:
   $ grep '..*(..*--..*)$' ceval.vmg | grep 'a1 a2 --' > INSTRUCTIONS
3) delete any instruction that I felt wouldn't be particularly frequent
   from INSTRUCTIONS (e.g. raise_varargs)
4) use awk to generate superinstruction definitions

Not only is this relatively unbiased but also very low effort.

-jakob
Re: [Python-Dev] [ANN] VPython 0.1
"Daniel Stutzbach" <[EMAIL PROTECTED]> writes:

[snip]
> I searched around for information on how threaded code interacts with
> branch prediction, and here's what I found. The short answer is that
> threaded code significantly improves branch prediction.

See ``Optimizing indirect branch prediction accuracy in virtual machine
interpreters'' and ``The Structure and Performance of Efficient
Interpreters''.

Cheers,
-jakob
Re: [Python-Dev] [ANN] VPython 0.1
On Fri, Oct 24, 2008 at 7:18 AM, Terry Reedy <[EMAIL PROTECTED]> wrote:

> I have not seen any Windows test yet. The direct threading is gcc-specific,
> so there might be degradation with MSVC.

Erlang uses gcc to compile a single source file on Windows and uses MS VC++
to compile all others. They also need the gcc labels-as-values extension,
and the file in question seems to be their bytecode interpreter (beam_emu.c).

- Ralf
Re: [Python-Dev] [ANN] VPython 0.1
Greg Ewing <[EMAIL PROTECTED]> writes:

> Daniel Stutzbach wrote:
>
>> With threaded code, every handler ends with its own dispatcher, so
>> the processor can make fine-grained predictions.
>
> I'm still wondering whether all this stuff makes a
> noticeable difference in real-life Python code, which
> spends most of its time doing expensive things like
> attribute lookups and function calls, rather than
> fiddling with integers in local variables.

Does it? Typically, VM interpreters spend a large percentage of their
cycles in opcode dispatch rather than opcode execution.

Running WITH_TSC compiled Python on some computation-heavy application
(Mercurial maybe?) processing a typical workload would be interesting.

Also, I'd estimate that operations such as string concat and list
append (opcode BINARY_ADD) are fairly common, even in real-life Python
code.

Cheers,
-jakob
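The "fine-grained predictions" argument quoted above can be illustrated with a very small model. This is a sketch under strong assumptions: the predictor is a simple last-target table (one remembered target per branch site), the opcode trace is a made-up loop body, and the `mispredictions` helper is hypothetical. Real hardware predictors are more sophisticated, so the numbers only show the direction of the effect.

```python
def mispredictions(trace, threaded):
    """Count mispredicted dispatch jumps for a last-target predictor."""
    last_target = {}  # branch site -> target it jumped to last time
    misses = 0
    for current, following in zip(trace, trace[1:]):
        # Switch dispatch has one shared indirect jump; threaded code has
        # one jump at the end of every handler, identified here by opcode.
        site = current if threaded else "switch"
        if last_target.get(site) != following:
            misses += 1
        last_target[site] = following
    return misses


# Hypothetical loop body executed 100 times.
loop_body = ["LOAD_FAST", "LOAD_CONST", "BINARY_ADD", "STORE_FAST", "JUMP_ABSOLUTE"]
trace = loop_body * 100

switch_misses = mispredictions(trace, threaded=False)
threaded_misses = mispredictions(trace, threaded=True)
```

In this model the shared jump of a switch-based loop is mispredicted on every dispatch, because consecutive targets always differ, while the per-handler jumps of threaded code only miss the first time each handler is left.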
Re: [Python-Dev] [ANN] VPython 0.1
On 2008-10-24 09:53, J. Sievers wrote:
> "M.-A. Lemburg" <[EMAIL PROTECTED]> writes:
>
> [snip]
>> BTW: I hope you did not use pybench to get profiles of the opcodes.
>> That would most certainly result in good results for pybench, but
>> less good ones for general applications such as Django or Zope/Plone.
>
> Algorithm used for superinstruction selection:
>
> 1) idea: LOAD_CONST/LOAD_FAST + some suffix
> 2) potential suffixes:
>    $ grep '..*(..*--..*)$' ceval.vmg | grep 'a1 a2 --' > INSTRUCTIONS
> 3) delete any instruction that I felt wouldn't be particularly frequent
>    from INSTRUCTIONS (e.g. raise_varargs)
> 4) use awk to generate superinstruction definitions
>
> Not only is this relatively unbiased but also very low effort.

Well, the "I felt wouldn't be particularly frequent" part does
sound a bit biased, but you obviously made good choices ;-)

I thought you used the tracing functions that Vmgen provides for
determining which combinations occur more often.

That's how I worked back then - I instrumented the interpreter
and then let it run for a few days doing whatever I worked on or
with at the time.

I then found that it makes sense to process LOAD_FAST completely
outside the switch statement and to move common opcodes such as
CALL_* to the switch with the most used opcodes. Inlining the code
for calling C functions/methods also made a difference, since
most calls in Python are to C functions/methods.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Oct 24 2008)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/

Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free !

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
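A rough way to see which opcode combinations occur together, in the spirit of the profiling discussed here, is to count adjacent opcode pairs. The sketch below is a swapped-in, much simpler technique: it counts pairs statically in compiled bytecode using the standard `dis` module, whereas the message describes dynamic instrumentation of a running interpreter. The `opcode_pairs` helper and the `sample` function are hypothetical, and the opcode names seen depend on the Python version.

```python
import dis
from collections import Counter


def opcode_pairs(func):
    """Count adjacent opcode-name pairs in a function's compiled bytecode."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    return Counter(zip(names, names[1:]))


def sample(x, y):
    # Hypothetical workload; any function or code object would do.
    total = x + y
    return total * 2


pairs = opcode_pairs(sample)
most_common = pairs.most_common(3)
```

Frequent pairs are candidates for superinstructions; a static count says nothing about how often each pair is actually executed, which is why a dynamic trace over real workloads is the better guide.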
Re: [Python-Dev] [ANN] VPython 0.1
[EMAIL PROTECTED] writes:

> On 23 Oct, 10:42 pm, [EMAIL PROTECTED] wrote:
>>Guido van Rossum wrote:
>>>there already is something else called VPython
>>
>>Perhaps it could be called Fython (Python with a Forth-like VM)
>>or Thython (threaded-code Python).
>
> I feel like I've missed something important, but, why not just call it
> "Python"? ;-)
>
> It's a substantial patch, but from what I understand it's a huge
> performance improvement and completely compatible, both at the C API
> and Python source levels.
>
> Is there any reason this should be a separate project rather than just
> be rolled in to the core? Obviously once issues like the 'Cell' macro
> are dealt with.

While it seems to work reliably, I don't think the current
implementation is really ``core-ready'' as it stands.
I consider it more of a prototype to demonstrate the potential impact
of these kinds of low-level dispatch optimizations (and for me
personally an opportunity to familiarize myself with the CPython VM).

IMO the main issues are:

- Right now, CPython's bytecode is translated to direct threaded code
  lazily (when a code object is first evaluated). This would have to
  be merged into compile.c in some way plus some assorted minor
  changes.
- The various things mentioned in TODO.
- Finally, the core developers probably won't want to depend on Gforth
  (required to run Vmgen), so one might have to re-implement Vmgen
  (not a huge deal; I know that there are a number of unpublished
  versions of Vmgen floating around and IIRC one of them is even
  written in Python; even if not, Vmgen isn't really that difficult to
  implement). Once that's done, however, I'd consider readability to
  have _increased_ if anything (compare the switch statement in
  vanilla Python 2.5.2's ceval.c to the patchset's ceval.vmg).

Note that none of the above are really show stoppers from a user's
point of view. It's about conforming to CPython's standards of
neatness.

Cheers,
-jakob
Re: [Python-Dev] [ANN] superinstructions (VPython 0.1)
Steve Holden holdenweb.com> writes:

> Though it would seem redundant to create multiple copies of constant
> structures. Wouldn't there be some way to optimize this to allow each
> call to access the data from the same place?

It's just copying a table of pointers, so a bare memcpy() or its hand-written
equivalent is enough. Most functions shouldn't have lots of constants.
Here is a small test on some stdlib modules (with 2.5.2):

>>> def consts_per_func(mod):
...   return [len(f.func_code.co_consts) for f in vars(mod).values()
...           if hasattr(f, "func_code")]
...
>>> consts_per_func(logging)
[3, 3, 2, 2, 2, 2, 3, 2, 3, 1, 3, 2, 4, 2, 3, 5, 3, 3, 1, 10, 3]
>>> consts_per_func(os)
[3, 4, 2, 4, 2, 1, 2, 1, 3, 1, 2, 2, 4, 2, 1, 2, 3, 3, 3, 3, 2, 2, 1, 1, 3, 2,
1, 1, 2, 1, 1]
>>> consts_per_func(threading)
[1, 3, 1, 1, 1, 3, 13, 1, 1, 1, 1, 1, 1, 2, 1, 1]
>>> consts_per_func(threading.Thread)
[1, 2, 1, 1, 3, 2, 2, 1, 2, 8, 2, 6, 3, 7, 13]
>>> consts_per_func(heapq)
[2, 3, 2, 2]
>>> consts_per_func(unittest.TestCase)
[2, 3, 2, 4, 2, 3, 4, 4, 3, 3, 4, 2, 3, 1, 1, 2, 1, 3, 4, 2, 2, 3, 2, 4, 2, 4,
3, 4, 2, 2, 2, 2, 4]

Regards

Antoine.
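The session above uses the Python 2 attribute `func_code`. For readers who want to repeat the measurement on Python 3, a minimal sketch follows; it assumes the `__code__` attribute and `inspect.isfunction`, and the `Example` class is a hypothetical stand-in. The counts will differ from the 2.5.2 numbers shown, since both the standard library and the compiler's constant handling have changed.

```python
import heapq
import inspect


def consts_per_func(namespace):
    """Return len(co_consts) for each pure-Python function in namespace."""
    return [
        len(obj.__code__.co_consts)
        for obj in vars(namespace).values()
        if inspect.isfunction(obj)
    ]


class Example:
    # Hypothetical class used only to exercise the helper.
    def method(self):
        return 1


module_counts = consts_per_func(heapq)
class_counts = consts_per_func(Example)
```

Functions implemented in C (for example the accelerated `heapq` functions) have no `__code__` and are skipped, just as the `hasattr` check skipped them in the original.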
Re: [Python-Dev] [ANN] VPython 0.1
Greg Ewing wrote:
> [EMAIL PROTECTED] wrote:
>
>> Is there any reason this should be a separate project rather than just
>> be rolled in to the core?
>
> Always keep in mind that one of the important characteristics
> of CPython is that its implementation is very straightforward
> and easy to follow. Replacing the ceval loop with machine
> generated code would be a step away from that.

Funny to hear that from the author of a well-known code generator. ;-)

I haven't looked at the specific Vmgen code in question, but I tend to find
a short DSL description of repetitive functionality much more straightforward
than the same thing implemented in custom, hand-optimised code in a general
purpose language like C.

Just think of the switch split that MAL described in one of his comments.
Having two switch statements and a couple of separate special cases for a
single eval loop might look pretty arbitrary and not straightforward at all
to a reader who doesn't have enough background regarding the performance
characteristics of Python's VM statements.

Stefan
Re: [Python-Dev] [ANN] VPython 0.1
    Guido> This is very interesting (at this point I'm just lurking), but
    Guido> has anyone pointed out yet that there already is something else
    Guido> called VPython, which has a long standing "right" to the name?

I believe Jakob has already been notified about this. How about TPython? A
quick google-check suggests that while there is at least one instance of
that name in use as related to Python, it seems to be fairly obscure and is
perhaps only used internally at CERN.

Skip
Re: [Python-Dev] [ANN] VPython 0.1
    Terry> I have not seen any Windows test yet. The direct threading is
    Terry> gcc-specific, so there might be degradation with MSVC.

Not if a compiler #ifdef selects between two independent choices:

    #ifdef __GNUC__  /* or whatever the right incantation is */
    #include "ceval-threaded.c"
    #else
    #include "ceval-switched.c"
    #endif

and so on...

BTW, as to the implementation of individual VM instructions I don't believe
the Vmgen stuff affects that. It's just the way the instructions are
assembled.

Skip
Re: [Python-Dev] No manifest files on Windows?
Mark Hammond wrote:
>> In http://bugs.python.org/issue4120, the author suggests that it might
>> be possible to completely stop using the manifest mechanism, for VS
>> 2008. Given the many problems that this SxS stuff has caused, this
>> sounds like a very desirable route, although I haven't done any actual
>> testing yet.
>>
>> Can all the Windows experts please comment? Could that work? Does it
>> have any downsides?
>>
>> If it works, I would like to apply it to 3.0, although I probably
>> won't be able to apply it to tomorrow's rc. Would it also be possible
>> to change that in 2.6.1 (even though python26.dll in 2.6.0 already
>> includes a manifest, as do all the pyd files)?
>
> My take is that the bug is suggesting the manifest be dropped only from .pyd
> files. Python's executable and DLL will still have the manifest reference,
> so the CRT is still loaded from a manifest (FWIW, the CRT will abort() if
> initialized other than via a manifest!).

What about COM objects: isn't pythoncom26.dll or _ctypes.pyd the executable
image that is loaded first for them? And how would they load the CRT?

-- 
Thanks,
Thomas
Re: [Python-Dev] [ANN] VPython 0.1
At 10:47 AM 10/24/2008 +0200, J. Sievers wrote:
> - Right now, CPython's bytecode is translated to direct threaded code
>   lazily (when a code object is first evaluated). This would have to
>   be merged into compile.c in some way plus some assorted minor
>   changes.

Don't you mean codeobject.c? I don't see how the compiler relates, as
Python programs can generate or transform bytecode. (For example,
Zope's Python sandboxing works that way.)
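The lazy translation being discussed can be pictured as a cache keyed on the code object, filled the first time the object is about to be evaluated. The sketch below is only a model under stated assumptions: `translate` is a hypothetical stand-in that merely copies the raw bytecode, and `threaded_code` and the cache are illustrative names, not part of CPython or the patch. It shows why hooking the code object covers code produced by any route, which is the point raised in this message.

```python
calls = []   # records which code objects were actually translated
_cache = {}  # code object -> translated form


def translate(code):
    """Hypothetical stand-in for the bytecode -> threaded-code translation."""
    calls.append(code.co_name)
    return tuple(code.co_code)


def threaded_code(code):
    """Translate on first use and reuse the result afterwards."""
    if code not in _cache:
        _cache[code] = translate(code)
    return _cache[code]


def hand_made():
    return 1


# One code object from compile(), one taken from an existing function.
compiled = compile("x = 1", "<example>", "exec")
first = threaded_code(compiled)
second = threaded_code(compiled)
other = threaded_code(hand_made.__code__)
```

Because the lookup happens per code object, it does not matter whether the bytecode came from the compiler or from a tool that builds or rewrites code objects directly.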
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (10/17/08 - 10/24/08)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue
number. Do NOT respond to this message.

 2124 open (+32) / 13891 closed (+20) / 16015 total (+52)

Open issues with patches:   700

Average duration of open issues: 713 days.
Median duration of open issues: 1918 days.

Open Issues Breakdown
   open  2108 (+32)
pending    16 ( +0)

Issues Created Or Reopened (54)
_______________________________

Memory leak in itertools.chain()                                 10/20/08
CLOSED http://bugs.python.org/issue2231    reopened loewis
       patch

itertools.groupby() leaks memory with circular reference         10/20/08
CLOSED http://bugs.python.org/issue2246    reopened loewis
       patch

Dis docs on CALL_FUNCTION a bit unclear                          10/17/08
CLOSED http://bugs.python.org/issue4141    created  novalis_dt
       patch

smtplib doesn't clear helo/ehlo flags on quit                    10/17/08
       http://bugs.python.org/issue4142    created  dwig

ast.Node objects do not have column number information           10/18/08
CLOSED http://bugs.python.org/issue4143    created  kevinwatters

3 tutorial documentation errors                                  10/18/08
       http://bugs.python.org/issue4144    created  LambertDW

tabulary entries in PDF documentation                            10/19/08
       http://bugs.python.org/issue4145    created  wplappert

compilation of Modules/python.c fails on OpenBSD                 10/19/08
CLOSED http://bugs.python.org/issue4146    created  djmdjm
       patch, needs review

xml.dom.minidom toprettyxml: omit whitespace for text-only eleme 10/19/08
       http://bugs.python.org/issue4147    created  thomas.lee
       patch

Using blender                                                    10/20/08
CLOSED http://bugs.python.org/issue4148    created  mshee

Py_BuildValue and "y" format unit                                10/20/08
CLOSED http://bugs.python.org/issue4149    created  exe

pdb "up" command fails in generator frames                       10/20/08
CLOSED http://bugs.python.org/issue4150    created  arigo
       patch, patch

Separate build dir broken                                        10/20/08
       http://bugs.python.org/issue4151    created  nas
       patch

ihooks module cannot handle absolute imports                     10/20/08
       http://bugs.python.org/issue4152    created  nas

Unicode HOWTO up to date?                                        10/20/08
       http://bugs.python.org/issue4153    created  tjreedy

More doc trivia                                                  10/20/08
CLOSED http://bugs.python.org/issue4154    created  LambertDW

Wrong math calculation                                           10/20/08
CLOSED http://bugs.python.org/issue4155    created  bubersson

Docs for BaseHandler.protocol_xxx methods are unclear            10/21/08
       http://bugs.python.org/issue4156    created  kjohnson

Tuple not callable in platform.py                                10/21/08
CLOSED http://bugs.python.org/issue4157    created  Feite

compilation of sqlite3 fails
Re: [Python-Dev] [ANN] VPython 0.1
[EMAIL PROTECTED] writes:

> BTW, as to the implementation of individual VM instructions I don't believe
> the Vmgen stuff affects that. It's just the way the instructions are
> assembled.

Vmgen handles the pushing and popping as well. E.g. ROT_THREE becomes:

rot_three ( a1 a2 a3 -- a3 a1 a2 )

BINARY_POWER is:

binary_power ( a1 a2 -- a   dec:a1 dec:a2 next:a )
a = PyNumber_Power(a1, a2, Py_None);

(Here I have abused Vmgen a bit by declaring, in addition to the actual
value stack, some dummy stacks with different stack prefixes and using
the ``push'' instructions generated for those to do reference counting.)

I should mention that some of the more involved instructions have no
declared effect (i.e. ( -- ) ) with stack manipulation still being
done by hand.

Cheers,
-jakob
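To make the stack-effect notation concrete, here is a minimal sketch that interprets a declaration such as `( a1 a2 a3 -- a3 a1 a2 )` against a Python list used as a stack. It is an illustration only: `apply_stack_effect` is a hypothetical helper, it handles pure reorderings of the inputs and nothing else (no C body, no prefixed stacks such as `dec:` or `next:`), and it is not how Vmgen generates code.

```python
def apply_stack_effect(effect, stack):
    """Apply a Forth-style effect such as "( a1 a2 a3 -- a3 a1 a2 )" in place."""
    before, after = effect.strip("() ").split("--")
    inputs, outputs = before.split(), after.split()
    if len(stack) < len(inputs):
        raise IndexError("stack underflow")
    # The rightmost name in each half of the effect is the top of the stack.
    start = len(stack) - len(inputs)
    values = dict(zip(inputs, stack[start:]))
    del stack[start:]
    stack.extend(values[name] for name in outputs)
    return stack


# ROT_THREE: the top item moves down two positions.
stack = ["bottom", 1, 2, 3]
apply_stack_effect("( a1 a2 a3 -- a3 a1 a2 )", stack)
```

The declaration alone determines all the pops and pushes here, which is the sense in which the generator "handles the pushing and popping".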
Re: [Python-Dev] [ANN] VPython 0.1
[EMAIL PROTECTED] writes:

> Guido> This is very interesting (at this point I'm just lurking), but
> Guido> has anyone pointed out yet that there already is something else
> Guido> called VPython, which has a long standing "right" to the name?
>
> I believe Jakob has already been notified about this. How about TPython? A
> quick google-check suggests that while there is at least one instance of
> that name in use as related to Python, it seems to be fairly obscure and is
> perhaps only used internally at CERN.
>

TPython it is!

Cheers,
-jakob
Re: [Python-Dev] No manifest files on Windows?
> Mark Hammond wrote:
> >> In http://bugs.python.org/issue4120, the author suggests that it might
> >> be possible to completely stop using the manifest mechanism, for VS
> >> 2008. Given the many problems that this SxS stuff has caused, this
> >> sounds like a very desirable route, although I haven't done any actual
> >> testing yet.
> >>
> >> Can all the Windows experts please comment? Could that work? Does it
> >> have any downsides?
> >>
> >> If it works, I would like to apply it to 3.0, although I probably
> >> won't be able to apply it to tomorrow's rc. Would it also be possible
> >> to change that in 2.6.1 (even though python26.dll in 2.6.0 already
> >> includes a manifest, as do all the pyd files)?
> >
> > My take is that the bug is suggesting the manifest be dropped only from
> > .pyd files. Python's executable and DLL will still have the manifest
> > reference, so the CRT is still loaded from a manifest (FWIW, the CRT will
> > abort() if initialized other than via a manifest!).
>
> What about COM objects: isn't pythoncom26.dll or _ctypes.pyd the first
> executable image that is loaded first for them? And how would they load
> the crt?

Yeah - I don't think the manifest could be dropped from these files.
pythoncom is already loaded magically, but it would make sense to ensure the
patch is set up such that an extension can still request a manifest as normal
for the special cases when it is needed.

I think the vast majority of .pyd files will not need the manifest... But I'm
surprised it hurts! I'm surprised that if a .pyd references an assembly
already loaded into the process as a private assembly from another directory,
the load will fail unless there is *another* copy of the private assembly
next to the .pyd (the manifest reference is always a "strong" reference
including versions and hashes, so there is no ambiguity), but at this stage
I'm taking it on faith that the bug as reported does actually exist - I've
only ever tested with shared assemblies.

Cheers,

Mark
Re: [Python-Dev] [ANN] VPython 0.1
Stefan Behnel wrote:

> Funny to hear that from the author of a well-known code generator. ;-)

I've never claimed that anything about the implementation
of Pyrex is easy to follow. :-)

> Having two switch statements and a couple of separate special cases for a
> single eval loop might look pretty arbitrary and not straightforward at all
> to a reader who doesn't have enough background regarding the performance
> characteristics of Python's VM statements.

Maybe not, but at least you can follow what it's doing
just by knowing C. Introducing Vmgen would introduce another
layer for the reader to learn about.

I'm not saying this is a bad enough problem to stop it
being done, just that it's something to consider that
isn't necessarily on the positive side.

-- 
Greg
