[issue4753] Faster opcode dispatch on gcc

2015-06-02 Thread David Bolen
David Bolen added the comment: Oops, sorry, I had just followed the commit comment to this issue. For the record here, it looks like Benjamin has committed an update (5e8fa1b13516) that resolves the problem. -- ___ Python tracker

[issue4753] Faster opcode dispatch on gcc

2015-06-01 Thread David Bolen
David Bolen added the comment: The 2.7 back-ported version of this patch appears to have broken compilation on the Windows XP buildbot, during the OpenSSL build process, when the newly built Python is used to execute the build_ssl.py script. After this patch, when that stage executes, and

[issue4753] Faster opcode dispatch on gcc

2015-06-01 Thread David Bolen
David Bolen added the comment: I ran a few more tests, and the generated executable hangs in both release and debug builds. The closest I can get at the moment is that it's stuck importing errno from the "import sys, errno" line in os.py - at least no matter how long I wait after starting a

[issue4753] Faster opcode dispatch on gcc

2015-06-01 Thread R. David Murray
R. David Murray added the comment: Please open a new issue with the details about your problem. -- nosy: +r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2015-05-28 Thread Roundup Robot
Roundup Robot added the comment: New changeset 17d3bbde60d2 by Benjamin Peterson in branch '2.7': backport computed gotos (#4753) https://hg.python.org/cpython/rev/17d3bbde60d2 -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org

[issue4753] Faster opcode dispatch on gcc

2015-05-27 Thread Robert Collins
Robert Collins added the comment: FWIW I'm interested and willing to poke at this if more testers/reviewers are needed. -- nosy: +rbcollins ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753

[issue4753] Faster opcode dispatch on gcc

2015-05-27 Thread Ned Deily
Ned Deily added the comment: @Vamsi, could you please open a new issue and attach your patch there so it can be properly tracked for 2.7? This issue has been closed for five years and the code has been out in the field for a long time in Python 3. Thanks! -- nosy: +ned.deily

[issue4753] Faster opcode dispatch on gcc

2015-05-27 Thread Srinivas Vamsi Parasa
Srinivas Vamsi Parasa added the comment: Hi All, This is Vamsi from Server Scripting Languages Optimization team at Intel Corporation. Would like to submit a request to enable the computed goto based dispatch in Python 2.x (which happens to be enabled by default in Python 3 given its

[issue4753] Faster opcode dispatch on gcc

2010-07-19 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: This is too late for 2.x now, closing. -- resolution: accepted -> fixed status: open -> closed

[issue4753] Faster opcode dispatch on gcc

2010-05-20 Thread Skip Montanaro
Changes by Skip Montanaro s...@pobox.com: -- nosy: -skip.montanaro ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list

[issue4753] Faster opcode dispatch on gcc

2009-07-18 Thread Michele Dionisio
Michele Dionisio michele.dioni...@gmail.com added the comment: I have patched the code of Python 3.1 to use the computed goto technique with Visual Studio as well. The performance result is not good (I really don't know why), but it is a good workaround for using computed gotos on Windows too. The only

[issue4753] Faster opcode dispatch on gcc

2009-07-02 Thread Jesús Cea Avión
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list mailing list

[issue4753] Faster opcode dispatch on gcc

2009-04-11 Thread Mark Dickinson
Changes by Mark Dickinson dicki...@gmail.com: -- nosy: -marketdickinson ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-04-11 Thread Alexandre Vassalotti
Changes by Alexandre Vassalotti alexan...@peadrop.com: -- nosy: -alexandre.vassalotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-04-09 Thread Andrew I MacIntyre
Andrew I MacIntyre aimacint...@users.sourceforge.net added the comment: Antoine, in my testing the loss of the HAS_ARG() optimisation in my patch appears to have negligible cost on i386, but starts to look significant on amd64. On an Intel E8200 cpu running FreeBSD 7.1 amd64, with gcc 4.2.1 and

[issue4753] Faster opcode dispatch on gcc

2009-03-31 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: On 2009-03-31 03:19, A.M. Kuchling wrote:
> A.M. Kuchling li...@amk.ca added the comment: Is a backport to 2.7 still planned?
I hope it is.

[issue4753] Faster opcode dispatch on gcc

2009-03-31 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Andrew, your patch disables the optimization that HAS_ARG(op) is a constant when op is known by the compiler (that is, inside a TARGET_##op label), so I'd rather keep the version which is currently in SVN. -- versions: -Python 3.1

[issue4753] Faster opcode dispatch on gcc

2009-03-30 Thread A.M. Kuchling
A.M. Kuchling li...@amk.ca added the comment: Is a backport to 2.7 still planned? -- nosy: +akuchling ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-03-22 Thread Andrew I MacIntyre
Andrew I MacIntyre aimacint...@users.sourceforge.net added the comment: Out of interest, the attached patch against the py3k branch at r70516 cleans up the threaded code changes a little:
- gets rid of TARGET_WITH_IMPL macro;
- TARGET(op) is followed by a colon, so that it looks like a label

[issue4753] Faster opcode dispatch on gcc

2009-02-20 Thread Joshua Bronson
Changes by Joshua Bronson jabron...@gmail.com: -- nosy: +jab ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list mailing

[issue4753] Faster opcode dispatch on gcc

2009-02-07 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Skip, removing the colon doesn't work if the macro adds code after the colon :) ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-02-07 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment:
Antoine> Skip, removing the colon doesn't work if the macro adds code
Antoine> after the colon :)
When I looked I thought both TARGET and TARGET_WITH_IMPL ended with a colon, but I see that's not the case. How about removing TARGET_WITH_IMPL

[issue4753] Faster opcode dispatch on gcc

2009-02-04 Thread Gabriel Genellina
Gabriel Genellina gagsl-...@yahoo.com.ar added the comment:
> Might I suggest that the TARGET and TARGET_WITH_IMPL macros not include the trailing colon?
Yes, please! -- nosy: +gagenellina

[issue4753] Faster opcode dispatch on gcc

2009-02-03 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment: This has been checked in, right? Might I suggest that the TARGET and TARGET_WITH_IMPL macros not include the trailing colon? I think that will make it more friendly toward smart editors such as Emacs' C mode. I definitely get better indentation

[issue4753] Faster opcode dispatch on gcc

2009-01-31 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: Square brackets added in r69133. The gentoo x86 3.x buildbot seems to be passing the compile stage now. (Though not the test stage, of course: one can't have everything!) ___ Python tracker

[issue4753] Faster opcode dispatch on gcc

2009-01-31 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> Square brackets added in r69133. The gentoo x86 3.x buildbot seems to be passing the compile stage now. (Though not the test stage, of course: one can't have everything!)
The test failure also happens on trunk, it may be related to the

[issue4753] Faster opcode dispatch on gcc

2009-01-31 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment:
> The test failure also happens on trunk, it may be related to the recent tk changes.
Yes; sorry---I didn't mean to suggest that the test failures were in any way related to the opcode dispatch stuff. Apart from the ttk teething

[issue4753] Faster opcode dispatch on gcc

2009-01-30 Thread Kevin Watters
Kevin Watters kevinwatt...@gmail.com added the comment: Does anyone know the equivalent ICC command line option for GCC's -fno-gcse? (Or if it has one?) I can't find a related option in the docs. It looks like ICC hits the same combining goto problem, as was mentioned: without changing any

[issue4753] Faster opcode dispatch on gcc

2009-01-30 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: The x86 gentoo buildbot is failing to compile, with error:
/Python/makeopcodetargets.py ./Python/opcode_targets.h
  File "./Python/makeopcodetargets.py", line 28
    f.write(",\n".join("\t%s" % s for s in targets))

[issue4753] Faster opcode dispatch on gcc

2009-01-30 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: One other thought: it seems that as a result of this change, the py3k build process now depends on having some version of Python already installed; before this, it didn't. Is this true, or am I misinterpreting something? Might it be

[issue4753] Faster opcode dispatch on gcc

2009-01-30 Thread Mark Dickinson
Mark Dickinson dicki...@gmail.com added the comment: Sorry: ignore that last. Python/opcode_targets.h is already part of the distribution. I don't know what I was doing wrong. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753

[issue4753] Faster opcode dispatch on gcc

2009-01-30 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Mark:
> Are there any objections to me adding a couple of square brackets to this line to turn the argument of join into a list comprehension?
No problems for me. You might also add to the top comments of the file that it is 2.3-compatible.
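
To make the square-bracket fix concrete, here is a minimal sketch. Generator expressions were added in Python 2.4, so a script that has to run under 2.3 needs a list comprehension inside join(). The label names and the "\t&&%s" format are invented for this example and are not taken from makeopcodetargets.py.

```python
# Hypothetical label names; the real script builds its list from the opcode table.
targets = ["TARGET_NOP", "TARGET_POP_TOP", "_unknown_opcode"]


def format_targets(names):
    # List comprehension: valid syntax on Python 2.3 and later.
    return ",\n".join(["\t&&%s" % name for name in names])


def format_targets_genexp(names):
    # Generator expression: a SyntaxError on interpreters older than 2.4,
    # which is the failure shape reported for the buildbot.
    return ",\n".join("\t&&%s" % name for name in names)


print(format_targets(targets))
```

On any interpreter that accepts both forms, the two functions return the same string, so the bracketed version costs nothing in output.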

[issue4753] Faster opcode dispatch on gcc

2009-01-28 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: For the record, I've compiled py3k on an embarrassingly fast Core2-based server (Xeon E5410), and the computed gotos option gives a 16% speedup on pybench and pystone. (with gcc 4.3.2 in 64-bit mode)
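
For readers new to the topic, a loose analogy may help. Computed gotos are a C-level compiler extension and cannot be written in Python, so the sketch below only contrasts the two shapes of dispatch: one central chain of comparisons versus a table indexed by opcode number. The toy opcodes and both functions are invented for illustration, and nothing here reproduces the branch-prediction effect that the patch is about.

```python
# Toy stack machine; opcode numbers and names are invented for this sketch.
PUSH, ADD, MUL, HALT = range(4)


def run_switch(code):
    """Dispatch through one central if/elif chain (the 'switch' shape)."""
    stack, pc = [], 0
    while True:
        op, arg = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError("unknown opcode %r" % (op,))


def run_table(code):
    """Dispatch by indexing a handler table with the opcode number."""
    stack = []

    def op_push(arg):
        stack.append(arg)

    def op_add(arg):
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)

    def op_mul(arg):
        b, a = stack.pop(), stack.pop()
        stack.append(a * b)

    handlers = [op_push, op_add, op_mul]  # position matches the opcode number
    pc = 0
    while True:
        op, arg = code[pc]
        pc += 1
        if op == HALT:
            return stack.pop()
        handlers[op](arg)


# (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)]
print(run_switch(program), run_table(program))
```

Both interpreters compute the same result; only the way the next handler is selected differs.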

[issue4753] Faster opcode dispatch on gcc

2009-01-27 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: I'll take on the two remaining tasks for this:
* add configure magic to detect when the compiler supports this so that it can default to --with-computed-gotos on modern systems.
* commit the back port to 2.7 trunk.
-- assignee: -

[issue4753] Faster opcode dispatch on gcc

2009-01-26 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: -fno-gcse is controversial. Even if it might avoid jumps sharing, the impact of that option has to be measured, since common subexpression elimination allows omitting some recalculations, so disabling global CSE might have a

[issue4753] Faster opcode dispatch on gcc

2009-01-26 Thread Kevin Watters
Changes by Kevin Watters kevinwatt...@gmail.com: -- nosy: +kevinwatters ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-01-25 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Committed in py3k in r68924. I won't backport it to trunk myself but it should be easy enough, provided people are interested. -- resolution: -> accepted stage: patch review -> committed/rejected status: open -> pending versions: -Python

[issue4753] Faster opcode dispatch on gcc

2009-01-24 Thread Jeffrey Yasskin
Jeffrey Yasskin jyass...@gmail.com added the comment: In the comment, you might mention both -fno-crossjumping and -fno-gcse. -fno-crossjumping's description looks like it ought to prevent combining computed gotos, but http://gcc.gnu.org/onlinedocs/gcc-4.3.3/gcc/Optimize-Options.html says

[issue4753] Faster opcode dispatch on gcc

2009-01-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Here is an updated patch with a dedicated configure option (--with-computed-gotos, disabled by default), rather than a compiler detection switch. (sorry, the patch is very long because it seems running autoconf changes a lot of things in the

[issue4753] Faster opcode dispatch on gcc

2009-01-16 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment:
Antoine> (sorry, the patch is very long because it seems running
Antoine> autoconf changes a lot of things in the configure script)
Normal practice is to not include the configure script in such patches and indicate to people that they will

[issue4753] Faster opcode dispatch on gcc

2009-01-16 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Thanks Skip, it makes sense... so here is a patch without the configure script. (I wonder however if those huge configure changes, when checked into the SVN, could break something silently somewhere) Added file:

[issue4753] Faster opcode dispatch on gcc

2009-01-16 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: Removed file: http://bugs.python.org/file12767/threadedceval6.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-01-13 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: As for superinstructions, you can find an example here: #4715. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-01-12 Thread Jeffrey Yasskin
Jeffrey Yasskin jyass...@gmail.com added the comment: Here's the vmgen-based patch for comparison. Again, it passes all the tests, but isn't complete outside of that and (unless consensus develops that a couple percent is worth requiring vmgen) shouldn't distract from reviewing Antoine's patch.

[issue4753] Faster opcode dispatch on gcc

2009-01-12 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: A couple percent maybe is not worth vmgen-ing. But even if I'm not a vmgen expert, I read many papers from Ertl about superinstructions and replication, so the expected speedup from vmgen'ing is much bigger. Is there some

[issue4753] Faster opcode dispatch on gcc

2009-01-12 Thread Jeffrey Yasskin
Jeffrey Yasskin jyass...@gmail.com added the comment: I've left some line-by-line comments at http://codereview.appspot.com/11905. Sorry if there was already a Rietveld thread; I didn't see one. ___ Python tracker rep...@bugs.python.org

[issue4753] Faster opcode dispatch on gcc

2009-01-12 Thread Jeffrey Yasskin
Jeffrey Yasskin jyass...@gmail.com added the comment: @Paolo: I'm going to be looking into converting more common sequences into superinstructions. We only have LOAD_CONST+XXX so far. The others are difficult because vmgen doesn't provide easy ways to deal with error handling, but Jakob and I

[issue4753] Faster opcode dispatch on gcc

2009-01-12 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: Ok, then vmgen adds almost just direct threading instead of indirect threading. Since the purpose of superinstructions is to eliminate dispatch overhead, and that's more important when little actual work is done, what about

[issue4753] Faster opcode dispatch on gcc

2009-01-11 Thread Andrew Bennetts
Changes by Andrew Bennetts s...@users.sourceforge.net: -- nosy: +spiv ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list

[issue4753] Faster opcode dispatch on gcc

2009-01-11 Thread Jeffrey Yasskin
Jeffrey Yasskin jyass...@gmail.com added the comment: Here's a port of threadedceval5.patch to trunk. It passes the tests. I haven't benchmarked this exact patch, but on one Intel Core2, a similar patch got an 11%-14% speedup (on 2to3 and pybench). I've also cleaned up Jakob Sievers' vmgen

[issue4753] Faster opcode dispatch on gcc

2009-01-11 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: Benchmarking pitrou_dispatch_2.7.patch applied to trunk r68522 on a 32-bit Efficeon (x86) using gcc 4.2.4-1ubuntu3 yields a 10% pybench speedup.

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: The standing question is still: can we get ICC to produce the expected output? It looks like we still didn't manage, and since ICC is the best compiler out there, this matters. Some problems with SunCC, even if it doesn't

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> @pitrou: The machine I got the 15% speedup on is in 64-bit mode with gcc 4.3.2. Which is the processor? I guess the bigger speedups should be on Pentium4, since it has the bigger mispredict penalties.
Athlon X2 3600+.

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> It looks like we still didn't manage, and since ICC is the best compiler out there, this matters.
Well, from the perspective of Python, what matters mostly is the commonly used compilers (that is, gcc and MSVC). I doubt many people compile

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: On 2009-01-10 10:55, Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: It looks like we still didn't manage, and since ICC is the best compiler out there, this matters. Well, from the perspective of Python, what

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment:
> Same for CPU-specific tuning: I don't think we want to ship Python with compiler flags which depend on the particular CPU being used.
I wasn't suggesting this - but since different CPUs have different optimization rules,

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> (First culprit might be license/compatibility problems I guess, but the speedup would be worth the time to fix the troubles IMHO).
That would be the obvious reason IMO. And Intel is the only one who can fix the troubles.

[issue4753] Faster opcode dispatch on gcc

2009-01-10 Thread Alexander Belopolsky
Changes by Alexander Belopolsky belopol...@users.sourceforge.net: -- nosy: +belopolsky ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-01-09 Thread Daniel Diniz
Daniel Diniz aja...@gmail.com added the comment: Paolo, Applying your patches makes no difference with gcc 4.2 and gives a barely noticeable (~2%) slowdown with icc. These results are from a Celeron M 410 (Core Solo Yonah-based), so it's a rather old platform to run benchmarks on.

[issue4753] Faster opcode dispatch on gcc

2009-01-09 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @ajaksu2
> Applying your patches makes no difference with gcc 4.2 and gives a barely noticeable (~2%) slowdown with icc.
"Your patches" is something quite unclear :-) Which are the patch sets you are comparing? And on 32 or

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Paolo 'Blaisorblade' Giarrusso
Changes by Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com: Added file: http://bugs.python.org/file12634/restore-old-oparg-load.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Paolo 'Blaisorblade' Giarrusso
Changes by Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com: Added file: http://bugs.python.org/file12633/abstract-switch-reduced.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: I finally implemented my suggestion for the switch elimination. On top of threadedceval5.patch, apply abstract-switch-reduced.diff and then restore-old-oparg-load.diff to test it. This way, only computed goto's are used. I

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment:
Paolo> Various techniques allow to create binary code from the
Paolo> interpreter binary, by just pasting together the code for the
Paolo> common interpreters cases and producing calls to the other. But,
Paolo> guess what, on most

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @skip: In simple words, the x86 call: call 0x2000 placed at address 0x1000 becomes: call %rip + 0x1000 RIP holds the instruction pointer, which will be 0x1000 in this case (actually, I'm ignoring the detail that when
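
The arithmetic in this comment can be sketched in a few lines. The helper name is invented, and since the comment is cut off, the second call rests on an assumption about the detail it was about to mention: on x86 the displacement of a relative call is measured from the address of the next instruction, and a near call with a 32-bit displacement is 5 bytes long.

```python
CALL_REL32_LENGTH = 5  # opcode byte 0xE8 plus a 4-byte displacement


def call_displacement(target, call_address, instruction_length=0):
    """Return the offset a relative call at call_address needs to reach target.

    instruction_length=0 reproduces the simplified model in the comment,
    where RIP is taken to be the address of the call itself.
    """
    return target - (call_address + instruction_length)


# Simplified model from the comment: call 0x2000 placed at 0x1000.
print(hex(call_displacement(0x2000, 0x1000)))
# Assumed refinement: RIP already points past the 5-byte call instruction.
print(hex(call_displacement(0x2000, 0x1000, CALL_REL32_LENGTH)))
```

The first line prints 0x1000, matching the comment; the second prints 0xffb. Either way the encoded value depends on where the instruction sits, which is why copied code containing relative calls cannot simply be pasted elsewhere.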

[issue4753] Faster opcode dispatch on gcc

2009-01-07 Thread Gregory P. Smith
Changes by Gregory P. Smith g...@krypto.org: -- nosy: +gregory.p.smith ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list

[issue4753] Faster opcode dispatch on gcc

2009-01-06 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: FWIW, I have made a quick attempt at removing the f->f_lasti assignment in the few places where it could be removed, but it didn't make a difference on my machine. The problem being that there are very few places where it is legitimate to remove

[issue4753] Faster opcode dispatch on gcc

2009-01-06 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @pitrou: ranting mode Argh, reference counting hinders even that? /ranting mode I just discovered another problem caused by refcounting. Various techniques allow to create binary code from the interpreter binary, by just

[issue4753] Faster opcode dispatch on gcc

2009-01-05 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: On Monday 5 January 2009 at 02:39 +0000, Paolo 'Blaisorblade' Giarrusso wrote:
> About f->last_i, when I have time I want to try optimizing it. Somewhere you can be sure it's not going to be used.
There are lots of places which can call into

[issue4753] Faster opcode dispatch on gcc

2009-01-05 Thread Jeffrey Yasskin
Changes by Jeffrey Yasskin jyass...@gmail.com: -- nosy: +collinwinter, jyasskin ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @Alexandre: So, can you try dropping the switch altogether, using always computed goto and seeing how does the resulting code get compiled? Removing the switch won't be possible unless we change the semantic

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @Skip: if one decides to generate binary code, there is no need to use switches. Inline threading (also known as code copying in some research papers) is what you are probably looking for:

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Ralph Corderoy
Changes by Ralph Corderoy ralph-pythonb...@inputplus.co.uk: ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list mailing list

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment: I'm sure this is the wrong place to bring this up, but I had a thought about simple JIT compilation coupled with the opcode dispatch changes in this issue. Consider this silly function: def f(a, b): ... result = 0 ... while b:

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Alexandre Vassalotti
Alexandre Vassalotti alexan...@peadrop.com added the comment:
> Removing the switch won't be possible unless we change the semantics of EXTENDED_ARG. In addition, I doubt the improvement, if any, would be worth the increased complexity.
Never mind what I have said. I managed to remove the switch pretty

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Alexandre Vassalotti
Alexandre Vassalotti alexan...@peadrop.com added the comment: I managed to remove the switch pretty easily by moving opcode fetching into the FAST_DISPATCH macro and abstracting the control flow of the switch. Here is the diff against threadedceval5.patch. Added file:

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Facundo Batista
Changes by Facundo Batista facu...@taniquetil.com.ar: -- nosy: +facundobatista ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Ralph Corderoy
Ralph Corderoy ralph-pythonb...@inputplus.co.uk added the comment: Regarding compressing the opcode table to make better use of cache: what if the most frequently occurring opcodes were placed together, e.g. the opcodes were ordered by frequency, most frequent first? Just based on a one-off
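
As a rough illustration of how such a frequency ordering could be estimated, the sketch below counts opcode names in one function's bytecode using the standard dis module (Python 3.4 or later). These are static counts of what appears in the code object, not counts of what actually executes, and the sample function is a made-up workload rather than anything from this thread.

```python
import collections
import dis


def opcode_frequencies(func):
    """Count how often each opcode name appears in func's bytecode."""
    return collections.Counter(ins.opname for ins in dis.get_instructions(func))


def sample(a, b):
    # Hypothetical workload, present only to provide some bytecode to count.
    result = 0
    while b:
        result += a
        b -= 1
    return result


# Most frequent opcode first; exact names and counts vary by Python version.
for name, count in opcode_frequencies(sample).most_common():
    print("%-24s %d" % (name, count))
```

A realistic ordering would need counts gathered over a large body of code, or from a running interpreter, rather than from a single function.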

[issue4753] Faster opcode dispatch on gcc

2009-01-04 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: @alexandre: if you add two labels per opcode and two dispatch tables, one before (like now) and one after the parameter fetch (where we have the 'case'), you can keep the same speed. And under the hood we also had two

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: I'm not an expert in this kind of optimizations. Could we gain more speed by making the dispatcher table more dense? Python has less than 128 opcodes (len(opcode.opmap) == 113) so they can be squeezed in a smaller table. I naively assume a
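
The density question can be checked against whatever interpreter is at hand. This is a small sketch using the standard opcode module; the helper name is invented. The figure of 113 opcodes quoted above applied to the py3k of the time, and the numbers printed here will differ between Python versions, since the contents of opcode.opmap are version-dependent.

```python
import opcode


def table_density(opmap):
    """Fraction of slots used in a table indexed directly by opcode number."""
    used = set(opmap.values())
    return len(used) / float(max(used) + 1)


print("opcodes defined:      ", len(opcode.opmap))
print("highest opcode number:", max(opcode.opmap.values()))
print("table density:         %.2f" % table_density(opcode.opmap))
```

A density well below 1.0 means a directly indexed dispatch table carries unused slots, which is the gap the question is asking about.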

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread djc
Changes by djc dirk...@ochtman.nl: -- nosy: +djc ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list mailing list

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Yann Ramin
Changes by Yann Ramin at...@stackworks.net: -- nosy: +theatrus ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___ Python-bugs-list mailing

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Benoit Boissinot
Changes by Benoit Boissinot bboissin+pythonb...@gmail.com: -- nosy: +bboissin ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___ ___

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Daniel Diniz
Daniel Diniz aja...@gmail.com added the comment: IIUC, this is what gcc 4.2.4 generates on a Celeron M for the code Alexandre posted:
    movl -272(%ebp), %eax
    movl 8(%ebp), %edx
    subl -228(%ebp), %eax
    movl %eax, 60(%edx)
    movl -272(%ebp), %ecx

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: 1st note: is that code from the threaded version? Note that you need to modify the source to make it accept also ICC to try that. In case you already did that, I guess the patch is not useful at all with ICC since, as far as

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Paolo 'Blaisorblade' Giarrusso
Paolo 'Blaisorblade' Giarrusso p.giarru...@gmail.com added the comment: Daniel, I forgot to ask for the compilation command line you used, since they make a lot of difference. Can you post them? Thanks ___ Python tracker rep...@bugs.python.org

[issue4753] Faster opcode dispatch on gcc

2009-01-03 Thread Alexandre Vassalotti
Alexandre Vassalotti alexan...@peadrop.com added the comment: Paolo wrote:
> So, can you try dropping the switch altogether, using always computed goto and seeing how does the resulting code get compiled?
Removing the switch won't be possible unless we change the semantics of EXTENDED_ARG. In

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: On 2009-01-01 23:59, Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: I've updated the comments as per Alexandre's request, added support for SUN CC, and fixed the generation script to use the new filename. Since

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: This is safer than enabling the support unconditionally for GCC and the SUN Pro C compiler, since it is rather likely that some GCC versions have bugs which could render Python unusable if compiled with the dispatching support enabled. What

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment:
Antoine> I fear that with a configure option, disabled by default, the
Antoine> code will get very poor testing and perhaps get broken in some
Antoine> subtle way without anyone noticing.
That can be fixed by enabling that option on the

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment:
> If I dump the assembler code for ceval.c will that help others debug the problem?
Well, I'm no PPC expert but it can be useful. Can you dump it with -S -dA? (also, can you try the latest patch? I've made some tiny adjustment in the opcode

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: On 2009-01-02 17:10, Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: This is safer than enabling the support unconditionally for GCC and the SUN Pro C compiler, since it is rather likely that some GCC versions

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment: OK, I think I'm misreading the output of pybench. Let me reset. Ignore anything I've written previously on this topic. Instead, I will just post the output of my pybench comparison runs and let more expert people interpret as appropriate. The

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment: The next is the result of running on my MacBook Pro (Intel Core 2 Duo). Added file: http://bugs.python.org/file12546/pybench.sum.Intel ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: OK, I think I'm misreading the output of pybench. Let me reset. Ignore anything I've written previously on this topic. Instead, I will just post the output of my pybench comparison runs and let more expert people interpret as appropriate.

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment:
Antoine> Ok, so the threaded version is actually faster by 20% on your
Antoine> PPC, and slower by 5% on your Core 2 Duo. Thanks for doing the
Antoine> measurements!
Confirmed by pystone runs as well. Sorry for the earlier misdirection.

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Alexandre Vassalotti
Alexandre Vassalotti alexan...@peadrop.com added the comment: The patch makes a huge difference on 64-bit Linux. I get a 20% speed-up and the lowest run time so far. That is quite impressive! At first glance, it seems the extra registers of the x86-64 architecture permit GCC to avoid spilling

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Alexandre Vassalotti
Alexandre Vassalotti alexan...@peadrop.com added the comment: One more thing: the patch causes the following warnings to be emitted by GCC when USE_COMPUTED_GOTOS is undefined. Python/ceval.c: In function ‘PyEval_EvalFrameEx’: Python/ceval.c:2420: warning: label ‘_make_function’ defined but

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Skip Montanaro s...@pobox.com added the comment: Alexandre's last comment reminded me I forgot to post the PPC assembler code. Next two files are the output as requested by Antoine. Added file: http://bugs.python.org/file12553/ceval.i.unthreaded ___ Python

[issue4753] Faster opcode dispatch on gcc

2009-01-02 Thread Skip Montanaro
Changes by Skip Montanaro s...@pobox.com: Added file: http://bugs.python.org/file12555/ceval.i.threaded ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4753 ___
