Re: [Python-Dev] Unladen swallow status
On Wed, Jul 21, 2010 at 2:43 PM, Maciej Fijalkowski fij...@gmail.com wrote:
> On Wed, Jul 21, 2010 at 6:50 PM, Reid Kleckner reid.kleck...@gmail.com wrote:
>> On Wed, Jul 21, 2010 at 8:11 AM, Tim Golden m...@timgolden.me.uk wrote:
>>> Brett suggested that the Unladen Swallow merge to trunk was waiting for
>>> some work to complete on the JIT compiler, and Georg, as release manager
>>> for 3.2, confirmed that Unladen Swallow would not be merged before 3.3.
>> Yeah, this has slipped. I have patches that need review, and Jeff and
>> Collin have been distracted with other work. Hopefully when one of them
>> gets around to that, I can proceed with the merge without blocking on them.
>> Reid
> The merge of py3k-jit to trunk?

I believe he's talking about the merge of the Unladen tree into the
py3k-jit branch.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] commit privs
On Sun, Jul 11, 2010 at 9:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
> On Sun, 11 Jul 2010 13:23:13 + Reid Kleckner reid.kleck...@gmail.com wrote:
>> I'm also expecting to be doing more work merging unladen-swallow into
>> the py3k-jit branch, so I was wondering if I could get commit
>> privileges for that.
> It sounds good to me. Also, thanks for your threading patches!

+1 from me.
Re: [Python-Dev] New regex module for 3.2?
On Mon, Jul 12, 2010 at 8:18 AM, Michael Foord fuzzy...@voidspace.org.uk wrote:
> On 12/07/2010 15:07, Nick Coghlan wrote:
>> On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote:
>>> On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote:
>>>> The re2 comparison is interesting from the point of view of whether
>>>> it should be included in the stdlib. Is it re2 or regex? I don't see
>>>> having 2 regular expression engines in the stdlib.
>>> There's precedent though: the old regex engine and the new re engine
>>> were side-by-side for many years before regex was deprecated and
>>> finally removed in 2.5. Hypothetically, re2 could similarly be added
>>> to the standard library while re is deprecated.
>> re2 deliberately omits some features for efficiency reasons, hence is
>> not even on the table as a possible replacement for the standard
>> library version. If someone is in a position where re2 can solve their
>> problems with the re module, they should also be in a position where
>> they can track it down for themselves.
> If it has *partial* compatibility, and big enough performance
> improvements for common cases, it could perhaps be used where the regex
> doesn't use unsupported features. This would have some extra cost in
> the compile phase, but would mean Python could ship with two regex
> engines but only one interface exposed to the programmer...

FWIW, this has all been discussed before:
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/3829265. In
particular, I still believe that "it's not obvious that enough Python
regexes would benefit from re2's performance/restrictions tradeoff to
make such a hybrid system worthwhile in the long term. (There is no
representative corpus of real-world Python regexes weighted for dynamic
execution frequency to use in assessing such tradeoffs empirically like
there is for JavaScript.)"

Collin

> MRAB's module offers a superset of re's features rather than a subset
> though, so once it has had more of a chance to bake on PyPI it may be
> worth another look.
> Cheers,
> Nick.
Re: [Python-Dev] New regex module for 3.2?
On Fri, Jul 9, 2010 at 10:28 AM, MRAB pyt...@mrabarnett.plus.com wrote:
> anatoly techtonik wrote:
>> On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote:
>>> Hi all, I re-implemented the re module, adding new features and speed
>>> improvements. It's available at http://pypi.python.org/pypi/regex
>>> under the name "regex" so that it can be tried alongside re. I'd be
>>> interested in any comments or feedback.
>> How does it compare with re in terms of speed on real-world data?
> The benchmarks suggest it should be faster, or at worst comparable.
>> And where are the benchmarks? In particular it would be interesting to
>> see it compared both to re from the stdlib and re2 from
>> http://code.google.com/p/re2/
> The benchmarks bm_regex_effbot.py and bm_regex_v8.py both perform
> multiple runs of the tests, giving just the total times for each set.
> Here are the averages:
>
> Python26
> BENCHMARK         re          regex       ratio
> bm_regex_effbot   0.135secs   0.083secs   1.63
> bm_regex_v8       0.153secs   0.085secs   1.80
>
> Python31
> BENCHMARK         re          regex       ratio
> bm_regex_effbot   0.138secs   0.083secs   1.66
> bm_regex_v8       0.170secs   0.091secs   1.87

Out of curiosity, what are the results for the bm_regex_compile benchmark?

Collin
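[For readers without the full benchmark suite to hand, a rough local
comparison along these lines can give a first impression. This is only a
sketch: the pattern and input are illustrative, and timing the PyPI
`regex` module requires installing it and swapping the import as noted in
the comment.]

```python
import timeit

setup = (
    "import re\n"  # swap in "import regex as re" to time the PyPI module
    "pat = re.compile(r'[a-z0-9._]+@[a-z0-9.-]+\\.[a-z]+')\n"
    "text = 'contact some.name@example.com for details ' * 100\n"
)
# Time repeated scans of the same text, as the effbot/v8 benchmarks do
# at much larger scale.
elapsed = timeit.timeit("pat.findall(text)", setup=setup, number=1000)
print("1000 findall() calls: %.3f secs" % elapsed)
```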
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti alexan...@peadrop.com wrote:
> On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti alexan...@peadrop.com wrote:
>> Collin Winter wrote a simple optimization pass for cPickle in Unladen
>> Swallow [1]. The code reads through the stream and removes all the
>> unnecessary PUTs in-place.
> I just noticed the code removes *all* PUT opcodes, whether they are
> needed or not. So this code can only be used if there are no GETs in
> the stream (which is unlikely for a large stream). I believe Collin
> made this trade-off for performance reasons. However, it wouldn't be
> hard to make the current code work like pickletools.optimize().

The optimization pass is only run if you don't use any GETs. The
optimization is also disabled if you're writing to a file-like object.
These tradeoffs were appropriate for the workload I was optimizing
against.

Collin Winter
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 11:53 AM, Collin Winter collinwin...@google.com wrote:
>> On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti alexan...@peadrop.com wrote:
>>> On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti alexan...@peadrop.com wrote:
>>>> Collin Winter wrote a simple optimization pass for cPickle in Unladen
>>>> Swallow [1]. The code reads through the stream and removes all the
>>>> unnecessary PUTs in-place.
>>> I just noticed the code removes *all* PUT opcodes, whether they are
>>> needed or not. So this code can only be used if there are no GETs in
>>> the stream (which is unlikely for a large stream). I believe Collin
>>> made this trade-off for performance reasons. However, it wouldn't be
>>> hard to make the current code work like pickletools.optimize().
>> The optimization pass is only run if you don't use any GETs. The
>> optimization is also disabled if you're writing to a file-like object.
>> These tradeoffs were appropriate for the workload I was optimizing
>> against.
> I should add that adding the necessary bookkeeping to remove only
> unused PUTs (instead of the current all-or-nothing scheme) should not
> be hard.

I'd watch out for a further performance/memory hit; the pickling
benchmarks in the benchmark suite should help assess this. The current
optimization penalizes pickling to speed up unpickling, which made sense
when optimizing pickles that would go into memcache and be read out
13-15x more often than they were written.

Collin Winter
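[The pickletools.optimize() behavior referred to above — dropping only
those PUTs that no GET ever references — can be seen directly from the
stdlib. A minimal sketch; the exact size saving depends on the object
graph being pickled.]

```python
import pickle
import pickletools

data = {"key": list(range(10)), "other": "value"}
raw = pickle.dumps(data)

# optimize() walks the opcode stream and drops memo PUTs that are never
# referenced by a GET, shrinking the pickle without changing its meaning.
slim = pickletools.optimize(raw)

assert len(slim) <= len(raw)
assert pickle.loads(slim) == data
```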
Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution
On Fri, Apr 23, 2010 at 1:53 PM, Alexandre Vassalotti alexan...@peadrop.com wrote:
> On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin dgindi...@gmail.com wrote:
>> This wouldn't help our use case: your code needs the entire pickle
>> stream to be in memory, which in our case would be about 475mb, on top
>> of the 300mb+ data structures that generated the pickle stream.
> In that case, the best we could do is a two-pass algorithm to remove
> the unused PUTs. That won't be efficient, but it will satisfy the
> memory constraint. Another solution is to not generate the PUTs at all
> by setting the 'fast' attribute on Pickler. But that won't work if you
> have a recursive structure, or have code that requires the identity of
> objects to be preserved.

I don't think it's possible in general to remove any PUTs if the pickle
is being written to a file-like object. It is possible to reuse a single
Pickler to pickle multiple objects: this causes the Pickler's memo dict
to be shared between the objects being pickled. If you pickle foo, bar,
and baz, foo may not have any GETs, but bar and baz may have GETs that
reference data added to the memo by foo's PUT operations. Because you
can't know what will be written to the file-like object later, you can't
remove any of the PUT instructions in this scenario. This kind of thing
is done in real-world code like cvs2svn (which I broke when I was
optimizing cPickle; don't break cvs2svn, it's not fun to fix :). I added
some basic tests for this support in CPython's Lib/test/pickletester.py.

There might be room for app-specific optimizations that do this, but I'm
not sure it would work for a general-usage cPickle that needs to stay
compatible with the current system.

Collin Winter
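[The shared-memo scenario described above can be sketched with the
stdlib pickle module; the object names are illustrative. The first dump
PUTs the object into the Pickler's memo, and the second dump refers back
to it with a GET, which is exactly why PUTs cannot safely be stripped
from a stream that may grow later.]

```python
import io
import pickle

buf = io.BytesIO()
p = pickle.Pickler(buf)

foo = ["shared payload"]
p.dump(foo)             # foo's PUT adds it to the Pickler's memo
p.dump({"again": foo})  # this pickle emits a GET referencing foo's PUT

# A single Unpickler shares its memo across load() calls the same way,
# so the cross-pickle GET resolves and object identity is preserved.
buf.seek(0)
u = pickle.Unpickler(buf)
first = u.load()
second = u.load()
assert second["again"] is first
```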
Re: [Python-Dev] interesting article on regex performance
On Fri, Mar 12, 2010 at 8:12 AM, Nick Coghlan ncogh...@gmail.com wrote:
[snip]
> To bring this on-topic for python-dev by considering how it could apply
> to Python's default re engine, I think the key issue is that any
> updates to the default engine would need to remain backwards compatible
> with all of the tricks that re2 doesn't support. There are major
> practical problems associated with making such a leap directly
> (Google's re2 engine is in C++ rather than C, and we'd have to keep the
> existing implementation around regardless to handle the features that
> re2 doesn't support).

I don't see why C++ would be a deal-breaker in this case, since it would
be restricted to an extension module.

> I would say it is better to let re2 bake for a while and see if anyone
> is motivated to come up with Python bindings for it and release them on
> PyPI.

FWIW, re2 is heavily, heavily used in production at Google. Stabilizing
any proposed Python bindings would be a good idea, but I'm not sure how
much more baking re2's core functionality needs.

> Once that happens (and assuming the bindings earn a good reputation),
> the first step towards integration would be to include a "See Also" in
> the re module documentation to point people towards the more limited
> (but more robust) regex engine implementation. The next step would
> probably be a hybrid third party library that exploits the NFA approach
> when it can, but resorts to backtracking when it has to in order to
> handle full regex functionality. (Although developers would need to be
> able to disable the backtracking support in order to restore re2's
> guarantees of linear-time execution.)

We considered such a hybrid approach for Unladen Swallow, but rejected
it on the advice of the V8 team [1]: you end up maintaining two totally
separate, incompatible regex engines; the hybrid system comes with
interesting, possibly unintuitive performance/correctness issues when
bailing from one implementation to another; performance is unstable as
small changes are made to the regexes; and it's not obvious that enough
Python regexes would benefit from re2's performance/restrictions
tradeoff to make such a hybrid system worthwhile in the long term.
(There is no representative corpus of real-world Python regexes weighted
for dynamic execution frequency to use in assessing such tradeoffs
empirically like there is for JavaScript.)

re2 is very useful when you want to run user-provided regexes and want
to protect your backends against pathological/malicious regex input, but
I'm not sure how applicable it is to Python. I think there are more
promising strategies to improve regex performance, such as reusing the
new JIT infrastructure to JIT-compile regular expressions to machine
code (along the lines of V8's irregexp). Some work has already been done
in this direction, and I'd be thrilled to mentor any GSoC students
interested in working on such a project this summer.

Lastly, anyone interested in working on Python regex performance should
take a look at the regex benchmarks in the standard benchmark suite [2].

Thanks,
Collin Winter

[1] - http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html#c4843826268005492354
[2] - http://hg.python.org/benchmarks/file/5b8fe389710b/performance
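[The linear-time-vs-backtracking tradeoff under discussion is easy to
demonstrate with the stdlib engine. The pattern and input size below are
illustrative; an automaton-based engine such as re2 would reject the
backreference at compile time and run the nested-quantifier pattern in
linear time.]

```python
import re
import time

# Backreferences require a backtracking engine; re2 deliberately omits
# them because they cannot be matched in guaranteed linear time.
assert re.match(r"(abc)\1", "abcabc")

# Nested quantifiers force the stdlib engine to try exponentially many
# ways of splitting the input before failing on a near-miss string.
# Keep n small: each extra 'a' roughly doubles the work.
pat = re.compile(r"(a+)+$")
start = time.time()
assert pat.match("a" * 18 + "b") is None
print("failed match took %.4f secs" % (time.time() - start))
```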
Re: [Python-Dev] interesting article on regex performance
On Fri, Mar 12, 2010 at 11:29 AM, s...@pobox.com wrote:
>>> There are major practical problems associated with making such a leap
>>> directly (Google's re2 engine is in C++ rather than C and we'd have
>>> to keep the existing implementation around regardless to handle the
>>> features that re2 doesn't support).
>> I don't see why C++ would be a deal-breaker in this case, since it
>> would be restricted to an extension module.
> Traditionally Python has run on some (minority) platforms where C++
> was unavailable. While the re module is a dynamically linked extension
> module and thus could be considered optional, I doubt anybody thinks
> of it as optional nowadays. It's used in the regression test suite
> anyway; it would be tough to run unit tests on such minority platforms
> without it. You'd have to maintain both the current sre implementation
> and the new re2 implementation for a long while into the future.

re2 is not a full replacement for Python's current regex semantics: it
would only serve as an accelerator for a subset of the current regex
language. Given that, it makes perfect sense that it would be optional
on such minority platforms (much like the incoming JIT).

Collin
Re: [Python-Dev] Caching function pointers in type objects
Hey Daniel,

On Wed, Mar 3, 2010 at 1:24 PM, Daniel Stutzbach dan...@stutzbachenterprises.com wrote:
> On Tue, Mar 2, 2010 at 9:06 PM, Reid Kleckner r...@mit.edu wrote:
>> I don't think this will help you solve your problem, but one thing
>> we've done in Unladen Swallow is to hack PyType_Modified to invalidate
>> our own descriptor caches. We may eventually want to extend that into
>> a callback interface, but it probably will never be considered an API
>> that outside code should depend on.
> Thanks Reid and Benjamin for the information. I think I see a way to
> dramatically speed up PyObject_RichCompareBool when comparing
> immutable, built-in, non-container objects (int, float, str, etc.). It
> would speed up list.sort when the key is one of those types, as well
> as most operations on the ubiquitous dictionary with str keys.

That definitely sounds worth pursuing.

> Is that a worthwhile avenue to pursue, or is it likely to be redundant
> with Unladen Swallow's optimizations?

I don't believe it will be redundant with the optimizations in Unladen
Swallow.

> If I can find time to pursue it, would it be best for me to implement
> it as a patch to Unladen Swallow, CPython trunk, or CPython py3k?

I would recommend patching py3k, with a backport to trunk.

Thanks,
Collin Winter
Re: [Python-Dev] Caching function pointers in type objects
On Wed, Mar 3, 2010 at 2:41 PM, Daniel Stutzbach dan...@stutzbachenterprises.com wrote:
> On Wed, Mar 3, 2010 at 4:34 PM, Collin Winter collinwin...@google.com wrote:
>> I would recommend patching py3k, with a backport to trunk.
> After thinking it over, I'm inclined to patch trunk, so I can run the
> Unladen Swallow macro-benchmarks, then forward-port to py3k. Am I
> correct in understanding that it will be a while before the Unladen
> Swallow benchmarks can support Python 3?

That's correct; porting the full benchmark suite to Python 3 will
require projects like Django to support Python 3.

Collin Winter
Re: [Python-Dev] Packaging JIT-less versions of Python
Hey David,

On Mon, Mar 1, 2010 at 7:29 PM, David Malcolm dmalc...@redhat.com wrote:
> On Mon, 2010-03-01 at 15:35 -0800, Collin Winter wrote:
> [snip]
>> - How would you prefer to build the JIT-less package (current options:
>>   via a ./configure flag; or by deleting _llvmjit.so from the
>>   JIT-enabled package)?
>> - Would the two packages be able to exist side-by-side, or would they
>>   be mutually exclusive?
> I have a particular interest in ABI compatibility: if turning the JIT
> on and off is going to change the ABI of extension modules, that would
> be a major pain, as I hope that we will have dozens of C extension
> modules available via RPM for our Python 3 stack by the time of the
> great unladen merger.

Do you have a good way of testing ABI compatibility, or is it just
"build modules, see if they work"? Some general way of testing ABI
compatibility would be really useful for PEP 384, too.

> So I'm keen for the ability to toggle the JIT code in the face of bugs
> and have it not affect the ABI. -Xjit will do this at runtime (once
> that's renamed), but I think it would be useful to be able to toggle
> the JIT on/off default during the build, so that I can fix a broken
> architecture for non-technical users, but have individual testers opt
> back in with -Xjit whilst tracking down a major bug.

That's something we can definitely do: you'd just change the default
value for the -Xjit flag from "whenhot" to "never". Those individual
testers would pass -Xjit=whenhot to opt back in. We could make that a
./configure flag if it would be useful to you and the other distros.

> In either case, I don't want to have to recompile 30 extension modules
> to try with/without JIT; that would introduce too much change during
> bug-hunts, and be no fun at all.

That would suck indeed; I want to avoid that. I think that kind of
thing falls under PEP 384, which we will have to obey once it is
accepted.

> (In the blue-sky nirvana future, I'd love to be able to ship
> ahead-of-time compiled versions of the stdlib, pre-optimized based on
> real-world workloads. Back in my reality, though, I have bugs to fix
> before I can work on _that_ patch :( )

Reid Kleckner may be looking at that for his Master's project. It's
definitely doable.

>> My strong preference would be to have the JIT included by default so
>> that it receives as much testing as possible.
> Sounds reasonable.
> Hope the above made sense and is useful.

Thanks for your perspective,
Collin Winter
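[The build-time/runtime toggles discussed above might look roughly like
this. The ./configure flag spelling is hypothetical — only the idea of a
configure-time default was agreed on — while -Xjit and its "whenhot" and
"never" values are the ones named in the thread.]

```shell
# Distro flips the build-time JIT default off for a broken architecture
# (flag name is illustrative, not an actual Unladen Swallow option):
./configure --with-jit-default=never && make

# Individual testers opt back in at runtime while tracking down a bug:
./python -Xjit=whenhot myapp.py

# On a default build, the JIT can still be disabled per-run:
./python -Xjit=never myapp.py
```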
[Python-Dev] Packaging JIT-less versions of Python
Hey packaging guys,

We recently committed a change to Unladen Swallow [1] that moves all the
JIT infrastructure into a Python extension module. The theory [2] behind
this patch was that this would make it easier for downstream packagers
to ship a JIT-less Python package, with the JIT compiler available via
an optional add-on package.

Some questions for you, so we're not shooting blind here:
- Have you guys thought about how a JIT-enabled Python 3 installation
  would be packaged by your respective distros?
- Would you prefer the default python3.x package to have a JIT, or would
  you omit the JIT by default?
- How would you prefer to build the JIT-less package (current options:
  via a ./configure flag; or by deleting _llvmjit.so from the
  JIT-enabled package)?
- Would the two packages be able to exist side-by-side, or would they be
  mutually exclusive?

My strong preference would be to have the JIT included by default so
that it receives as much testing as possible.

Thanks,
Collin Winter

[1] - http://code.google.com/p/unladen-swallow/source/detail?r=1110
[2] - http://code.google.com/p/unladen-swallow/issues/detail?id=136
Re: [Python-Dev] Mercurial repository for Python benchmarks
On Sun, Feb 21, 2010 at 9:43 PM, Collin Winter collinwin...@google.com wrote:
> Hey Daniel,
> On Sun, Feb 21, 2010 at 4:51 PM, Daniel Stutzbach dan...@stutzbachenterprises.com wrote:
>> On Sun, Feb 21, 2010 at 2:28 PM, Collin Winter collinwin...@google.com wrote:
>>> Would it be possible for us to get a Mercurial repository on
>>> python.org for the Unladen Swallow benchmarks? Maciej and I would
>>> like to move the benchmark suite out of Unladen Swallow and into
>>> python.org, where all implementations can share it and contribute to
>>> it. PyPy has been adding some benchmarks to their copy of the Unladen
>>> benchmarks, and we'd like to have them as well, and Mercurial seems
>>> to be an ideal solution to this.
>> If and when you have a benchmark repository set up, could you announce
>> it via a reply to this thread? I'd like to check it out.
> Will do.

The benchmarks repository is now available at
http://hg.python.org/benchmarks/. It contains all the benchmarks that
the Unladen Swallow svn repository contains, including the beginnings of
a README.txt that describes the available benchmarks and a quick-start
guide for running perf.py (the main interface to the benchmarks). This
will eventually contain all the information from
http://code.google.com/p/unladen-swallow/wiki/Benchmarks, as well as
guidelines on how to write good benchmarks.

If you have svn commit access, you should be able to run
`hg clone ssh://h...@hg.python.org/repos/benchmarks`. I'm not sure how
to get read-only access; Dirkjan can comment on that.

Still todo:
- Replace the static snapshots of 2to3, Mercurial and other hg-based
  projects with clones of the respective repositories.
- Fix the 2to3 and nbody benchmarks to work with Python 2.5 for Jython
  and PyPy.
- Import some of the benchmarks PyPy has been using.

Any access problems with the hg repo should be directed to Dirkjan.
Thanks so much for getting the repo set up so fast!

Thanks,
Collin Winter
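[A typical first session with the suite, per the README and perf.py
interface described above, might look like the following. The benchmark
selection and interpreter paths are illustrative.]

```shell
# Read-only users can clone over HTTP; committers use the ssh URL given
# in the message above.
hg clone http://hg.python.org/benchmarks/
cd benchmarks

# perf.py compares a baseline interpreter against a modified one on the
# selected benchmarks (-b takes a comma-separated list):
python perf.py -b regex_v8,django /usr/bin/python2.6 /opt/unladen/bin/python
```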
Re: [Python-Dev] PEP 385 progress report
On Sat, Feb 13, 2010 at 2:23 PM, Benjamin Peterson benja...@python.org wrote:
> 2010/2/13 Martin v. Löwis mar...@v.loewis.de:
>> I still think that the best approach for projects to use 2to3 is to
>> run 2to3 at install time from a single-source release. For that,
>> projects will have to adjust to whatever bugs certain 2to3 releases
>> have, rather than requiring users to download a newer version of 2to3
>> that fixes them. For this use case, a tightly-integrated lib2to3 (with
>> that name and sole purpose) is the best thing.
> Alright. That is reasonable. The other thing is that we will lose some
> vcs history and some history granularity by switching development to
> the trunk version, since just the svnmerged revisions will be
> converted.

So the consensus is that 2to3 should be pulled out of the main Python
tree? Should the 2to3 hg repository be deleted, then?

Thanks,
Collin
Re: [Python-Dev] Mercurial repository for Python benchmarks
On Mon, Feb 22, 2010 at 3:17 PM, Collin Winter collinwin...@google.com wrote:
> The benchmarks repository is now available at
> http://hg.python.org/benchmarks/. It contains all the benchmarks that
> the Unladen Swallow svn repository contains, including the beginnings
> of a README.txt that describes the available benchmarks and a
> quick-start guide for running perf.py (the main interface to the
> benchmarks). This will eventually contain all the information from
> http://code.google.com/p/unladen-swallow/wiki/Benchmarks, as well as
> guidelines on how to write good benchmarks.

We now have a "Benchmarks" component in the bug tracker. Suggestions for
new benchmarks, feature requests for perf.py, and bugs in existing
benchmarks should be reported under that component.

Thanks,
Collin Winter
Re: [Python-Dev] PEP 385 progress report
On Mon, Feb 22, 2010 at 4:27 PM, Martin v. Löwis mar...@v.loewis.de wrote:
>>> The other thing is that we will lose some vcs history and some
>>> history granularity by switching development to the trunk version,
>>> since just the svnmerged revisions will be converted.
>> So the consensus is that 2to3 should be pulled out of the main Python
>> tree?
> Not sure what you mean by "pull out"; I had expected that the right
> verb should be "pull into": 2to3 should be pulled into the main Python
> tree.

Sorry, I meant "pulled out" as in: I want an updated version for the
benchmark suite; where should I get that?

>> Should the 2to3 hg repository be deleted, then?
> Which one? To my knowledge, there is no official 2to3 repository yet.
> When the switchover happens, 2to3 should not be converted to its own
> hg repository, yes.

This one: http://hg.python.org/2to3

Collin
Re: [Python-Dev] PEP 385 progress report
On Mon, Feb 22, 2010 at 5:03 PM, Nick Coghlan ncogh...@gmail.com wrote:
> Dirkjan Ochtman wrote:
>> On Mon, Feb 22, 2010 at 16:09, Collin Winter coll...@gmail.com wrote:
>>> So the consensus is that 2to3 should be pulled out of the main Python
>>> tree? Should the 2to3 hg repository be deleted, then?
>> Wouldn't the former be reason to officialize the hg repository,
>> instead of deleting it?
> I think the difference between "pull out" and "pull from" is causing
> confusion here (and no, I'm not sure which of those Collin actually
> meant either).

Sorry, I meant "pull from". I want an updated snapshot of 2to3 for the
benchmark suite, and I'm looking for the best place to grab it from.

Collin
[Python-Dev] Mercurial repository for Python benchmarks
Hey Dirkjan,

Would it be possible for us to get a Mercurial repository on python.org
for the Unladen Swallow benchmarks? Maciej and I would like to move the
benchmark suite out of Unladen Swallow and into python.org, where all
implementations can share it and contribute to it. PyPy has been adding
some benchmarks to their copy of the Unladen benchmarks, and we'd like
to have them as well, and Mercurial seems to be an ideal solution to
this.

Thanks,
Collin Winter
Re: [Python-Dev] Mercurial repository for Python benchmarks
On Sun, Feb 21, 2010 at 3:31 PM, Dirkjan Ochtman djc.ocht...@gmail.com wrote:
> Hi Collin (and others),
> On Sun, Feb 21, 2010 at 15:28, Collin Winter collinwin...@google.com wrote:
>> Would it be possible for us to get a Mercurial repository on
>> python.org for the Unladen Swallow benchmarks? Maciej and I would like
>> to move the benchmark suite out of Unladen Swallow and into
>> python.org, where all implementations can share it and contribute to
>> it. PyPy has been adding some benchmarks to their copy of the Unladen
>> benchmarks, and we'd like to have them as well, and Mercurial seems to
>> be an ideal solution to this.
> Just a repository on hg.python.org? Sounds good to me. Are you staying
> for the sprints? We'll just do it. (Might need to figure out some
> hooks we want to put up with it.)

Yep, that's all we want. I'll be around for the sprints through Tuesday,
sitting at the Unladen Swallow sprint.

Collin Winter
Re: [Python-Dev] Mercurial repository for Python benchmarks
Hey Daniel,

On Sun, Feb 21, 2010 at 4:51 PM, Daniel Stutzbach dan...@stutzbachenterprises.com wrote:
> On Sun, Feb 21, 2010 at 2:28 PM, Collin Winter collinwin...@google.com wrote:
>> Would it be possible for us to get a Mercurial repository on
>> python.org for the Unladen Swallow benchmarks? Maciej and I would like
>> to move the benchmark suite out of Unladen Swallow and into
>> python.org, where all implementations can share it and contribute to
>> it. PyPy has been adding some benchmarks to their copy of the Unladen
>> benchmarks, and we'd like to have them as well, and Mercurial seems to
>> be an ideal solution to this.
> If and when you have a benchmark repository set up, could you announce
> it via a reply to this thread? I'd like to check it out.

Will do. In the meantime, you can read
http://code.google.com/p/unladen-swallow/wiki/Benchmarks to find out how
to check out the current draft of the benchmarks, as well as which
benchmarks are currently included.

Thanks,
Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Sat, Feb 13, 2010 at 12:12 AM, Maciej Fijalkowski fij...@gmail.com wrote:
> I like this wording far more. It's at the very least far more precise.
> Those examples are fair enough (except the fact that PyPy is not
> 32-bit x86 only; the JIT is). [snip] "slower than US on some
> workloads" is true, while not really telling much to a potential
> reader. For any X and Y implementing the same language, "X is faster
> than Y on some workloads" is usually true. To be precise you would
> need to include the above table in the PEP, which is probably a bit
> too much, given that the PEP is not about PyPy at all. I'm fine with
> any wording that is at least correct.

I've updated the language:
http://codereview.appspot.com/186247/diff2/9005:11001/11002. Thanks for
the clarifications.

Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Maciej, On Thu, Feb 11, 2010 at 6:39 AM, Maciej Fijalkowski fij...@gmail.com wrote: Snippet from: http://codereview.appspot.com/186247/diff2/5014:8003/7002 *PyPy*: PyPy [#pypy]_ has good performance on numerical code, but is slower than Unladen Swallow on non-numerical workloads. PyPy only supports 32-bit x86 code generation. It has poor support for CPython extension modules, making migration for large applications prohibitively expensive. That part at the very least has some sort of personal opinion prohibitively, Of course; difficulty is always in the eye of the person doing the work. Simply put, PyPy is not a drop-in replacement for CPython: there is no embedding API, much less the same one exported by CPython; important libraries, such as MySQLdb and pycrypto, do not build against PyPy; PyPy is 32-bit x86 only. All of these problems can be overcome with enough time/effort/money, but I think you'd agree that, if all I'm trying to do is speed up my application, adding a new x86-64 backend or implementing support for CPython extension modules is certainly north of prohibitively expensive. I stand by that wording. I'm willing to enumerate all of PyPy's deficiencies in this regard in the PEP, rather than the current vaguer wording, if you'd prefer. while the other part is not completely true slower than US on non-numerical workloads. Fancy providing a proof for that? I'm well aware that there are benchmarks on which PyPy is slower than CPython or US, however, I would like a bit more weighted opinion in the PEP. Based on the benchmarks you're running at http://codespeak.net:8099/plotsummary.html, PyPy is slower than CPython on many non-numerical workloads, which Unladen Swallow is faster than CPython at. Looking at the benchmarks there at which PyPy is faster than CPython, they are primarily numerical; this was the basis for the wording in the PEP. 
My own recent benchmarking of PyPy and Unladen Swallow (both trunk; PyPy wouldn't run some benchmarks):

| Benchmark    | PyPy  | Unladen | Change          |
+==============+=======+=========+=================+
| ai           | 0.61  | 0.51    | 1.1921x faster  |
| django       | 0.68  | 0.8     | 1.1898x slower  |
| float        | 0.03  | 0.07    | 2.7108x slower  |
| html5lib     | 20.04 | 16.42   | 1.2201x faster  |
| pickle       | 17.7  | 1.09    | 16.2465x faster |
| rietveld     | 1.09  | 0.59    | 1.8597x faster  |
| slowpickle   | 0.43  | 0.56    | 1.2956x slower  |
| slowspitfire | 2.5   | 0.63    | 3.9853x faster  |
| slowunpickle | 0.26  | 0.27    | 1.0585x slower  |
| unpickle     | 28.45 | 0.78    | 36.6427x faster |

I'm happy to change the wording to "slower than US on some workloads". Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
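For readers reproducing numbers like these: the "Nx faster/slower" column is simply the ratio of the two wall-clock timings, taken from Unladen Swallow's point of view. A minimal sketch of that ratio formatting follows; the function name and rounding are illustrative assumptions, not the benchmark suite's actual perf.py code (which averages multiple runs, so the ratios of the single numbers shown in the table will differ slightly):

```python
def describe_change(pypy_secs, unladen_secs):
    """Render a timing comparison as 'Nx faster/slower' from Unladen
    Swallow's point of view: lower wall-clock time is better."""
    if unladen_secs < pypy_secs:
        return "%.4fx faster" % (pypy_secs / unladen_secs)
    return "%.4fx slower" % (unladen_secs / pypy_secs)

# e.g. the pickle row: PyPy 17.7s vs. Unladen 1.09s
print(describe_change(17.7, 1.09))
```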
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
To follow up on some of the open issues: On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote: [snip]

Open Issues
===========

- *Code review policy for the ``py3k-jit`` branch.* How does the CPython community want us to proceed with respect to checkins on the ``py3k-jit`` branch? Pre-commit reviews? Post-commit reviews? Unladen Swallow has enforced pre-commit reviews in our trunk, but we realize this may lead to long review/checkin cycles in a purely-volunteer organization. We would like a non-Google-affiliated member of the CPython development team to review our work for correctness and compatibility, but we realize this may not be possible for every commit.

The feedback we've gotten so far is that at most, only larger, more critical commits should be sent for review, while most commits can just go into the branch. Is that broadly agreeable to python-dev?

- *How to link LLVM.* Should we change LLVM to better support shared linking, and then use shared linking to link the parts of it we need into CPython?

The consensus has been that we should link shared against LLVM. Jeffrey Yasskin is now working on this in upstream LLVM. We are tracking this at http://code.google.com/p/unladen-swallow/issues/detail?id=130 and http://llvm.org/PR3201.

- *Prioritization of remaining issues.* We would like input from the CPython development team on how to prioritize the remaining issues in the Unladen Swallow codebase. Some issues like memory usage are obviously critical before merger with ``py3k``, but others may fall into a "nice to have" category that could be kept for resolution into a future CPython 3.x release.

The big-ticket items here are what we expected: reducing memory usage and startup time. We also need to improve profiling options, both for oProfile and cProfile.

- *Create a C++ style guide.* Should PEP 7 be extended to include C++, or should a separate C++ style PEP be created?
Unladen Swallow maintains its own style guide [#us-styleguide]_, which may serve as a starting point; the Unladen Swallow style guide is based on both LLVM's [#llvm-styleguide]_ and Google's [#google-styleguide]_ C++ style guides. Any thoughts on a CPython C++ style guide? My personal preference would be to extend PEP 7 to cover C++ by taking elements from http://code.google.com/p/unladen-swallow/wiki/StyleGuide and the LLVM and Google style guides (which is how we've been developing Unladen Swallow). If that's broadly agreeable, Jeffrey and I will work on a patch to PEP 7. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Craig, On Tue, Feb 2, 2010 at 4:42 PM, Craig Citro craigci...@gmail.com wrote: Done. The diff is at http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed Cython, Shedskin and a bunch of other alternatives to pure CPython. Some of that information is based on conversations I've had with the respective developers, and I'd appreciate corrections if I'm out of date. Well, it's a minor nit, but it might be more fair to say something like Cython provides the biggest improvements once type annotations are added to the code. After all, Cython is more than happy to take arbitrary Python code as input -- it's just much more effective when it knows something about types. The code to make Cython handle closures has just been merged ... hopefully support for the full Python language isn't so far off. (Let me know if you want me to actually make a comment on Rietveld ...) Indeed, you're quite right. I've corrected the description here: http://codereview.appspot.com/186247/diff2/7005:9001/10001 Now what's more interesting is whether or not U-S and Cython could play off one another -- take a Python program, run it with some generic input data under Unladen and record info about which functions are hot, and what types they tend to take, then let Cython/gcc -O3 have a go at these, and lather, rinse, repeat ... JIT compilation and static compilation obviously serve different purposes, but I'm curious if there aren't other interesting ways to take advantage of both. Definitely! Someone approached me about possibly reusing the profile data for a feedback-enhanced code coverage tool, which has interesting potential, too. I've added a note about this under the Future Work section: http://codereview.appspot.com/186247/diff2/9001:10002/9003 Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] API for VM branches (was: PEP 3146)
[Moving to python-ideas; python-dev to bcc] On Tue, Feb 2, 2010 at 2:02 AM, M.-A. Lemburg m...@egenix.com wrote: Collin Winter wrote: [snip] If such a restrictive plugin-based scheme had been available when we began Unladen Swallow, I do not doubt that we would have ignored it entirely. I do not like the idea of artificially tying the hands of people trying to make CPython faster. I do not see any part of Unladen Swallow that would have been made easier by such a scheme. If anything, it would have made our project more difficult. I don't think that it has to be restrictive - much to the contrary, it would provide a consistent API to those CPython internals and also clarify the separation between the various parts. Something which currently does not exist in CPython. We do not need an API to CPython's internals: we are not interfacing with them, we are replacing and augmenting them. Note that it may be easier for you (and others) to just take CPython and patch it as necessary. However, this doesn't relieve you from the needed maintenance - which, I presume, is one of the reasons why you are suggesting to merge U-S back into CPython ;-) That is incorrect. In the year we have been working on Unladen Swallow, we have only updated our vendor branch of CPython 2.6 once, going from 2.6.1 to 2.6.4. We have occasionally cherrypicked patches from the 2.6 maintenance branch to fix specific problems. The maintenance required by upstream CPython changes has been effectively zero. We are seeking to merge with CPython for three reasons: 1) verify that python-dev is interested in this project, and that we are not wasting our time; 2) expose the codebase to a wider, more heterogeneous testing environment; 3) accelerate development by having more hands on the code. Upstream maintenance requirements have had zero impact on our planning.
In any case, I'll be interested in reading your PEP that outlines how the plugin interface should work, which systems will be pluggable, and exactly how Unladen Swallow, WPython and Stackless would benefit. Let's move further discussion of this to python-ideas until there's something more concrete here. The py3k-jit branch will live long enough that we could update it to work with a plugin system, assuming it is demonstrated to be beneficial. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Dirkjan, [Circling back to this part of the thread] On Thu, Jan 21, 2010 at 1:37 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote: On Thu, Jan 21, 2010 at 21:14, Collin Winter collinwin...@google.com wrote: [snip] My quick take on Cython and Shedskin is that they are useful-but-limited workarounds for CPython's historically-poor performance. Shedskin, for example, does not support the entire Python language or standard library (http://shedskin.googlecode.com/files/shedskin-tutorial-0.3.html). Perfect, now put something like this in the PEP, please. ;) Done. The diff is at http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed Cython, Shedskin and a bunch of other alternatives to pure CPython. Some of that information is based on conversations I've had with the respective developers, and I'd appreciate corrections if I'm out of date. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey MA, On Fri, Jan 29, 2010 at 11:14 AM, M.-A. Lemburg m...@egenix.com wrote: Collin Winter wrote: I added startup benchmarks for Mercurial and Bazaar yesterday (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we can use them as more macro-ish benchmarks, rather than merely starting the CPython binary over and over again. If you have ideas for better Mercurial/Bazaar startup scenarios, I'd love to hear them. The new hg_startup and bzr_startup benchmarks should give us some more data points for measuring improvements in startup time. One idea we had for improving startup time for apps like Mercurial was to allow the creation of hermetic Python binaries, with all necessary modules preloaded. This would be something like Smalltalk images. We haven't yet really fleshed out this idea, though. In Python you can do the same with the freeze.py utility. See http://www.egenix.com/www2002/python/mxCGIPython.html for an old project where we basically put the Python interpreter and stdlib into a single executable. We've recently revisited that project and created something we call pyrun. It fits Python 2.5 into a single executable and a set of shared modules (which for various reasons cannot be linked statically)... 12MB in total. If you load lots of modules from the stdlib this does provide a significant improvement over standard Python. Good to know there are options. One feature we had in mind for a system of this sort would be the ability to take advantage of the limited/known set of modules in the image to optimize the application further, similar to link-time optimizations in gcc/LLVM (http://www.airs.com/blog/archives/100). Back to the PEP's proposal: Looking at the data you currently have, the negative results currently don't really look good in the light of the small performance improvements. The JIT compiler we are offering is more than just its current performance benefit. An interpreter loop will simply never be as fast as machine code. 
An interpreter loop, no matter how well-optimized, will hit a performance ceiling, and before that ceiling it will run into diminishing returns. Machine code is a more versatile optimization target, and as such allows many optimizations that would be impossible or prohibitively difficult in an interpreter. Unladen Swallow offers a platform to extract increasing performance for years to come. The current generation of modern, JIT-based JavaScript engines is instructive in this regard: V8 (which I'm most familiar with) delivers consistently improving performance release-over-release (see the graphs at the top of http://googleblog.blogspot.com/2009/09/google-chrome-after-year-sporting-new.html). I'd like to see CPython be able to achieve the same thing, like the new implementations of JavaScript and Ruby are able to do. We are aware that Unladen Swallow is not finished; that's why we're not asking to go into py3k directly. Unladen Swallow's memory usage will continue to decrease, and its performance will only go up. The current state is not its permanent state; I'd hate to see the perfect become the enemy of the good.

Wouldn't it be possible to have the compiler approach work in three phases in order to reduce the memory footprint and startup time hit, i.e.:

1. run an instrumented Python interpreter to collect all the needed compiler information; write this information into a .pys file (Python stats)
2. create compiled versions of the code for various often-used code paths and type combinations by reading the .pys file and generating an .so file as a regular Python extension module
3. run an uninstrumented Python interpreter and let it use the .so files instead of the .py ones

In production, you'd then only use step 3 and avoid the overhead of steps 1 and 2.

That is certainly a possibility if we are unable to reduce memory usage to a satisfactory level.
I've added a Contingency Plans section to the PEP, including this option: http://codereview.appspot.com/186247/diff2/8004:7005/8006. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
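The three-phase scheme proposed in the message above (instrument, compile offline, run with compiled code) is essentially offline feedback-directed compilation. As a toy illustration of phase 1 only — every name here, including the decorator and the JSON-based .pys layout, is a hypothetical sketch, not Unladen Swallow or pyrun code — one could record call counts and argument types per function and persist them for a later compile step:

```python
import json
from collections import Counter

# Phase 1 (hypothetical): an instrumented run records call counts and
# argument-type combinations per function -- the kind of data a
# phase-2 compiler could use to pick hot paths and specializations.
_stats = Counter()

def instrumented(func):
    def wrapper(*args):
        _stats[(func.__name__, tuple(type(a).__name__ for a in args))] += 1
        return func(*args)
    return wrapper

def write_pys(path):
    # Write the ".pys" (Python stats) file a phase-2 compiler would read.
    data = [{"func": f, "arg_types": list(t), "calls": n}
            for (f, t), n in _stats.items()]
    with open(path, "w") as fp:
        json.dump(data, fp)

@instrumented
def add(a, b):
    return a + b

for i in range(1000):
    add(i, i)          # hot path, always (int, int)
add("x", "y")          # cold path, (str, str)
write_pys("add.pys")
```

A phase-2 compiler could then read add.pys and emit a compiled specialization only for the hot (int, int) case, leaving the cold string case to the interpreter.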
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey MA, On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg m...@egenix.com wrote: BTW: Some years ago we discussed the idea of pluggable VMs for Python. Wouldn't U-S be a good motivation to revisit this idea ? We could then have a VM based on byte code using a stack machine, one based on word code using a register machine and perhaps one that uses the Stackless approach. What is the use case for having pluggable VMs? Is the idea that, at runtime, the user would select which virtual machine they want to run their code under? How would the user make that determination intelligently? I think this idea underestimates a) how deeply the current CPython VM is intertwined with the rest of the implementation, and b) the nature of the changes required by these separate VMs. For example, Unladen Swallow adds fields to the C-level structs for dicts, code objects and frame objects; how would those changes be pluggable? Stackless requires so many modifications that it is effectively a fork; how would those changes be pluggable? Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Mon, Feb 1, 2010 at 11:17 AM, M.-A. Lemburg m...@egenix.com wrote: Collin Winter wrote: I think this idea underestimates a) how deeply the current CPython VM is intertwined with the rest of the implementation, and b) the nature of the changes required by these separate VMs. For example, Unladen Swallow adds fields to the C-level structs for dicts, code objects and frame objects; how would those changes be pluggable? Stackless requires so many modifications that it is effectively a fork; how would those changes be pluggable? They wouldn't be pluggable. Such changes would have to be made in a more general way in order to serve more than just one VM. I believe these VMs would have little overlap. I cannot imagine that Unladen Swallow's needs have much in common with Stackless's, or with those of a hypothetical register machine to replace the current stack machine. Let's consider that last example in more detail: a register machine would require completely different bytecode. This would require replacing the bytecode compiler, the peephole optimizer, and the bytecode eval loop. The frame object would need to be changed to hold the registers and a new blockstack design; the code object would have to potentially hold a new bytecode layout. I suppose making all this pluggable would be possible, but I don't see the point. This kind of experimentation is ideal for a branch: go off, test your idea, report your findings, merge back. Let the branch be long-lived, if need be. The Mercurial migration will make all this easier. Getting this right would certainly require a major effort, but it would also reduce the need to have several branches of C-based Python implementations. If such a restrictive plugin-based scheme had been available when we began Unladen Swallow, I do not doubt that we would have ignored it entirely. I do not like the idea of artificially tying the hands of people trying to make CPython faster.
I do not see any part of Unladen Swallow that would have been made easier by such a scheme. If anything, it would have made our project more difficult. Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Fri, Jan 29, 2010 at 7:22 AM, Nick Coghlan ncogh...@gmail.com wrote: Antoine Pitrou wrote: Or you could submit patches piecewise on http://bugs.python.org I think the first step would be to switch to 16-bit bytecodes. It would be uncontroversial (the increase in code size probably has no negative effect) and would provide the foundation for all of your optimizations. I wouldn't consider changing from bytecode to wordcode uncontroversial - the potential to have an effect on cache hit ratios means it needs to be benchmarked (the U-S performance tests should be helpful there). It's the same basic problem where any changes to the ceval loop can have surprising performance effects due to the way they affect the compiled switch statement's ability to fit into the cache and other low level processor weirdness. Agreed. We originally switched Unladen Swallow to wordcode in our 2009Q1 release, and saw a performance improvement from this across the board. We switched back to bytecode for the JIT compiler to make upstream merger easier. The Unladen Swallow benchmark suite should provide a thorough assessment of the impact of a bytecode-to-wordcode switch. This would be complementary to a JIT compiler, rather than a replacement for it. I would note that the switch will introduce incompatibilities with libraries like Twisted. IIRC, Twisted has a traceback prettifier that removes its trampoline functions from the traceback, parsing CPython's bytecode in the process. If running under CPython, it assumes that the bytecode is as it expects. We broke this in Unladen's wordcode switch. I think parsing bytecode is a bad idea, but any switch to wordcode should be advertised widely. Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Terry, On Fri, Jan 29, 2010 at 2:47 PM, Terry Reedy tjre...@udel.edu wrote: On 1/29/2010 4:19 PM, Collin Winter wrote: On Fri, Jan 29, 2010 at 7:22 AM, Nick Coghlan ncogh...@gmail.com wrote: Agreed. We originally switched Unladen Swallow to wordcode in our 2009Q1 release, and saw a performance improvement from this across the board. We switched back to bytecode for the JIT compiler to make upstream merger easier. The Unladen Swallow benchmark suite should provide a thorough assessment of the impact of a bytecode-to-wordcode switch. This would be complementary to a JIT compiler, rather than a replacement for it. I would note that the switch will introduce incompatibilities with libraries like Twisted. IIRC, Twisted has a traceback prettifier that removes its trampoline functions from the traceback, parsing CPython's bytecode in the process. If running under CPython, it assumes that the bytecode is as it expects. We broke this in Unladen's wordcode switch. I think parsing bytecode is a bad idea, but any switch to wordcode should be advertised widely. Several years ago, there was serious consideration of switching to a register-based vm, which would have been even more of a change. Since I learned 1.4, Guido has consistently insisted that the CPython vm is not part of the language definition and, as far as I know, he has rejected any byte-code hackery in the stdlib. While he is not one to, say, randomly permute the codes just to frustrate such hacks, I believe he has always considered vm details private and subject to change and any usage thereof 'at one's own risk'. No, I agree entirely: bytecode is an implementation detail that could be changed at any time. But like reference counting, it's an implementation detail that people have -- for better or worse -- come to rely on. My only point was that a switch to wordcode should be announced prominently in the release notes and not assumed to be without impact on user code.
That people are directly munging CPython bytecode means that CPython should provide a better, more abstract way to do the same thing that's more resistant to these kinds of changes. Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
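A more abstract interface of exactly this kind now exists in the standard library: the dis module exposes bytecode as structured instruction objects, so a tool like Twisted's traceback filter can look for opcodes by name instead of hand-decoding co_code bytes. A small sketch (dis.get_instructions has been available since Python 3.4; the trampoline function here is a hypothetical stand-in, not Twisted code):

```python
import dis

def trampoline(func):
    # Stand-in for the kind of wrapper a framework might want to
    # recognize and hide from user-visible tracebacks.
    return func()

# Structured access to bytecode: no hand-parsing of raw co_code bytes,
# so this survives layout changes such as a bytecode/wordcode switch.
ops = [ins.opname for ins in dis.get_instructions(trampoline)]
print(ops)
```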
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi William, On Wed, Jan 27, 2010 at 7:26 AM, William Dode w...@flibuste.net wrote: Hi (as a simple user), I'd like to know why you didn't followed the same way as V8 Javascript, or the opposite, why for V8 they didn't choose llvm ? I imagine that startup time and memory was also critical for V8. Startup time and memory usage are arguably *more* critical for a Javascript implementation, since if you only spend a few milliseconds executing Javascript code, but your engine takes 10-20ms to startup, then you've lost. Also, a minimized memory profile is important if you plan to embed your JS engine on a mobile platform, for example, or you need to run in a heavily-multiprocessed browser on low-memory consumer desktops and netbooks. Among other reasons we chose LLVM, we didn't want to write code generators for each platform we were targeting. LLVM has done this for us. V8, on the other hand, has to implement a new code generator for each new platform they want to target. This is non-trivial work: it takes a long time, has a lot of finicky details, and it greatly increases the maintenance burden on the team. We felt that requiring python-dev to understand code generation on multiple platforms was a distraction from what python-dev is trying to do -- develop Python. V8 still doesn't have x86-64 code generation working on Windows (http://code.google.com/p/v8/issues/detail?id=330), so I wouldn't underestimate the time required for that kind of project. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
, but will sometimes cherrypick upstream patches that we want (since doing the full vendor merge can take a while). Right now, we're using an unmodified snapshot of LLVM. I've added language to the PEP to clarify some of these points: - No in-tree copies of LLVM/Clang: http://codereview.appspot.com/186247/diff2/5004:5006/5007 - Shared linking of LLVM: http://codereview.appspot.com/186247/diff2/5006:6007/5008 I've filed http://code.google.com/p/unladen-swallow/issues/detail?id=130 so that we have our own issue to track this, in addition to the upstream LLVM bug (http://llvm.org/PR3201). Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi William, On Wed, Jan 27, 2010 at 11:02 AM, William Dode w...@flibuste.net wrote: The startup time and memory consumption are a limitation of llvm that their developers plan to resolve or is it only specific to the current python integration ? I mean the work to correct this is more on U-S or on llvm ? Part of it is LLVM, part of it is Unladen Swallow. LLVM is very flexible, and there's a price for that. We have also found and fixed several cases of quadratic memory usage in LLVM optimization passes, and there may be more of those lurking around. On the Unladen Swallow side, there are doubtless things we can do to improve our usage of LLVM; http://code.google.com/p/unladen-swallow/issues/detail?id=68 has most of our work on this, and there are still more ideas to implement. Part of the issue is that Unladen Swallow is using LLVM's JIT infrastructure in ways that it really hasn't been used before, and so there's a fair amount of low-hanging fruit left in LLVM that no-one has needed to pick yet. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Cesare, On Tue, Jan 26, 2010 at 12:29 AM, Cesare Di Mauro cesare.di.ma...@gmail.com wrote: Hi Collin, One more question: is it easy to support more opcodes, or a different opcode structure, in Unladen Swallow project? I assume you're asking about integrating WPython. Yes, adding new opcodes to Unladen Swallow is still pretty easy. The PEP includes a section on this, http://www.python.org/dev/peps/pep-3146/#experimenting-with-changes-to-python-or-cpython-bytecode, though it doesn't cover something more complex like converting from bytecode to wordcode, as a purely hypothetical example ;) Let me know if that section is unclear or needs more data. Converting from bytecode to wordcode should be relatively straightforward, assuming that the arrangement of opcode arguments is the main change. I believe the only real place you would need to update is the JIT compiler's bytecode iterator (see http://code.google.com/p/unladen-swallow/source/browse/trunk/Util/PyBytecodeIterator.cc). Depending on the nature of the changes, the runtime feedback system might need to be updated, too, but it wouldn't be too difficult, and the changes should be localized. Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
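The "purely hypothetical" bytecode-to-wordcode conversion later became real: CPython 3.6 moved to fixed-width two-byte wordcode (one opcode byte, one argument byte, with EXTENDED_ARG prefixes for larger arguments). A minimal iterator over that format, analogous in spirit to the PyBytecodeIterator mentioned above (the function name and yielded tuple shape are my own illustration, not Unladen Swallow's API):

```python
import opcode

def iter_wordcode(code_obj):
    """Yield (offset, opname, arg) for CPython >= 3.6 wordcode, where
    every instruction is exactly two bytes. EXTENDED_ARG prefixes are
    folded into the following instruction's argument."""
    co = code_obj.co_code
    ext = 0
    for offset in range(0, len(co), 2):
        op, arg = co[offset], co[offset + 1]
        if opcode.opname[op] == "EXTENDED_ARG":
            ext = (ext | arg) << 8
            continue
        yield offset, opcode.opname[op], ext | arg
        ext = 0

def f(x):
    return x + 1

names = [name for _, name, _ in iter_wordcode(f.__code__)]
print(names)
```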
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Martin, On Thu, Jan 21, 2010 at 2:25 PM, Martin v. Löwis mar...@v.loewis.de wrote: Reid Kleckner wrote: On Thu, Jan 21, 2010 at 4:34 PM, Martin v. Löwis mar...@v.loewis.de wrote: How large is the LLVM shared library? One surprising data point is that the binary is much larger than some of the memory footprint measurements given in the PEP. Could it be that you need to strip the binary, or otherwise remove unneeded debug information? Python is always built with debug information (-g), at least it was in 2.6.1 which unladen is based off of, and we've made sure to build LLVM the same way. We had to muck with the LLVM build system to get it to include debugging information. On my system, stripping the python binary takes it from 82 MB to 9.7 MB. So yes, it contains extra debug info, which explains the footprint measurements. The question is whether we want LLVM built with debug info or not. Ok, so if 70MB are debug information, I think a lot of the concerns are removed: - debug information doesn't consume any main memory, as it doesn't get mapped when the process is started. - debug information also doesn't take up space in the system distributions, as they distribute stripped binaries. As 10MB is still 10 times as large as a current Python binary, people will probably search for ways to reduce that further, or at least split it up into pieces. 70MB of the increase was indeed debug information. Since the Linux distros that I checked ship stripped Python binaries, I've stripped the Unladen Swallow binaries as well, and while the size increase is still significant, it's not as large as it once was.

Stripped CPython 2.6.4: 1.3 MB
Stripped CPython 3.1.1: 1.4 MB
Stripped Unladen r1041: 12 MB

A 9x increase is better than a 20x increase, but it's not great, either.
There is still room to trim the set of LLVM libraries used by Unladen Swallow, and we're continuing to investigate reducing on-disk binary size (http://code.google.com/p/unladen-swallow/issues/detail?id=118 tracks this). I've updated the PEP to reflect this configuration, since it's what most users will pick up via their system package managers. The exact change to the PEP wording is http://codereview.appspot.com/186247/diff2/6001:6003/5002. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
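The "9x" figure follows directly from the stripped sizes quoted above; a quick check of the arithmetic:

```python
# Stripped binary sizes reported in the thread, in MB.
sizes_mb = {
    "CPython 2.6.4": 1.3,
    "CPython 3.1.1": 1.4,
    "Unladen r1041": 12.0,
}

# Size increase of Unladen Swallow relative to CPython 2.6.4.
increase = sizes_mb["Unladen r1041"] / sizes_mb["CPython 2.6.4"]
print("%.1fx" % increase)  # → 9.2x, i.e. roughly the 9x cited
```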
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Floris, On Sun, Jan 24, 2010 at 3:40 AM, Floris Bruynooghe floris.bruynoo...@gmail.com wrote: On Sat, Jan 23, 2010 at 10:09:14PM +0100, Cesare Di Mauro wrote: Introducing C++ is a big step, also. Aside the problems it can bring on some platforms, it means that C++ can now be used by CPython developers. It doesn't make sense to force people to use C for everything but the JIT part. In the end, CPython could become a mix of C and C++ code, so a bit more difficult to understand and manage. Introducing C++ is a big step, but I disagree that it means C++ should be allowed in the other CPython code. C++ can be problematic on more obscure platforms (certainly when static initialisers are used) and being able to build a python without C++ (no JIT/LLVM) would be a huge benefit, effectively having the option to build an old-style CPython at compile time. (This is why I asked about --without-llvm being able not to link with libstdc++). I'm working on a patch to completely remove all traces of C++ when configured with --without-llvm. It's a straightforward change, and should present no difficulties. For reference, what are these obscure platforms where static initializers cause problems? Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Cesare, On Sat, Jan 23, 2010 at 1:09 PM, Cesare Di Mauro cesare.di.ma...@gmail.com wrote: Hi Collin, IMO it'll be better to make the Unladen Swallow project a module, to be installed and used if needed, so leaving to users the choice of having it or not. The same way psyco does, indeed. Nowadays it requires too much memory, longer loading time, and fat binaries for not-so-great performances. I know that some issues have been worked on, but I don't think that they'll show something comparable to the current CPython status. You're proposing that, even once the issues of memory usage and startup time are addressed, Unladen Swallow should still be an extension module? I don't see why. You're assuming that these issues cannot be fixed, which I disagree with. I think maintaining something like a JIT compiler out-of-line, as Psyco is, causes long-term maintainability problems. Such extension modules are forever playing catchup with the CPython code, depending on implementation details that the CPython developers are right to regard as open to change. It also limits what kind of optimizations you can implement or forces those optimizations to be implemented with workarounds that might be suboptimal or fragile. I'd recommend reading the Psyco codebase, if you haven't yet. As others have requested, we are working hard to minimize the impact of the JIT so that it can be turned off entirely at runtime. We have an active issue tracking our progress at http://code.google.com/p/unladen-swallow/issues/detail?id=123. Introducing C++ is a big step, also. Aside from the problems it can bring on some platforms, it means that C++ can now be used by CPython developers. Which platforms, specifically? What is it about C++ on those platforms that is problematic? Can you please provide details? It doesn't make sense to force people to use C for everything but the JIT part. In the end, CPython could become a mix of C and C++ code, so a bit more difficult to understand and manage. 
Whether CPython should allow wider usage of C++ or whether developers should be force[d] to use C is not our decision, and is not part of this PEP. With the exception of Python/eval.c, we deliberately have not converted any CPython code to C++ so that if you're not working on the JIT, python-dev's workflow remains the same. Even within eval.cc, the only C++ parts are related to the JIT, and so disappear completely when configured with --without-llvm (or if you're not working on the JIT). In any case, developers can easily tell which language to use based on file extension. The compiler errors that would result from compiling C++ with a C compiler would be a good indication as well. What I see is that LLVM is too big a project for the goal of having just a JIT-ed Python VM. It can be surely easier to use and integrate into CPython, but requires too many resources Which resources do you feel that LLVM would tax, machine resources or developer resources? Are you referring to the portions of LLVM used by Unladen Swallow, or the entire wider LLVM project, including the pieces Unladen Swallow doesn't use at runtime? (on the contrary, Psyco demands little resources, gives very good performances, but seems to be like a mess to manage and extend). This is not my experience. For the workloads I have experience with, Psyco doubles memory usage while only providing a 15-30% speed improvement. Psyco's benefits are not uniform. Unladen Swallow has been designed to be much more maintainable and easier to extend and modify than Psyco: the compiler and its attendant optimizations are well-tested (see Lib/test/test_llvm.py, for one) and well-documented (see Python/llvm_notes.txt for one). I think that the project is bearing out the success of our design: Google's full-time engineers are a small minority on the project at this point, and almost all performance-improving patches are coming from non-Google developers. 
Thanks, Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Floris, On Mon, Jan 25, 2010 at 1:25 PM, Floris Bruynooghe floris.bruynoo...@gmail.com wrote: On Mon, Jan 25, 2010 at 10:14:35AM -0800, Collin Winter wrote: I'm working on a patch to completely remove all traces of C++ when configured with --without-llvm. It's a straightforward change, and should present no difficulties. Great to hear that, thanks for caring. This has now been resolved. As of http://code.google.com/p/unladen-swallow/source/detail?r=1036, ./configure --without-llvm has no dependency on libstdc++:

Before:
$ otool -L ./python.exe
./python.exe:
        /usr/lib/libSystem.B.dylib
        /usr/lib/libstdc++.6.dylib
        /usr/lib/libgcc_s.1.dylib

After:
$ otool -L ./python.exe
./python.exe:
        /usr/lib/libSystem.B.dylib
        /usr/lib/libgcc_s.1.dylib

I've explicitly noted this in the PEP (see http://codereview.appspot.com/186247/diff2/2001:4001/5001). Thanks, Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Tony, On Fri, Jan 22, 2010 at 10:11 AM, Tony Nelson tonynel...@georgeanelson.com wrote: On 10-01-22 02:53:21, Collin Winter wrote: On Thu, Jan 21, 2010 at 11:37 PM, Glyph Lefkowitz gl...@twistedmatrix.com wrote: On Jan 21, 2010, at 6:48 PM, Collin Winter wrote: ... There's been a recent thread on our mailing list about a patch that dramatically reduces the memory footprint of multiprocess concurrency by separating reference counts from objects. ... Currently, CPython gets a performance advantage from having reference counts hot in the cache when the referenced object is used. There is still the write pressure from updating the counts. With separate reference counts, an extra cache line must be loaded from memory (it is unlikely to be in the cache unless the program is trivial). I see from the referenced posting that this is a 10% speed hit (the poster attributes the hit to extra instructions). Perhaps the speed and memory hits could be minimized by only doing this for some objects? Only objects that are fully shared (such as read-only data) benefit from this change. I don't know but shared objects may already be treated separately. One option that we discussed was to create a ./configure flag to toggle between inline refcounts and separate refcounts. Advanced users that care about the memory usage of multiprocess concurrency could compile their own CPython binary to enable this space optimization at the cost of some performance. On the other hand, once we get enough performance out of the JIT that python-dev is willing to take a 10% hit, then I'd say we should just turn the space optimization on by default. In the meantime, though, a configure flag would be a useful intermediate point for a number of people. Collin Winter
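The separated-refcount idea discussed above can be modeled in a few lines of Python. This is a toy illustration only: the class and method names are invented, and the real patch operates on C structs inside CPython. The point it demonstrates is that when the mutable counts live in a side table, incrementing and decrementing them never writes to the (potentially shared, copy-on-write) memory holding the objects themselves.

```python
# Toy model of out-of-line reference counting: object bodies live in one
# region (shared read-only across fork()ed workers), while the mutable
# counts live in a separate side table, so incref/decref never dirties
# an object's memory page.
class RefcountTable:
    def __init__(self):
        self._counts = {}  # id(obj) -> count, stored apart from the objects

    def incref(self, obj):
        self._counts[id(obj)] = self._counts.get(id(obj), 0) + 1

    def decref(self, obj):
        key = id(obj)
        self._counts[key] -= 1
        if self._counts[key] == 0:
            del self._counts[key]  # the object would be deallocated here

    def count(self, obj):
        return self._counts.get(id(obj), 0)

table = RefcountTable()
shared = "some large read-only data"
table.incref(shared)
table.incref(shared)
table.decref(shared)  # only the side table is written, never `shared`
```

In the real patch, the side table is the only per-process copy each forked worker makes; the much larger object pages stay shared with the parent, which is where the memory savings come from.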
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Jake, On Thu, Jan 21, 2010 at 10:48 AM, Jake McGuire mcgu...@google.com wrote: On Thu, Jan 21, 2010 at 10:19 AM, Reid Kleckner r...@mit.edu wrote: On Thu, Jan 21, 2010 at 12:27 PM, Jake McGuire mcgu...@google.com wrote: On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote: Profiling - Unladen Swallow integrates with oProfile 0.9.4 and newer [#oprofile]_ to support assembly-level profiling on Linux systems. This means that oProfile will correctly symbolize JIT-compiled functions in its reports. Do the current python profiling tools (profile/cProfile/pstats) still work with Unladen Swallow? Sort of. They disable the use of JITed code, so they don't quite work the way you would want them to. Checking tstate->c_tracefunc every line generated too much code. They still give you a rough idea of where your application hotspots are, though, which I think is acceptable. Hmm. So cProfile doesn't break, but it causes code to run under a completely different execution model so the numbers it produces are not connected to reality? We've found the call graph and associated execution time information from cProfile to be extremely useful for understanding performance issues and tracking down regressions. Giving that up would be a huge blow. FWIW, cProfile's call graph information is still perfectly accurate, but you're right: turning on cProfile does trigger execution under a different codepath. That's regrettable, but instrumentation-based profiling is always going to introduce skew into your numbers. That's why we opted to improve oProfile, since we believe sampling-based profiling to be a better model. Profiling was problematic to support in machine code because in Python, you can turn profiling on from user code at arbitrary points. To correctly support that, we would need to add lots of hooks to the generated code to check whether profiling is enabled, and if so, call out to the profiler. Those "is profiling enabled now?" 
checks are (almost) always going to be false, which means we spend cycles for no real benefit. Can YouTube use oProfile for profiling, or is instrumented profiling critical? oProfile does have its downsides for profiling user code: you see all the C-language support functions, not just the pure-Python functions. That extra data might be useful, but it's probably more information than most people want. YouTube might want it, though. Assuming YouTube can't use oProfile as-is, there are some options: - Write a script around oProfile's reporting tool to strip out all C functions from the report. Enhance oProfile to fix any deficiencies compared to cProfile's reporting. - Develop a sampling profiler for Python that only samples pure-Python functions, ignoring C code (but including JIT-compiled Python code). - Add the necessary profiling hooks to JITted code to better support cProfile, but add a command-line flag (something explicit like -O3) that removes the hooks and activates the current behaviour (or something even more restrictive, possibly). - Initially compile Python code without the hooks, but have a trip-wire set to detect the installation of profiling hooks. When profiling hooks are installed, purge all machine code from the system and recompile all hot functions to include the profiling hooks. Thoughts? Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
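The cProfile call-graph information the thread keeps coming back to is produced by the standard library's instrumentation profiler. A minimal, self-contained use of that stdlib API (unrelated to Unladen Swallow's internals) looks like this:

```python
import cProfile
import io
import pstats

def fib(n):
    """A deliberately call-heavy function to give the profiler something to see."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

profiler = cProfile.Profile()
profiler.enable()
result = fib(15)
profiler.disable()

# pstats exposes the call-graph data being discussed: which functions were
# called, how many times, and with what cumulative time.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf).sort_stats("cumulative")
stats.print_stats()
report = buf.getvalue()
```

Because every single function call and return runs through the profiler's hooks, the numbers carry exactly the instrumentation skew Collin describes; a sampling profiler like oProfile avoids that per-call cost at the price of statistical rather than exact counts.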
Re: [Python-Dev] PyCon Keynote
On Thu, Jan 21, 2010 at 8:37 AM, Kortatu glorybo...@gmail.com wrote: Hi! For me, could be very interesting something about Unladen Swallow, and your opinion about JIT compilers. FWIW, there will be a separate talk about Unladen Swallow at PyCon. I for one would like to hear Guido talk about something else :) Collin 2010/1/21 Michael Foord fuzzy...@voidspace.org.uk On 21/01/2010 15:03, Thomas Wouters wrote: On Wed, Jan 13, 2010 at 19:51, Guido van Rossum gu...@python.org wrote: Please mail me topics you'd like to hear me talk about in my keynote at PyCon this year. How about something completely different... ? Your history of Python stuff has been really interesting. I'd like to hear you lay to rest that nonsense about you retiring : Well, ditto. :-) All the best, Michael -- Thomas Wouters tho...@python.org Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Dirkjan, On Wed, Jan 20, 2010 at 10:55 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote: On Thu, Jan 21, 2010 at 02:56, Collin Winter collinwin...@google.com wrote: Agreed. We are actively working to improve the startup time penalty. We're interested in getting guidance from the CPython community as to what kind of a startup slow down would be sufficient in exchange for greater runtime performance. For some apps (like Mercurial, which I happen to sometimes hack on), increased startup time really sucks. We already have our demandimport code (I believe bzr has something similar) to try and delay imports, to prevent us spending time on imports we don't need. Maybe it would be possible to do something like that in u-s? It could possibly also keep track of the thorny issues, like imports where there's an except ImportError that can do fallbacks. I added startup benchmarks for Mercurial and Bazaar yesterday (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we can use them as more macro-ish benchmarks, rather than merely starting the CPython binary over and over again. If you have ideas for better Mercurial/Bazaar startup scenarios, I'd love to hear them. The new hg_startup and bzr_startup benchmarks should give us some more data points for measuring improvements in startup time. One idea we had for improving startup time for apps like Mercurial was to allow the creation of hermetic Python binaries, with all necessary modules preloaded. This would be something like Smalltalk images. We haven't yet really fleshed out this idea, though. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
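Mercurial's demandimport, mentioned above, defers the actual loading of a module until its first attribute access so that startup doesn't pay for imports the command never uses. The standard library later (Python 3.5) gained a similar mechanism, importlib.util.LazyLoader; the documented recipe for it sketches the idea:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module object whose loading is deferred until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # registers the deferred loader; no real work yet
    return module

json = lazy_import("json")      # nothing substantial has been imported yet
encoded = json.dumps({"a": 1})  # first attribute access triggers the real load
```

The thorny case Dirkjan mentions, `except ImportError` fallbacks, is exactly what makes this tricky in general: with lazy loading, the ImportError surfaces at first attribute access rather than at the import statement, so fallback logic written around the import statement no longer fires.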
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Dirkjan, On Thu, Jan 21, 2010 at 11:16 AM, Dirkjan Ochtman dirk...@ochtman.nl wrote: On Thu, Jan 21, 2010 at 18:32, Collin Winter collinwin...@google.com wrote: I added startup benchmarks for Mercurial and Bazaar yesterday (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we can use them as more macro-ish benchmarks, rather than merely starting the CPython binary over and over again. If you have ideas for better Mercurial/Bazaar startup scenarios, I'd love to hear them. The new hg_startup and bzr_startup benchmarks should give us some more data points for measuring improvements in startup time. Sounds good! I seem to remember from a while ago that you included the Mercurial test suite in your performance tests, but maybe those were the correctness tests rather than the performance tests (or maybe I'm just mistaken). I didn't see any mention of that in the proto-PEP, in any case. We used to run the Mercurial correctness tests at every revision, but they were incredibly slow and a bit flaky under CPython 2.6. Bazaar's tests were faster, but were flakier, so we ended up disabling them, too. We only run these tests occasionally. One idea we had for improving startup time for apps like Mercurial was to allow the creation of hermetic Python binaries, with all necessary modules preloaded. This would be something like Smalltalk images. We haven't yet really fleshed out this idea, though. Yeah, that might be interesting. I think V8 can do something similar, right? Correct; V8 loads a pre-compiled image of its builtins to reduce startup time. What I personally would consider interesting for the PEP is a (not too big) section evaluating where other Python-performance efforts are at. E.g. does it make sense to propose a u-s merge now when, by the time 3.3 (or whatever) is released, there'll be a very good PyPy that sports memory usage competitive for embedded development (already does right now, I think) and a good tracing JIT? 
Or when we can compile Python using Cython, or Shedskin -- probably not as likely; but I think it might be worth assessing the landscape a bit before this huge change is implemented. I can definitely work on that. http://codespeak.net:8099/plotsummary.html should give you a quick starting point for PyPy's performance. My reading of those graphs is that it does very well on heavily-numerical workloads, but is much slower than CPython on more diverse workloads. When I initially benchmarked PyPy vs CPython last year, PyPy was 3-5x slower on non-numerical workloads, and 60x slower on one benchmark (./perf.py -b pickle,unpickle, IIRC). My quick take on Cython and Shedskin is that they are useful-but-limited workarounds for CPython's historically-poor performance. Shedskin, for example, does not support the entire Python language or standard library (http://shedskin.googlecode.com/files/shedskin-tutorial-0.3.html). Cython is a super-set of Python, and files annotated for maximum Cython performance are no longer valid Python code, and will not run on any other Python implementation. The advantage of using an integrated JIT compiler is that we can support Python-as-specified, without workarounds or changes in workflow. The compiler can observe which parts of user code are static (or static-ish) and take advantage of that, without the manual annotations needed by Cython. Cython is good for writing extension modules without worrying about the details of reference counting, etc, but I don't see it as an either-or alternative for a JIT compiler. P.S. Is there any chance of LLVM doing something like tracing JITs? Those seem somewhat more promising to me (even though I understand they're quite hard in the face of Python features like stack frames). Yes, you could implement a tracing JIT with LLVM. We chose a function-at-a-time JIT because it would a) be an easy-to-implement baseline to measure future improvement, and b) create much of the infrastructure for a future tracing JIT. 
Implementing a tracing JIT that crosses the C/Python boundary would be interesting. Collin Winter
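The contrast drawn above between a function-at-a-time JIT and a tracing JIT can be caricatured in a few lines: a tracer runs the hot loop once interpretively, records the linear sequence of operations actually executed, and then replays that straight-line trace without re-dispatching. This is purely illustrative; neither Unladen Swallow's LLVM-based compiler nor any real tracing JIT works at this level.

```python
# Caricature of trace recording: interpret a loop body once while recording
# each primitive operation, then replay the recorded linear trace directly.
def record_trace(ops, env):
    trace = []
    for name, fn in ops:
        fn(env)            # interpret the operation...
        trace.append(fn)   # ...and record it for later replay
    return trace

def replay(trace, env, iterations):
    for _ in range(iterations):
        for fn in trace:   # straight-line replay: no dispatch on op names
            fn(env)

env = {"x": 0}
ops = [
    ("inc", lambda e: e.__setitem__("x", e["x"] + 1)),
    ("dbl", lambda e: e.__setitem__("x", e["x"] * 2)),
]
trace = record_trace(ops, env)  # recording pass: x = (0 + 1) * 2 = 2
replay(trace, env, 2)           # replays: x = (2 + 1) * 2 = 6, then (6 + 1) * 2 = 14
```

A real tracer additionally plants guards at every point where the recorded path made an assumption (types, branch directions), falling back to the interpreter when a guard fails, which is where much of the engineering difficulty lies.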
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Greg, On Wed, Jan 20, 2010 at 10:54 PM, Gregory P. Smith g...@krypto.org wrote: +1 My biggest concern is memory usage but it sounds like addressing that is already in your mind. I don't so much mind an additional up front constant and per-line-of-code hit for instrumentation but leaks are unacceptable. Any instrumentation data or jit caches should be managed (and tunable at run time when possible and it makes sense). Reducing memory usage is a high priority. One thing being worked on right now is to avoid collecting runtime data for functions that will never be considered hot. That's one leak in the current implementation. I think having a run time flag (or environment variable for those who like that) to disable the use of JIT at python3 execution time would be a good idea. Yep, we already have a -j flag that supports don't ever use the JIT (-j never), use the JIT when you think you should (-j whenhot), and always use the JIT (-j always) options. I'll mention this in the PEP (we'll clearly need to make this an -X option before merger). Thanks, Collin Winter
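The three -j policies (never / whenhot / always) amount to a per-function compilation decision. A toy model of that decision logic, with an invented threshold and invented names, might look like this:

```python
def make_jit_policy(mode, threshold=2):
    """Return a callable deciding whether a given function should be JIT-compiled."""
    counts = {}

    def should_compile(func_name):
        if mode == "never":
            return False        # -j never: stay in the interpreter forever
        if mode == "always":
            return True         # -j always: compile on first call
        # -j whenhot: compile only once a function has run `threshold` times
        counts[func_name] = counts.get(func_name, 0) + 1
        return counts[func_name] > threshold

    return should_compile

hot = make_jit_policy("whenhot")
decisions = [hot("f") for _ in range(4)]  # cold, cold, hot, hot
```

The real hotness heuristic in Unladen Swallow is richer than a bare call counter, but the shape is the same: defer compilation cost (and the memory Greg is worried about) until a function has proven it will repay it.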
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Barry, On Thu, Jan 21, 2010 at 3:34 AM, Barry Warsaw ba...@python.org wrote: On Jan 20, 2010, at 11:05 PM, Jack Diederich wrote: Does disabling the LLVM change binary compatibility between modules targeted at the same version? At tonight's Boston PIG we had some binary package maintainers but most people (including myself) only cared about source compatibility. I assume linux distros care about binary compatibility _a lot_. A few questions come to mind: 1. What are the implications for PEP 384 (Stable ABI) if U-S is added? PEP 384 looks to be incomplete at this writing, but reading the section Structures, it says Only the following structures and structure fields are accessible to applications: - PyObject (ob_refcnt, ob_type) - PyVarObject (ob_base, ob_size) - Py_buffer (buf, obj, len, itemsize, readonly, ndim, shape, strides, suboffsets, smalltable, internal) - PyMethodDef (ml_name, ml_meth, ml_flags, ml_doc) - PyMemberDef (name, type, offset, flags, doc) - PyGetSetDef (name, get, set, doc, closure) Of these, the only one we have changed is PyMethodDef, and then to add two fields to the end of the structure. We have changed other types (dicts and code come to mind), but I believe we have only appended fields and not deleted or reordered existing fields. I don't believe that introducing the Unladen Swallow JIT will make maintaining a stable ABI per PEP 384 more difficult. We've been careful about not exporting any C++ symbols via PyAPI_FUNC(), so I don't believe that will be an issue either, but Jeffrey can comment more deeply on this issue. If PEP 384 is accepted, I'd like it to include a testing strategy so that we can be sure that we haven't accidentally broken ABI compatibility. That testing should ideally be automated. 2. What effect does requiring C++ have on the embedded applications across the set of platforms that Python is currently compatible on? 
In a previous life I had to integrate a C++ library with Python as an embedded language and had lots of problems on some OSes (IIRC Solaris and Windows) getting all the necessary components to link properly. To be clear, you're talking about embedding Python in a C/C++ application/library? We have successfully integrated Unladen Swallow into a large C++ application that uses Python as an embedded scripting language. There were no special issues or restrictions that I had to overcome to do this. If you have any applications/libraries in particular that you'd like me to test, I'd be happy to do that. 3. Will the U-S bits come with a roadmap to the code? It seems like this is dropping a big black box of code on the Python developers, and I would want to reduce the learning curve as much as possible. Yes; there is http://code.google.com/p/unladen-swallow/source/browse/trunk/Python/llvm_notes.txt, which goes into developer-level detail about various optimizations and subsystems. We have other documentation in the Unladen Swallow wiki that is being merged into llvm_notes.txt. Simply dropping this code onto python-dev without a guide to it would be unacceptable. llvm_notes.txt also details available instrumentation, useful to CPython developers who are investigating performance changes. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
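The ABI point above, that only appending fields to structures like PyMethodDef preserves binary compatibility, can be checked mechanically: the offsets of the original fields are unchanged when new fields go on the end, so code compiled against the old layout still reads the right bytes. A sketch with ctypes, using hypothetical V1/V2 structs rather than the real CPython definitions:

```python
from ctypes import Structure, c_char_p, c_int, c_void_p, sizeof

class MethodDefV1(Structure):
    _fields_ = [
        ("ml_name", c_char_p),
        ("ml_meth", c_void_p),
        ("ml_flags", c_int),
        ("ml_doc", c_char_p),
    ]

class MethodDefV2(Structure):
    # Same leading fields, with two new ones appended at the end.
    _fields_ = MethodDefV1._fields_ + [
        ("extra_a", c_void_p),
        ("extra_b", c_void_p),
    ]

# Extension modules compiled against V1 access fields at fixed offsets;
# those offsets are identical in V2, so appending does not break them.
offsets_match = all(
    getattr(MethodDefV1, name).offset == getattr(MethodDefV2, name).offset
    for name, _ in MethodDefV1._fields_
)
grew_at_end = sizeof(MethodDefV2) > sizeof(MethodDefV1)
```

The caveat, as the PEP 384 discussion implies, is that growing a struct is only safe when the *consumer* never allocates it by value or indexes arrays of it; reordering or deleting fields breaks the offsets outright.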
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote: [snip] Incremental builds, however, are significantly slower. The table below shows incremental rebuild times after touching ``Objects/listobject.c``.

+-----------+---------------+---------------+----------------------+
| Incr make | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
+===========+===============+===============+======================+
| Run 1     | 0m1.854s      | 0m1.456s      | 0m24.464s            |
+-----------+---------------+---------------+----------------------+
| Run 2     | 0m1.437s      | 0m1.442s      | 0m24.416s            |
+-----------+---------------+---------------+----------------------+
| Run 3     | 0m1.440s      | 0m1.425s      | 0m24.352s            |
+-----------+---------------+---------------+----------------------+

http://code.google.com/p/unladen-swallow/source/detail?r=1015 has significantly improved this situation. The new table of incremental build times:

+-----------+---------------+---------------+-----------------------+
| Incr make | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r1024 |
+===========+===============+===============+=======================+
| Run 1     | 0m1.854s      | 0m1.456s      | 0m6.680s              |
+-----------+---------------+---------------+-----------------------+
| Run 2     | 0m1.437s      | 0m1.442s      | 0m5.310s              |
+-----------+---------------+---------------+-----------------------+
| Run 3     | 0m1.440s      | 0m1.425s      | 0m7.639s              |
+-----------+---------------+---------------+-----------------------+

The remaining increase is from statically linking LLVM into libpython. PEP updated: http://codereview.appspot.com/186247/diff2/1:4/5 Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hi Paul, On Thu, Jan 21, 2010 at 12:56 PM, Paul Moore p.f.mo...@gmail.com wrote: I'm concerned about the memory and startup time penalties. It's nice that you're working on them - I'd like to see them remain a priority. Ultimately a *lot* of people use Python for short-running transient commands (not just adhoc scripts, think hg log) and startup time and memory penalties can really hurt there. I think final merger from the proposed py3k-jit branch into py3k should block on reducing the startup and memory usage penalties. Improving startup time is high on my list of priorities, and I think there's a fair amount of low-hanging fruit there. I've just updated http://code.google.com/p/unladen-swallow/issues/detail?id=64 with some ideas that have recently come up to improve startup time, as well as results from the recently-added hg_startup and bzr_startup benchmarks. I'll also update the PEP with these benchmark results, since they're important to a lot of people. Windows compatibility is a big deal to me. And IMHO, it's a great strength of Python at the moment that it has solid Windows support. I would be strongly *against* this PEP if it was going to be Unix or Linux only. As it is, I have concerns that Windows could suffer from the common "none of the developers use Windows, but we do our best" problem. I'm hoping that having U-S integrated into the core will mean that there will be more Windows developers able to contribute and alleviate that problem. One of our contributors, James Abbatiello (cc'd), has done a bang-up job of making Unladen Swallow work on Windows. My understanding from his last update is that Unladen Swallow works well on Windows, but he can comment further as to the precise state of Windows support and any remaining challenges faced on that platform, if any. One question - once Unladen Swallow is integrated, will Google's support (in terms of dedicated developer time) remain? 
If not, I'd rather see more of the potential gains realised before integration, as otherwise it could be a long time before it happens. Ideally, I'd like to see a commitment from Google - otherwise the cynic in me is inclined to say no until the suggested speed benefits have materialised and only then accept U-S for integration. Less cynically, it's clear that there's quite a way to go before the key advertised benefits of U-S are achieved, and I don't want the project to lose steam before it gets there. While this decision is not mine, I don't believe our director would be open to an open-ended commitment of full-time Google engineering resources, though we still have another few quarters of engineering time allocated to the Google team (myself, Jeffrey Yasskin). At this point, the clear majority of Unladen Swallow's performance patches are coming from the non-Google developers on the project (Jeffrey and I are mostly working on infrastructure), and I believe that pattern will continue once the py3k-jit branch is established. Thanks, Collin Winter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
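The startup cost Paul describes for short-running commands like `hg log` can be measured crudely by timing child interpreters running an empty program. This is a methodology sketch only; the project's real harness is perf.py's hg_startup and bzr_startup benchmarks:

```python
import subprocess
import sys
import time

def measure_startup(runs=5):
    """Average wall-clock time to launch an interpreter that runs `pass` and exits."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Spawning a fresh process captures the full cost a transient
        # command pays: exec, interpreter init, and site/module imports.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

avg = measure_startup()
```

Replacing `"-c", "pass"` with an import-heavy script shows why lazy imports matter: for transient commands, module import time often dominates the interpreter's own startup.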
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 2:24 PM, Collin Winter collinwin...@google.com wrote: I'll also update the PEP with these benchmark results, since they're important to a lot of people. Done; see http://codereview.appspot.com/186247/diff2/4:8/9 for the wording change and new startup data. Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Antoine, On Thu, Jan 21, 2010 at 4:25 AM, Antoine Pitrou solip...@pitrou.net wrote: The increased memory usage comes from a) LLVM code generation, analysis and optimization libraries; b) native code; c) memory usage issues or leaks in LLVM; d) data structures needed to optimize and generate machine code; e) as-yet uncategorized other sources. Does the increase in memory occupation disappear when the JIT is disabled from the command-line? It does not disappear, but it is significantly reduced. Running our django benchmark against three different configurations gives me these max memory usage numbers:

CPython 2.6.4: 8508 kb
Unladen Swallow default: 26768 kb
Unladen Swallow -j never: 15144 kb

-j never is Unladen Swallow's flag to disable JIT compilation. As it stands right now, -j never gives a 1.76x reduction in memory usage, but is still 1.77x larger than CPython. It occurs to me that we're still doing a lot of LLVM-side initialization and setup that we don't need to do under -j never. We're also collecting runtime feedback in the eval loop, which is yet more memory usage. Optimizing this mode has not yet been a priority for us, but it seems to be the emerging consensus of python-dev that we need to give -j never some more love. There's a lot of low-hanging fruit there. I've added this information to http://code.google.com/p/unladen-swallow/issues/detail?id=123, which is our issue tracking -j never improvements. Do you think LLVM might suffer from a lot of memory leaks? I don't know that it suffers from a lot of memory leaks, though we have certainly observed and fixed quadratic memory usage in some of the optimization passes. We've fixed all the memory leaks that Google's internal heapchecker has found. Thanks, Collin Winter
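The two ratios quoted above follow directly from the three measurements, with the results truncated (not rounded) to two decimal places:

```python
import math

# Max memory figures from the django benchmark, as reported in the email.
cpython = 8508      # kb, CPython 2.6.4
us_default = 26768  # kb, Unladen Swallow with the JIT enabled
us_no_jit = 15144   # kb, Unladen Swallow -j never

def trunc2(x):
    """Truncate to two decimal places, matching how the email states ratios."""
    return math.floor(x * 100) / 100

jit_overhead_saved = trunc2(us_default / us_no_jit)  # reduction from -j never
remaining_overhead = trunc2(us_no_jit / cpython)     # -j never vs. plain CPython
```

This reproduces the "1.76x reduction" and "1.77x larger than CPython" figures in the email.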
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 10:14 AM, Reid Kleckner r...@mit.edu wrote: On Thu, Jan 21, 2010 at 9:35 AM, Floris Bruynooghe floris.bruynoo...@gmail.com wrote: I just compiled with the --without-llvm option and see that the binary, while only an acceptable 4.1M, still links with libstdc++. Is it possible to completely get rid of the C++ dependency if this option is used? Introducing a C++ dependency on all platforms for no additional benefit (with --without-llvm) seems like a bad tradeoff to me. There isn't (and shouldn't be) any real source-level dependency on libstdc++ when LLVM is turned off. However, the eval loop is now compiled as C++, and that may be adding some hidden dependency (exception handling code?). The final binary is linked with $(CXX), which adds an implicit -lstdc++, I think. Someone just has to go and track this down. We've opened http://code.google.com/p/unladen-swallow/issues/detail?id=124 to track this issue. It should be straightforward to fix. Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Glyph, On Thu, Jan 21, 2010 at 9:11 AM, Glyph Lefkowitz gl...@twistedmatrix.com wrote: It would be hard for me to put an exact number on what I would find acceptable, but I was really hoping that we could get a *reduced* memory footprint in the long term. My real concern here is not absolute memory usage, but usage for each additional Python process on a system; even if Python supported fast, GIL-free multithreading, I'd still prefer the additional isolation of multiprocess concurrency. As it currently stands, starting cores+1 Python processes can start to really hurt, especially in many-core-low-RAM environments like the Playstation 3. So, if memory usage went up by 20%, but per-interpreter overhead were decreased by more than that, I'd personally be happy. There's been a recent thread on our mailing list about a patch that dramatically reduces the memory footprint of multiprocess concurrency by separating reference counts from objects. We're looking at possibly incorporating this work into Unladen Swallow, though I think it should really go into upstream CPython first (since it's largely orthogonal to the JIT work). You can see the thread here: http://groups.google.com/group/unladen-swallow/browse_thread/thread/21d7248e8279b328/2343816abd1bd669 Thanks, Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 12:20 PM, Collin Winter collinwin...@google.com wrote: Hey Greg, On Wed, Jan 20, 2010 at 10:54 PM, Gregory P. Smith g...@krypto.org wrote: I think having a run time flag (or environment variable for those who like that) to disable the use of JIT at python3 execution time would be a good idea. Yep, we already have a -j flag that supports don't ever use the JIT (-j never), use the JIT when you think you should (-j whenhot), and always use the JIT (-j always) options. I'll mention this in the PEP (we'll clearly need to make this an -X option before merger). FYI, I just committed http://code.google.com/p/unladen-swallow/source/detail?r=1027, which dramatically improves the performance of Unladen Swallow when running with `-j never`, making disabling the JIT at runtime more viable. We're continuing to make progress minimizing the impact of the JIT when running under `-j never`. Progress can be tracked at http://code.google.com/p/unladen-swallow/issues/detail?id=123. Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Thu, Jan 21, 2010 at 11:37 PM, Glyph Lefkowitz gl...@twistedmatrix.com wrote: On Jan 21, 2010, at 6:48 PM, Collin Winter wrote: Hey Glyph, There's been a recent thread on our mailing list about a patch that dramatically reduces the memory footprint of multiprocess concurrency by separating reference counts from objects. We're looking at possibly incorporating this work into Unladen Swallow, though I think it should really go into upstream CPython first (since it's largely orthogonal to the JIT work). You can see the thread here: http://groups.google.com/group/unladen-swallow/browse_thread/thread/21d7248e8279b328/2343816abd1bd669 AWESOME. Thanks for the pointer. I read through both of the threads but I didn't see any numbers on savings-per-multi-process. Do you have any? The data I've seen comes from http://groups.google.com/group/comp.lang.python/msg/c18b671f2c4fef9e: This test code[1] consumes roughly 2G of RAM on an x86_64 with python 2.6.1, with the patch, it *should* use 2.3G of RAM (as specified by its output), so you can see the footprint overhead... but better page sharing makes it consume about 6 times less - roughly 400M... which is the size of the dataset. Ie: near-optimal data sharing. Collin Winter
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
in exchange for greater runtime performance. I guess what I am mainly saying is that there are several possible ways to speed up Python 3 execution (including others not mentioned here) and it is not at all clear to me that this particular one is in any sense 'best of breed'. If it disables other approaches, I think it should be optional for the standard PSF distribution. We considered the three approaches you mentioned (Psyco, changing the language, using function annotations), but found them unworkable or inapplicable to Google's needs. Adding a just-in-time compiler to Python 2.6 while designing our changes for ease of portability to Python 3 made more sense for our environment, and we believe, is more applicable to the environments of other Python consumers and better-suited to the future roadmap of CPython. Thanks, Collin Winter
Re: [Python-Dev] PEP 3003 - Python Language Moratorium
On Thu, Nov 5, 2009 at 10:35 AM, Dino Viehland di...@microsoft.com wrote: Stefan wrote: It /does/ make some static assumptions in that it considers builtins true builtins. However, it does not prevent you from replacing them in your code, as long as you do it inside the module. Certainly a restriction compared to Python, where you can import a module into a changed dict environment that redefines 'object', but not a major restriction IMO, and certainly not one that impacts much code. To me this is a deal breaker which prevents Cython from being a Python implementation. From a talk given by Collin Winter at the LLVM dev meeting (http://llvm.org/devmtg/2009-10/) it seems like Unladen Swallow wanted to do something like this as well and Guido said no. In this case the breaking change is so subtle that I'd personally hate to run into something like this porting code to Cython and having to figure out why it's not working. To clarify, I was joking when I told that story (or at least I was joking with Guido when I asked him if we could break that). It clearly *would* be easier if we could just ignore this point of Python compatibility, but that's not an option, so we've had to optimize around it. It's not that hard to do, but it's still extra work. Collin Winter
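The compatibility point under discussion — that a module may rebind a builtin at any moment, so an optimizer needs guards rather than static assumptions — is easy to see in plain CPython:

```python
def measure(seq):
    return len(seq)          # name lookup: module globals first, then builtins

assert measure([1, 2, 3]) == 3   # resolves to the builtin len

len = lambda seq: 99             # rebind at module level, mid-program
assert measure([1, 2, 3]) == 99  # the rebinding must be honored immediately

del len                          # lookup falls back to the real builtin
assert measure([1, 2, 3]) == 3
```

An optimizer that had statically inlined the builtin `len` into `measure` would still return 3 after the rebinding; that is exactly the behavior Unladen Swallow guards against rather than breaks.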
Re: [Python-Dev] A wordcode-based Python
On Wed, Nov 4, 2009 at 4:20 AM, Mart Sõmermaa mrts.py...@gmail.com wrote: On Tue, May 12, 2009 at 8:54 AM, Cesare Di Mauro cesare.dima...@a-tono.com wrote: Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested. Thanks, Collin Winter I'm very interested, thanks. That's because I worked only on Windows machines, so I definitely need to test and fix it to let it run on any other platform. Cesare Re-animating an old discussion -- Cesare, any news on the wpython front? I did a checkout from http://wpython.googlecode.com/svn/trunk and was able to ./configure and make successfully on my 64-bit Linux box as well as to run the Unladen benchmarks. Given svn co http://svn.python.org/projects/python/tags/r261 in py261 and svn co http://wpython.googlecode.com/svn/trunk in wpy, $ python unladen-tests/perf.py -rm --benchmarks=-2to3,all py261/python wpy/python Do note that the --track_memory option to perf.py imposes some overhead that interferes with the performance figures. I'd recommend running the benchmarks again without --track_memory. That extra overhead is almost certainly what's causing some of the variability in the results. Collin Winter
Re: [Python-Dev] 2to3, 3to2: official status
Hi Ben, On Wed, Nov 4, 2009 at 6:49 PM, Ben Finney ben+pyt...@benfinney.id.au wrote: Martin v. Löwis mar...@v.loewis.de writes: Ben Finney wrote: Martin v. Löwis mar...@v.loewis.de writes: Well, 3to2 would then be an option for you: use Python 3 as the source language. I was under the impression that 2to3 was officially supported as part of Python, but 3to2 was a third-party tool. […] Is it an official part of Python? No, the status is exactly as you describe it. Okay. It's probably best for anyone with their Python developer hat on (which, in this forum, is all the time for any Python developer) to make the status of 3to2 clear when recommending it to people concerned about future plans. Are you implying that we shouldn't recommend 3to2 to people wanting to develop in Py3k and back-translate to 2.x? Thanks, Collin Winter
Re: [Python-Dev] Reworking the GIL
On Sun, Oct 25, 2009 at 1:22 PM, Antoine Pitrou solip...@pitrou.net wrote: Having other people test it would be fine. Even better if you have an actual multi-threaded py3k application. But ccbench results for other OSes would be nice too :-)

My results for a 2.4 GHz Intel Core 2 Duo MacBook Pro (OS X 10.5.8):

Control (py3k @ r75723)

--- Throughput ---

Pi calculation (Python)
threads=1: 633 iterations/s.
threads=2: 468 ( 74 %)
threads=3: 443 ( 70 %)
threads=4: 442 ( 69 %)

regular expression (C)
threads=1: 281 iterations/s.
threads=2: 282 ( 100 %)
threads=3: 282 ( 100 %)
threads=4: 282 ( 100 %)

bz2 compression (C)
threads=1: 379 iterations/s.
threads=2: 735 ( 193 %)
threads=3: 733 ( 193 %)
threads=4: 724 ( 190 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 1 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 3 ms. (std dev: 6 ms.)
CPU threads=4: 2 ms. (std dev: 3 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 975 ms. (std dev: 577 ms.)
CPU threads=2: 1035 ms. (std dev: 571 ms.)
CPU threads=3: 1098 ms. (std dev: 556 ms.)
CPU threads=4: 1195 ms. (std dev: 557 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 4 ms. (std dev: 5 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 4 ms.)

Experiment (newgil branch @ r75723)

--- Throughput ---

Pi calculation (Python)
threads=1: 651 iterations/s.
threads=2: 643 ( 98 %)
threads=3: 637 ( 97 %)
threads=4: 625 ( 95 %)

regular expression (C)
threads=1: 298 iterations/s.
threads=2: 296 ( 99 %)
threads=3: 288 ( 96 %)
threads=4: 287 ( 96 %)

bz2 compression (C)
threads=1: 378 iterations/s.
threads=2: 720 ( 190 %)
threads=3: 724 ( 191 %)
threads=4: 718 ( 189 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 1 ms.)
CPU threads=2: 0 ms. (std dev: 1 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 5 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 1 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 1 ms.)
CPU threads=3: 2 ms. (std dev: 2 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)

I also ran this through Unladen Swallow's threading microbenchmark, which is a straight copy of what David Beazley was experimenting with (simply iterating over 100 ints in pure Python) [1]. iterative_count is doing the loops one after the other, threaded_count is doing the loops in parallel using threads. The results below are benchmarking py3k as the control, newgil as the experiment. When it says x% faster, that is a measure of newgil's performance over py3k's.

With two threads:
iterative_count:
Min: 0.336573 - 0.387782: 13.21% slower  # I've run this configuration multiple times and gotten the same slowdown.
Avg: 0.338473 - 0.418559: 19.13% slower
Significant (t=-38.434785, a=0.95)
threaded_count:
Min: 0.529859 - 0.397134: 33.42% faster
Avg: 0.581786 - 0.429933: 35.32% faster
Significant (t=70.100445, a=0.95)

With four threads:
iterative_count:
Min: 0.766617 - 0.734354: 4.39% faster
Avg: 0.771954 - 0.751374: 2.74% faster
Significant (t=22.164103, a=0.95)
Stddev: 0.00262 - 0.00891: 70.53% larger
threaded_count:
Min: 1.175750 - 0.829181: 41.80% faster
Avg: 1.224157 - 0.867506: 41.11% faster
Significant (t=161.715477, a=0.95)
Stddev: 0.01900 - 0.01120: 69.65% smaller

With eight threads:
iterative_count:
Min: 1.527794 - 1.447421: 5.55% faster
Avg: 1.536911 - 1.479940: 3.85% faster
Significant (t=35.559595, a=0.95)
Stddev: 0.00394 - 0.01553: 74.61% larger
threaded_count:
Min: 2.424553 - 1.677180: 44.56% faster
Avg: 2.484922 - 1.723093: 44.21% faster
Significant (t=184.766131, a=0.95)
Stddev: 0.02874 - 0.02956: 2.78% larger

I'd be interested in multithreaded benchmarks with less-homogeneous workloads. Collin Winter

[1] - http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_threading.py
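The faster/slower percentages above can be reproduced from the raw timings; judging from the quoted numbers, perf.py reports the change relative to the experiment's time (an inference from the figures, not a claim about its exact code):

```python
def percent_change(control, experiment):
    # Positive => experiment is faster; negative => slower.
    return (control - experiment) / experiment * 100

# threaded_count, two threads: 0.529859 -> 0.397134, reported "33.42% faster"
assert round(percent_change(0.529859, 0.397134), 2) == 33.42

# iterative_count, two threads: 0.336573 -> 0.387782, reported "13.21% slower"
assert round(percent_change(0.336573, 0.387782), 2) == -13.21
```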
Re: [Python-Dev] Reworking the GIL
On Mon, Oct 26, 2009 at 2:43 PM, Antoine Pitrou solip...@pitrou.net wrote: Collin Winter collinw at gmail.com writes: [the Dave Beazley benchmark] The results below are benchmarking py3k as the control, newgil as the experiment. When it says x% faster, that is a measure of newgil's performance over py3k's. With two threads: iterative_count: Min: 0.336573 - 0.387782: 13.21% slower # I've run this configuration multiple times and gotten the same slowdown. Avg: 0.338473 - 0.418559: 19.13% slower Those numbers are not very in line with the other iterative_count results. Since iterative_count just runs the loop N times in a row, results should be proportional to the number N (number of threads). Besides, there's no reason for single-threaded performance to be degraded since the fast path of the eval loop actually got a bit streamlined (there is no volatile ticker to decrement). I agree those numbers are out of line with the others and make no sense. I've run it with two threads several times and the results are consistent on this machine. I'm digging into it a bit more. Collin
Re: [Python-Dev] Fast Implementation for ZIP decryption
On Sun, Aug 30, 2009 at 7:34 AM, Shashank Singh shashank.sunny.si...@gmail.com wrote: just to give you an idea of the speed up: a 3.3 mb zip file extracted using the current all-python implementation on my machine (win xp 1.67Ghz 1.5GB) takes approximately 38 seconds. the same file when extracted using c implementation takes 0.4 seconds. Are there any applications/frameworks which have zip files on their critical path, where this kind of (admittedly impressive) speedup would be beneficial? What was the motivation for writing the C version? Collin Winter On Sun, Aug 30, 2009 at 6:35 PM, exar...@twistedmatrix.com wrote: On 12:59 pm, st...@pearwood.info wrote: On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote: Does it sound worthy enough to create a patch for and integrate into python itself? Probably not, given that people think that the algorithm itself is fairly useless. I would think that for most people, the threat model isn't the CIA is reading my files but my little brother or nosey co-worker is reading my files, and for that, zip encryption with a good password is probably perfectly adequate. E.g. OpenOffice uses it for password-protected documents. Given that Python already supports ZIP decryption (as it should), are there any reasons to prefer the current pure-Python implementation over a faster version? Given that the use case is protect my biology homework from my little brother, how fast does the implementation really need to be? Is speeding it up from 0.1 seconds to 0.001 seconds worth the potential new problems that come with more C code (more code to maintain, less portability to other runtimes, potential for interpreter crashes or even arbitrary code execution vulnerabilities from specially crafted files)?
Jean-Paul
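For anyone wanting to put numbers on "does the speed matter here", the stdlib extraction path is easy to time. This sketch builds a small in-memory archive (unencrypted — the stdlib cannot create encrypted zips — so it measures only the general extraction path, not the decryption loop under discussion):

```python
import io
import timeit
import zipfile

payload = b"x" * (1 << 20)        # 1 MiB of highly compressible data

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.bin", payload)
archive = buf.getvalue()

def extract():
    # Re-open the archive each time, mirroring a cold extraction.
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("data.bin")

assert extract() == payload
print("10 extractions:", timeit.timeit(extract, number=10), "seconds")
```

Pointing the same harness at an encrypted archive (opened with `ZipFile.read(..., pwd=...)`) would isolate the decryption cost the patch targets.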
Re: [Python-Dev] Anyone against having a loop option for regrtest?
On Mon, Jun 29, 2009 at 6:59 PM, Jesse Noller jnol...@gmail.com wrote: Something that's been helping me squirrel out wacky and fun bugs in multiprocessing is running the tests in a loop - sometimes hundreds of times. Right now, I hack this up with a bash script, but I'm sitting here wondering if adding a loop for x iterations option to regrtest.py would be useful to others as well. Any thoughts? Does anyone hate this idea with the power of a thousand suns? +1 for having this in regrtest. I've wished for this in the past, and ended up going the bash route, same as you. Collin
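Until such a flag exists, the bash-script approach can also be written portably in Python; this is just an illustrative stand-in (the test-module name would be whatever you are chasing), not proposed regrtest code:

```python
import subprocess
import sys

def run_repeatedly(cmd, iterations):
    """Rerun cmd up to `iterations` times; return the 1-based iteration
    that first failed, or 0 if every run passed."""
    for i in range(1, iterations + 1):
        if subprocess.call(cmd) != 0:
            return i
    return 0

# e.g. run_repeatedly([sys.executable, "-m", "test.regrtest",
#                      "test_multiprocessing"], 100)
assert run_repeatedly([sys.executable, "-c", "pass"], 3) == 0
assert run_repeatedly([sys.executable, "-c", "raise SystemExit(1)"], 3) == 1
```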
Re: [Python-Dev] draft pep: backwards compatibility
On Thu, Jun 18, 2009 at 7:17 PM, Benjamin Peterson benja...@python.org wrote: [snip] Backwards Compatibility Rules = This policy applies to all public APIs. These include the C-API, the standard library, and the core language including syntax and operation as defined by the reference manual. This is the basic policy for backwards compatibility: * The behavior of an API *must* not change between any two consecutive releases. Is this intended to include performance changes? Clearly no-one will complain if things simply get faster, but I'm thinking about cases where, say, a function runs in half the time but uses double the memory (or vice versa). Collin
Re: [Python-Dev] A wordcode-based Python
On Tue, May 12, 2009 at 4:45 AM, Cesare Di Mauro cesare.dima...@a-tono.com wrote: Another note. Fredrik Johansson let me note just few minutes ago that I've compiled my sources without PGO optimizations enabled. That's because I used Visual Studio Express Edition. So another gain in performances can be obtained. :) FWIW, Unladen Swallow experimented with gcc 4.4's FDO and got an additional 10-30% (depending on the benchmark). The training load is important, though: some training sets offered better performance than others. I'd be interested in how MSVC's PGO compares to gcc's FDO in terms of overall effectiveness. The results for gcc FDO with our 2009Q1 release are at the bottom of http://code.google.com/p/unladen-swallow/wiki/Releases. Collin Winter
Re: [Python-Dev] Shorter release schedule?
On Tue, May 12, 2009 at 3:06 PM, Antoine Pitrou solip...@pitrou.net wrote: Hello, Just food for thought here, but seeing how 3.1 is going to be a real featureful schedule despite being released shortly after 3.0, wouldn't it make sense to tighten future release planning a little? I was thinking something like doing a major release every 12 months (rather than 18 to 24 months as has been heuristically the case lately). This could also imply switching to some kind of loosely time-based release system. I'd be in favor of a shorter, 12-month release cycle. I think the limiting resource would be the time and energy of the release managers and the package builders for Windows, etc. Provided it's not a tax on the release staff, I think shorter release cycles would be a benefit to the community. My own experience with time-based releases at work is that it greatly helps focus energy and attention, knowing that you can't simply delay the release if you slack off on your features/bugs. Collin
Re: [Python-Dev] A wordcode-based Python
Hi Cesare, On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro cesare.dima...@a-tono.com wrote: At the last PyCon3 at Italy I've presented a new Python implementation, which you'll find at http://code.google.com/p/wpython/ Good to see some more attention on Python performance! There's quite a bit going on in your changes; do you have an optimization-by-optimization breakdown, to give an idea about how much performance each optimization gives? Looking over the slides, I see that you still need to implement functionality to make test_trace pass, for example; do you have a notion of how much performance it will cost to implement the rest of Python's semantics in these areas? Also, I checked out wpython at head to run Unladen Swallow's benchmarks against it, but it refuses to compile with either gcc 4.0.1 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build failures off-list, if you're interested. Thanks, Collin Winter
Re: [Python-Dev] Rethinking intern() and its data structure
Hi John, On Thu, Apr 9, 2009 at 8:02 AM, John Arbash Meinel j...@arbash-meinel.com wrote: I've been doing some memory profiling of my application, and I've found some interesting results with how intern() works. I was pretty surprised to see that the interned dict was actually consuming a significant amount of total memory. To give the specific values, after doing: bzr branch A B of a small project, the total memory consumption is ~21MB [snip] Anyway, I think the internals of intern() could be done a bit better. Here are some concrete things: [snip] Memory usage is definitely something we're interested in improving. Since you've already looked at this in some detail, could you try implementing one or two of your ideas and see if it makes a difference in memory consumption? Changing from a dict to a set looks promising, and should be a fairly self-contained way of starting on this. If it works, please post the patch on http://bugs.python.org with your results and assign it to me for review. Thanks, Collin Winter
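For background, what the interned table has to support is exactly set-style membership plus retrieval of one canonical object — the dict it uses only ever maps a string to itself, which is why a set is a natural fit (shown here with the Python 3 spelling, sys.intern; in the 2.x code under discussion it was the builtin intern()):

```python
import sys

a = "".join(["int", "ern"])      # built at runtime: not automatically interned
b = "".join(["int", "ern"])
assert a == b and a is not b     # equal values, two distinct objects

ca = sys.intern(a)
cb = sys.intern(b)
assert ca is cb                  # intern() hands back one canonical object
```

The memory question in the thread is then just the per-entry overhead of the container holding those canonical strings, on top of the strings themselves.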
Re: [Python-Dev] Rethinking intern() and its data structure
On Thu, Apr 9, 2009 at 9:34 AM, John Arbash Meinel john.arbash.mei...@gmail.com wrote: ... Anyway, I think the internals of intern() could be done a bit better. Here are some concrete things: [snip] Memory usage is definitely something we're interested in improving. Since you've already looked at this in some detail, could you try implementing one or two of your ideas and see if it makes a difference in memory consumption? Changing from a dict to a set looks promising, and should be a fairly self-contained way of starting on this. If it works, please post the patch on http://bugs.python.org with your results and assign it to me for review. Thanks, Collin Winter (I did end up subscribing, just with a different email address :) What is the best branch to start working from? trunk? That's a good place to start, yes. If the idea works well, we'll want to port it to the py3k branch, too, but that can wait. Collin
Re: [Python-Dev] Rethinking intern() and its data structure
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel john.arbash.mei...@gmail.com wrote: Greg Ewing wrote: John Arbash Meinel wrote: And the way intern is currently written, there is a third cost when the item doesn't exist yet, which is another lookup to insert the object. That's even rarer still, since it only happens the first time you load a piece of code that uses a given variable name anywhere in any module. Somewhat true, though I know it happens 25k times during startup of bzr... And I would be a *lot* happier if startup time was 100ms instead of 400ms. Quite so. We have a number of internal tools, and they find that frequently just starting up Python takes several times the duration of the actual work unit itself. I'd be very interested to review any patches you come up with to improve start-up time; so far on this thread, there's been a lot of theory and not much practice. I'd approach this iteratively: first replace the dict with a set, then if that bears fruit, consider a customized data structure; if that bears fruit, etc. Good luck, and be sure to let us know what you find, Collin Winter
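The start-up costs being traded against here are easy to measure crudely from Python itself; a sketch (the child runs with -S to skip site initialization, approximating a minimal interpreter start; best-of-N damps scheduler noise):

```python
import subprocess
import sys
import time

def startup_seconds(runs=5):
    """Best-of-N wall-clock time to start a child interpreter and exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.time()
        subprocess.check_call([sys.executable, "-S", "-c", "pass"])
        best = min(best, time.time() - start)
    return best

secs = startup_seconds()
assert secs > 0.0
print("best-of-5 interpreter start-up: %.1f ms" % (secs * 1000))
```

Measuring before and after an intern() change like the one discussed would show directly whether the 25k interning operations are a material part of the 400ms.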
Re: [Python-Dev] core python tests
On Sat, Apr 4, 2009 at 7:33 AM, Michael Foord fuzzy...@voidspace.org.uk wrote: Antoine Pitrou wrote: Nick Coghlan ncoghlan at gmail.com writes: C. Titus Brown wrote: I vote for a separate mailing list -- 'python-tests'? -- but I don't know exactly how splintered to make the conversation. It probably belongs at python.org but if you want me to host it, I can. If too many things get moved off to SIGs there won't be anything left for python-dev to talk about ;) There is already an stdlib-sig, which has been almost unused. stdlib-sig isn't *quite* right (the testing and benchmarking are as much about core python as the stdlib) - although we could view the benchmarks and tests themselves as part of the standard library... Either way we should get it underway. Collin and Jeffrey - happy to use stdlib-sig? Works for me. Collin
Re: [Python-Dev] PyDict_SetItem hook
On Fri, Apr 3, 2009 at 2:27 AM, Antoine Pitrou solip...@pitrou.net wrote: Thomas Wouters thomas at python.org writes: Pystone is pretty much a useless benchmark. If it measures anything, it's the speed of the bytecode dispatcher (and it doesn't measure it particularly well.) PyBench isn't any better, in my experience. I don't think pybench is useless. It gives a lot of performance data about crucial internal operations of the interpreter. It is of course very little real-world, but conversely makes you know immediately where a performance regression has happened. (by contrast, if you witness a regression in a high-level benchmark, you still have a lot of investigation to do to find out where exactly something bad happened) Perhaps someone should start maintaining a suite of benchmarks, high-level and low-level; we currently have them all scattered around (pybench, pystone, stringbench, richards, iobench, and the various Unladen Swallow benchmarks; not to mention other third-party stuff that can be found in e.g. the Computer Language Shootout). Already in the works :) As part of the common standard library and test suite that we agreed on at the PyCon language summit last week, we're going to include a common benchmark suite that all Python implementations can share. This is still some months off, though, so there'll be plenty of time to bikeshed^Wrationally discuss which benchmarks should go in there. Collin
Re: [Python-Dev] PyDict_SetItem hook
On Fri, Apr 3, 2009 at 9:43 AM, Antoine Pitrou solip...@pitrou.net wrote: Thomas Wouters thomas at python.org writes: Really? Have you tried it? I get at least 5% noise between runs without any changes. I have gotten results that include *negative* run times. That's an implementation problem, not an issue with the tests themselves. Perhaps a better timing mechanism could be inspired from the timeit module. Perhaps the default numbers of iterations should be higher (many subtests run in less than 100ms on a modern CPU, which might be too low for accurate measurement). Perhaps the so-called calibration should just be disabled. etc. The tests in PyBench are not micro-benchmarks (they do way too much for that), Then I wonder what you call a micro-benchmark. Should it involve direct calls to low-level C API functions? I agree that a suite of microbenchmarks is supremely useful: I would very much like to be able to isolate, say, raise statement performance. PyBench suffers from implementation defects that in its current incarnation make it unsuitable for this, though:

- It does not effectively isolate component performance as it claims. When I was working on a change to BINARY_MODULO to make string formatting faster, PyBench would report that floating point math got slower, or that generator yields got slower. There is a lot of random noise in the results.
- We have observed overall performance swings of 10-15% between runs on the same machine, using the same Python binary. Using the same binary on the same unloaded machine should give as close an answer to 0% as possible.
- I wish PyBench actually did more isolation. Call.py:ComplexPythonFunctionCalls is on my mind right now; I wish it didn't put keyword arguments and **kwargs in the same microbenchmark.
- In experimenting with gcc 4.4's FDO support, I produced a training load that resulted in a 15-30% performance improvement (depending on benchmark) across all benchmarks. Using this trained binary, PyBench slowed down by 10%.
- I would like to see PyBench incorporate better statistics for indicating the significance of the observed performance difference.

I don't believe that these are insurmountable problems, though. A great contribution to Python performance work would be an improved version of PyBench that corrects these problems and offers more precise measurements. Is that something you might be interested in contributing to? As performance moves more into the wider consciousness, having good tools will become increasingly important. Thanks, Collin
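The significance reporting being asked for is small to implement: perf.py's "Significant (t=..., a=0.95)" lines come from a two-sample t-test, which can be sketched as follows (illustrative code, not perf.py's actual implementation):

```python
import math
import random
import statistics

def welch_t(xs, ys):
    """Welch's t statistic for two timing samples; |t| well above ~2
    means the difference is unlikely to be run-to-run noise."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    vx, vy = statistics.variance(xs), statistics.variance(ys)
    return (mx - my) / math.sqrt(vx / len(xs) + vy / len(ys))

# Simulated runs: a baseline near 100ms and an experiment near 90ms,
# both with 2ms of Gaussian noise.
random.seed(42)
base = [0.100 + random.gauss(0, 0.002) for _ in range(30)]
fast = [0.090 + random.gauss(0, 0.002) for _ in range(30)]

t = welch_t(base, fast)
assert t > 2.0   # the 10ms improvement swamps the noise
```

Reporting t alongside min/avg, as perf.py does, is exactly the kind of statistic PyBench could adopt to distinguish real regressions from the 10-15% swings described above.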
Re: [Python-Dev] PyDict_SetItem hook
On Fri, Apr 3, 2009 at 10:28 AM, Michael Foord fuzzy...@voidspace.org.uk wrote: Collin Winter wrote: As part of the common standard library and test suite that we agreed on at the PyCon language summit last week, we're going to include a common benchmark suite that all Python implementations can share. This is still some months off, though, so there'll be plenty of time to bikeshed^Wrationally discuss which benchmarks should go in there. Where is the right place for us to discuss this common benchmark and test suite? As the benchmark is developed I would like to ensure it can run on IronPython. The test suite changes will need some discussion as well - Jython and IronPython (and probably PyPy) have almost identical changes to tests that currently rely on deterministic finalisation (reference counting) so it makes sense to test changes on both platforms and commit a single solution. I believe Brett Cannon is the best person to talk to about this kind of thing. I don't know that any common mailing list has been set up, though there may be and Brett just hasn't told anyone yet :) Collin
Re: [Python-Dev] PyDict_SetItem hook
On Fri, Apr 3, 2009 at 10:50 AM, Antoine Pitrou solip...@pitrou.net wrote: Collin Winter collinw at gmail.com writes: - I wish PyBench actually did more isolation. Call.py:ComplexPythonFunctionCalls is on my mind right now; I wish it didn't put keyword arguments and **kwargs in the same microbenchmark. Well, there is a balance to be found between having more subtests and keeping a reasonable total running time :-) (I have to plead guilty for ComplexPythonFunctionCalls, btw) Sure, there's definitely a balance to maintain. With perf.py, we're going down the road of having different tiers of benchmarks: the default set is the one we pay the most attention to, with other benchmarks available for benchmarking certain specific subsystems or workloads (like pickling list-heavy input data). Something similar could be done for PyBench, giving the user the option of increasing the level of detail (and run-time) as appropriate. - I would like to see PyBench incorporate better statistics for indicating the significance of the observed performance difference. I see you already have this kind of measurement in your perf.py script, would it be easy to port it? Yes, it should be straightforward to incorporate these statistics into PyBench. In the same directory as perf.py, you'll find test_perf.py which includes tests for the stats functions we're using. Collin
Re: [Python-Dev] PyDict_SetItem hook
On Wed, Apr 1, 2009 at 4:29 PM, John Ehresman j...@wingware.com wrote: I've written a proof of concept patch to add a hook to PyDict_SetItem at http://bugs.python.org/issue5654 My motivation is to enable watchpoints in a python debugger that are called when an attribute or global changes. I know that this won't cover function locals and objects with slots (as Martin pointed out). We talked about this at the sprints and a few issues came up: * Is this worth it for debugger watchpoint support? This is a feature that probably wouldn't be used regularly but is extremely useful in some situations. * Would it be better to create a namespace dict subclass of dict, use it for modules, classes, instances, and only allow watches of the subclass instances? * To what extent should non-debugger code use the hook? At one end of the spectrum, the hook could be made readily available for non-debug use and at the other end, it could be documented as being debug only, disabled in python -O, not exposed in the stdlib to python code. Have you measured the impact on performance? Collin
Re: [Python-Dev] 3to2 Project
On Mon, Mar 30, 2009 at 7:44 AM, Jesse Noller jnol...@gmail.com wrote: During the Language summit this past Thursday, pretty much everyone agreed that a python 3 to python 2 tool would be a very large improvement in helping developers be able to write pure python 3 code. The idea being that a large project such as Django could completely cut over to Python3, but then run the 3to2 tool on the code base to continue to support version 2.x. I raised my hand to help move this along, I've spoken to Benjamin Peterson, and he's amenable to mentoring a GSoC student for this project and he's already received at least one proposal for this. Additionally, there's been a number of developers here at PyCon who are more than ready to help contribute. So, if people are interested in helping, coordinating work/etc - feel free to sync up with Benjamin - he's started a wiki page here: http://wiki.python.org/moin/3to2 If anyone is interested in working on this during the PyCon sprints or otherwise, here are some easy, concrete starter projects that would really help move this along: - The core refactoring engine needs to be broken out from 2to3. In particular, the tests/ and fixes/ need to get pulled up a directory, out of lib2to3/. - Once that's done, lib2to3 should then be renamed to something like librefactor or something else that indicates its more general nature. This will allow both 2to3 and 3to2 to more easily share the core components. - If you're more performance-minded, 2to3 and 3to2 would benefit heavily from some work on the pattern matching system. The current pattern matcher is a fairly simple AST interpreter; compiling the patterns down to pure Python code would be a win, I believe. This is all pretty heavily tested, so you wouldn't run much risk of breaking it. Collin
Re: [Python-Dev] GPython?
On Fri, Mar 27, 2009 at 5:50 AM, Paul Moore p.f.mo...@gmail.com wrote: 2009/3/27 Collin Winter coll...@gmail.com: In particular, Windows support is one of those things we'll need to address on our end. LLVM's Windows support may be spotty, or there may be other Windows issues we inadvertently introduce. None of the three of us have Windows machines, nor do we particularly want to acquire them :), and Windows support isn't going to be a big priority. If we find that some of our patches have Windows issues, we will certainly fix those before proposing their inclusion in CPython. On the assumption (sorry, I've done little more than read the press releases so far) that you're starting from the CPython base and incrementally patching things, you currently have strong Windows support. It would be a shame if that got gradually chipped away through neglect, until it became a big job to reinstate it. That's correct, we're starting with CPython 2.6.1. If the Unladen Swallow team doesn't include any Windows developers, you're a bit stuck, I guess, but could you not at least have a Windows buildbot which keeps tabs on the current status? Then you might encourage interested Windows bystanders to check in occasionally and maybe offer fixes. We're definitely going to set up buildslaves for Windows and other platforms (currently we're only running Linux buildslaves). We're trying to solicit 20% time help from Google Windows developers, but that experience is relatively rare compared to the vast sea of Linux-focused engineers (though that's true of the open-source community in general). Also, it may be that some of the components we're reusing don't support Windows, or perhaps worse, offer degraded performance on Windows. 
We believe we can fix these problems as they come up -- we certainly don't want Windows issues to prevent patches from going into mainline -- but it's still a risk that Windows issues may slow down our development or prevent us from doing something fancy down the road, and I wanted to be up front about that risk. I've updated our ProjectPlan in hopes of clarifying this. That section of the docs was copy/pasted off a slide, and was a bit too terse :) Collin
Re: [Python-Dev] Partial 2to3?
2009/3/27 s...@pobox.com: Following up on yesterday's conversation about 2to3 and 3to2, I wonder if it's easily possible to run 2to3 with a specific small subset of its fixers? For example, people not wanting to make the 2-3 leap yet might still be interested in the exception handling changes (except Foo as exc)? Sure, that's easily possible: run 2to3 -f some_fixer,other_fixer,this_fixer,that_fixer. You can get a full list of fixers using the --list-fixes option. Collin Winter
Re: [Python-Dev] GPython?
On Thu, Mar 26, 2009 at 8:05 PM, Terry Reedy tjre...@udel.edu wrote: An ars technica article just linked to in a python-list post http://arstechnica.com/open-source/news/2009/03/google-launches-project-to-boost-python-performance-by-5x.ars calls the following project Google launched http://code.google.com/p/unladen-swallow/wiki/ProjectPlan (Though the project page does not really claim that.) Hi, I'm the tech lead for Unladen Swallow. Jeffrey Yasskin and Thomas Wouters are also working on this project. Unladen Swallow is Google-sponsored, but not Google-owned. This is an open-source branch that we're working on, focused on performance, and we want to move all of our work upstream as quickly as possible. In fact, right now I'm adding a last few tests before putting our cPickle patches up on the tracker for further review. I am sure some people here might find this interesting. I'd love to have a faster CPython, but this note: Will probably kill Python Windows support (for now). would kill merger back into mainline (for now) without one opposing being 'conservative'. To clarify, when I wrote 'conservative', I wasn't being disparaging. A resistance to change can certainly be a good thing, and something that I think is very healthy in these situations. We certainly have to prove ourselves, especially given some of the fairly radical things we're thinking of [1]. We believe we can justify these changes, but I *do* want to be forced to justify them publicly; I don't think python-dev would be doing its job if some of these things were merely accepted without discussion. In particular, Windows support is one of those things we'll need to address on our end. LLVM's Windows support may be spotty, or there may be other Windows issues we inadvertently introduce. None of the three of us have Windows machines, nor do we particularly want to acquire them :), and Windows support isn't going to be a big priority.
If we find that some of our patches have Windows issues, we will certainly fix those before proposing their inclusion in CPython. If one adds type annotations so that values can be unboxed, would not Cython, etc, do even better for speedup? Possibly, but we want to see how far we can push the current language before we even start thinking of tinkering with the language spec. Assigning meaning to function annotations is something that PEP 3107 explicitly avoids, and I'm not sure Unladen Swallow (or anyone else) would want to take the plunge into coming up with broadly-acceptable type systems for Python. That would be a bikeshed discussion of such magnitude, you'd have to invent new colors to paint the thing. Collin Winter [1] - http://code.google.com/p/unladen-swallow/wiki/ProjectPlan
Re: [Python-Dev] GPython?
On Thu, Mar 26, 2009 at 11:26 PM, Alexandre Vassalotti alexan...@peadrop.com wrote: On Thu, Mar 26, 2009 at 11:40 PM, Collin Winter coll...@gmail.com wrote: In fact, right now I'm adding a last few tests before putting our cPickle patches up on the tracker for further review. Put me in the nosy list when you do; and when I get some free time, I will give your patches a complete review. I've already taken a quick look at cPickle changes you did in Unladen and I think some (i.e., the custom memo table) are definitely worthy to be merged in the mainlines. Will do, thanks for volunteering! jyasskin has already reviewed them internally, but it'll be good to put them through another set of eyes. Collin
Re: [Python-Dev] speeding up PyObject_GetItem
2009/3/24 Daniel Stutzbach dan...@stutzbachenterprises.com: On Tue, Mar 24, 2009 at 10:13 AM, Mark Dickinson dicki...@gmail.com wrote: 2009/3/24 Daniel Stutzbach dan...@stutzbachenterprises.com: [...] 100 nanoseconds, py3k trunk: ceval - PyObject_GetItem (object.c) - list_subscript (listobject.c) - PyNumber_AsSsize_t (object.c) - PyLong_AsSsize_t (longobject.c) [more timings snipped] Does removing the PyLong_Check call in PyLong_AsSsize_t make any noticeable difference to these timings? Making no other changes from the trunk, removing the PyLong_Check and NULL check from PyLong_AsSsize_t shaves off 4 nanoseconds (or around 4% since the trunk is around 100 nanoseconds). Here's what I'm testing with, by the way: ./python.exe Lib/timeit.py -r 10 -s 'x = list(range(10))' 'x[5]' What difference does it make on real applications? Are you running any macro-benchmarks against this? Collin
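For reference, the command-line invocation quoted above maps directly onto the timeit module's Python API, which makes it easy to script a sweep of such micro-benchmarks (the numbers printed here will of course vary by machine):

```python
import timeit

# Same micro-benchmark as the command line above:
#   ./python.exe Lib/timeit.py -r 10 -s 'x = list(range(10))' 'x[5]'
timer = timeit.Timer("x[5]", setup="x = list(range(10))")
runs = timer.repeat(repeat=10, number=100000)

# Best per-loop time across repeats, in nanoseconds.
best_ns = min(runs) / 100000 * 1e9
print("%.1f ns per loop" % best_ns)
```

Taking the minimum over the repeats is the conventional way to suppress interference from other processes on the machine.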
Re: [Python-Dev] 3.1 performance
On Sun, Mar 8, 2009 at 7:30 AM, Christian Heimes li...@cheimes.de wrote: Antoine Pitrou wrote: Hi, Victor Stinner victor.stinner at haypocalc.com writes: Summary (minimum total) on 32 bits CPU: * Python 2.6.1: 8762 ms * Python 3.0.1: 8977 ms * Python 3.1a1: 9228 ms (slower than 3.0) Have you compiled with or without --with-computed-gotos? Why is the feature still disabled by default? Christian PS: Holy moly! Computed gotos totally put my Python on fire! The feature improves the minimum run-time by approx. 25% and the average run-time by approx. 40% on my Ubuntu 8.10 box (AMD64, Intel(R) Core(TM)2 CPU T7600 @ 2.33GHz). Note that of the benchmarks tested, PyBench benefits the most from threaded eval loop designs. Other systems benefit less; for example, Django template benchmarks were only sped up by 7-8% when I was testing it. Collin Winter
Re: [Python-Dev] Pickler/Unpickler API clarification
On Fri, Mar 6, 2009 at 10:01 AM, Michael Haggerty mhag...@alum.mit.edu wrote: Antoine Pitrou wrote: Le vendredi 06 mars 2009 à 13:44 +0100, Michael Haggerty a écrit : Antoine Pitrou wrote: Michael Haggerty mhagger at alum.mit.edu writes: It is easy to optimize the pickling of instances by giving them __getstate__() and __setstate__() methods. But the pickler still records the type of each object (essentially, the name of its class) in each record. The space for these strings constituted a large fraction of the database size. If these strings are not interned, then perhaps they should be. There is a similar optimization proposal (w/ patch) for attribute names: http://bugs.python.org/issue5084 If I understand correctly, this would not help: - on writing, the strings are identical anyway, because they are read out of the class's __name__ and __module__ fields. Therefore the Pickler's usual memoizing behavior will prevent the strings from being written more than once. Then why did you say that the space for these strings constituted a large fraction of the database size, if they are already shared? Are your objects so tiny that even the space taken by the pointer to the type name grows the size of the database significantly? Sorry for the confusion. I thought you were suggesting the change to help the more typical use case, when a single Pickler is used for a lot of data. That use case will not be helped by interning the class __name__ and __module__ strings, for the reasons given in my previous email. In my case, the strings are shared via the Pickler memoizing mechanism because I pre-populate the memo (using the API that the OP proposes to remove), so your suggestion won't help my current code, either. It was before I implemented the pre-populated memoizer that the space for these strings constituted a large fraction of the database size. But your suggestion wouldn't help that case, either. Here are the main use cases: 1. Saving and loading one large record. 
A class's __name__ string is the same string object every time it is retrieved, so it only needs to be stored once and the Pickler memo mechanism works. Similarly for the class's __module__ string. 2. Saving and loading lots of records sequentially. Provided a single Pickler is used for all records and its memo is never cleared, this works just as well as case 1. 3. Saving and loading lots of records in random order, as for example in the shelve module. It is not possible to reuse a Pickler with retained memo, because the Unpickler might not encounter objects in the right order. There are two subcases: a. Use a clean Pickler/Unpickler object for each record. In this case the __name__ and __module__ of a class will appear once in each record in which the class appears. (This is the case regardless of whether they are interned.) On reading, the __name__ and __module__ are only used to look up the class, so interning them won't help. It is thus impossible to avoid wasting a lot of space in the database. b. Use a Pickler/Unpickler with a preset memo for each record (my unorthodox technique). In this case the class __name__ and __module__ will be memoized in the shared memo, so in other records only their ID needs to be stored (in fact, only the ID of the class object itself). This allows the database to be smaller, but does not have any effect on the RAM usage of the loaded objects. If the OP's proposal is accepted, 3b will become impossible. The technique seems not to be well known, so maybe it doesn't need to be supported. It would mean some extra work for me on the cvs2svn project though :-( Having talked it over with Guido, support for the memo attribute will have to stay. I shall add it back to my patches. Collin
[Python-Dev] Pickler/Unpickler API clarification
I'm working on some performance patches for cPickle, and one of the bigger wins so far has been replacing the Pickler's memo dict with a custom hashtable (and hence removing memo's getters and setters). In looking over this, Jeffrey Yasskin commented that this would break anyone who was accessing the memo attribute. I've found a few examples of code using the memo attribute ([1], [2], [3]), and there are probably more out there, but the memo attribute doesn't look like part of the API to me. It's only documented in http://docs.python.org/library/pickle.html as you used to need this before Python 2.3, but don't anymore. However: I don't believe you should ever need this attribute. The usages of memo I've seen break down into two camps: clearing the memo, and wanting to explicitly populate the memo with predefined values. Clearing the memo is recommended as part of reusing Pickler objects, but I can't fathom when you would want to reuse a Pickler *without* clearing the memo. Reusing the Pickler without clearing the memo will produce pickles that are, as best I can see, invalid -- at least, pickletools.dis() rejects this, which is the closest thing we have to a validator. Explicitly setting memo values has the same problem: an easy, very brittle way to produce invalid data. So my questions are these: 1) Should Pickler/Unpickler objects automatically clear their memos when dumping/loading? 2) Is memo an intentionally exposed, supported part of the Pickler/Unpickler API, despite the lack of documentation and tests? 
Thanks, Collin [1] - http://google.com/codesearch/p?hl=en#Qx8E-7HUBTk/trunk/google/appengine/api/memcache/__init__.pyq=lang:py%20%5C.memo [2] - http://google.com/codesearch/p?hl=en#M-DDI-lCOgE/lib/python2.4/site-packages/cvs2svn_lib/primed_pickle.pyq=lang:py%20%5C.memo [3] - http://google.com/codesearch/p?hl=en#l_w_cA4dKMY/AtlasAnalysis/2.0.3-LST-1/PhysicsAnalysis/PyAnalysis/PyAnalysisUtils/python/root_pickle.pyq=lang:py%20pick.*%5C.memo%5Cb
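To illustrate question 1 concretely: the supported way to reuse a Pickler today is to reset the memo explicitly with clear_memo(); skipping that step lets the second pickle emit GET opcodes referring to memo entries its reader never saw. A small demonstration (the protocol and sample object are arbitrary choices for the example):

```python
import io
import pickle

buf = io.BytesIO()
pickler = pickle.Pickler(buf, protocol=2)
record = {"a": 1}

pickler.dump(record)
first = buf.getvalue()

# Reuse the same Pickler for an independent record: reset the output
# buffer *and* the memo. Without clear_memo(), dumping `record` again
# would emit a memo reference (GET) into a stream whose reader has no
# corresponding PUT entry, producing an unreadable pickle.
buf.seek(0)
buf.truncate()
pickler.clear_memo()
pickler.dump(record)
second = buf.getvalue()

assert first == second  # with a cleared memo, the output is reproducible
```

This is the behavior that makes automatic memo clearing on dump()/load(), as proposed above, an appealing default.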
Re: [Python-Dev] A suggestion: Do proto-PEPs in Google Docs
On Thu, Feb 19, 2009 at 7:17 PM, Stephen J. Turnbull turnb...@sk.tsukuba.ac.jp wrote: On Python-Ideas, Guido van Rossum writes: On Thu, Feb 19, 2009 at 2:12 AM, Greg Ewing wrote: Fifth draft of the PEP. Re-worded a few things slightly to hopefully make the proposal a bit clearer up front. Wow, how I long for the days when we routinely put things like this under revision control so it's easy to compare versions. FWIW, Google Docs is almost there. Working with Brett et al on early drafts of PEP 0374 was easy and pleasant, and Google Docs gives control of access to the document to the editor, not the Subversion admin. The ability to make comments that are not visible to non-editors was nice. Now that it's in Subversion it's much less convenient for me (a non-committer). I actually have to *decide* to work on it, rather than simply raising a browser window, hitting refresh and fixing a typo or two (then back to day job work). The main problem with Google Docs is that it records a revision automatically every so often (good) but doesn't prune the automatic commits (possibly hard to do efficiently) OR mark user saves specially (easy to do). This lack of marking important revisions makes the diff functionality kind of tedious. I don't know how automatic the conversion to reST was, but the PEP in Subversion is a quite accurate conversion of the Google Doc version. Overall, I recommend use of Google Docs for Python-Ideas level of PEP drafts. Rietveld would also be a good option: it offers more at-will revision control (rather than whenever Google Docs decides), allows you to attach comments to the revisions, and will give you nice diffs between PEP iterations. Collin
Re: [Python-Dev] IO implementation: in C and Python?
On Thu, Feb 19, 2009 at 9:07 PM, Guido van Rossum gu...@python.org wrote: On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon br...@python.org wrote: On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson benja...@python.org wrote: As we prepare to merge the io-c branch, the question has come up [1] about the original Python implementation. Should it just be deleted in favor of the C version? The wish to maintain the two implementations together has been raised on the basis that Python is easier to experiment on and read (for other vm implementors). Probably not a surprise, but +1 from me for keeping the pure Python version around for the benefit of other VMs as well as a reference implementation. You have been practice channeling me again, haven't you? I like the idea of having two (closely matching) implementations very much. Agreed. In particular, this helps any projects that are focused on improving the performance of pure-Python code: they can work on minimizing the delta between the Python and C versions. Collin
Re: [Python-Dev] Partial function application 'from the right'
On Tue, Feb 3, 2009 at 5:44 AM, Ben North b...@redfrontdoor.org wrote: Hi, Thanks for the further responses. Again, I'll try to summarise: Scott David Daniels pointed out an awkward interaction when chaining partial applications, such that it could become very unclear what was going to happen when the final function is called: If you have: def button(root, position, action=None, text='*', color=None): ... ... blue_button = partial(button, my_root, color=(0,0,1)) Should partial_right(blue_button, 'red') change the color or the text? Calvin Spealman mentioned a previous patch of his which took the 'hole' approach, i.e.: [...] my partial.skip patch, which allows the following usage: split_one = partial(str.split, partial.skip, 1) This would solve my original problems, and, continuing Scott's example, def on_clicked(...): ... _ = partial.skip clickable_blue_button = partial(blue_button, _, on_clicked) has a clear enough meaning I think: clickable_blue_button('top-left corner') = blue_button('top-left corner', on_clicked) = button(my_root, 'top-left corner', on_clicked, color=(0,0,1)) Calvin's idea/patch sounds good to me, then. Others also liked it. Could it be re-considered, instead of the partial_right idea? Have any of the original objections to Calvin's patch (http://bugs.python.org/issue1706256) been addressed? If not, I don't see anything in these threads that justifies resurrecting it. I still haven't seen any real code presented that would benefit from partial.skip or partial_right. Collin Winter
Re: [Python-Dev] Partial function application 'from the right'
On Tue, Feb 3, 2009 at 11:53 AM, Antoine Pitrou solip...@pitrou.net wrote: Collin Winter collinw at gmail.com writes: Have any of the original objections to Calvin's patch (http://bugs.python.org/issue1706256) been addressed? If not, I don't see anything in these threads that justifies resurrecting it. I still haven't seen any real code presented that would benefit from partial.skip or partial_right. The arguments for and against the patch could be brought against partial() itself, so I don't understand the -1's at all. Quite so, but that doesn't justify adding more capabilities to partial(). Collin Winter
Re: [Python-Dev] Partial function application 'from the right'
On Thu, Jan 29, 2009 at 6:12 AM, Ben North b...@redfrontdoor.org wrote: Hi, I find 'functools.partial' useful, but occasionally I'm unable to use it because it lacks a 'from the right' version. E.g., to create a function which splits a string on commas, you can't say # Won't work when called: split_comma = partial(str.split, sep = ',') [snip] I've created a patch which adds a 'partial_right' function. The two examples above: import functools, math split_comma = functools.partial_right(str.split, ',') split_comma('a,b,c') ['a', 'b', 'c'] log_10 = functools.partial_right(math.log, 10.0) log_10(100.0) 2.0 Can you point to real code that this makes more readable? Collin
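For readers who want to experiment with the idea, a plain-Python stand-in for the proposed function is only a few lines. This is a hypothetical sketch, not the patch under discussion, and partial_right is not part of functools:

```python
import math

def partial_right(func, *fixed, **fixed_kwargs):
    """Like functools.partial, but the fixed positional arguments are
    appended to the *right* of the call-time arguments.

    Hypothetical helper for illustration only.
    """
    def wrapper(*args, **kwargs):
        # Call-time keyword arguments override the fixed ones.
        merged = dict(fixed_kwargs, **kwargs)
        return func(*(args + fixed), **merged)
    return wrapper

# The two examples from the email above:
split_comma = partial_right(str.split, ",")
print(split_comma("a,b,c"))   # ['a', 'b', 'c']
log_10 = partial_right(math.log, 10.0)
print(log_10(100.0))          # 2.0
```

Note the sketch shares the ambiguity Scott raised in the follow-up discussion: the fixed arguments land after however many positional arguments happen to be supplied at call time.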
Re: [Python-Dev] Decoder functions accept str in py3k
On Wed, Jan 7, 2009 at 2:35 PM, Brett Cannon br...@python.org wrote: On Wed, Jan 7, 2009 at 10:57, M.-A. Lemburg m...@egenix.com wrote: [SNIP] BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just like all Unicode support source files). Someone apparently has added tabs when adding support for Py_buffers. It looks like this formatting mix-up is just going to get worse for the next few years while the 2.x series is still being worked on. Should we just bite the bullet and start adding modelines for Vim and Emacs to .c/.h files that are written in the old 2.x style? For Vim I can then update the vimrc in Misc/Vim to then have 4-space indent be the default for C files. Or better yet, really bite the bullet and just reindent everything to spaces. Not everyone uses vim or emacs, nor do all tools understand their modelines. FYI, there are options to svn blame and git to skip whitespace-only changes. Just-spent-an-hour-fixing-screwed-up-indents-in-changes-to-Python/*.c-ly, Collin Winter
Re: [Python-Dev] PEP: Consolidating names in the `unittest` module
On Tue, Jul 15, 2008 at 6:03 PM, Michael Foord [EMAIL PROTECTED] wrote: Collin Winter wrote: On Tue, Jul 15, 2008 at 6:58 AM, Ben Finney [EMAIL PROTECTED] wrote: Backwards Compatibility === The names to be obsoleted should be deprecated and removed according to the schedule for modules in PEP 4 [#PEP-4]_. While deprecated, use of the deprecated attributes should raise a ``DeprecationWarning``, with a message stating which replacement name should be used. Is any provision being made for a 2to3 fixer/otherwise-automated transition for the changes you propose here? As the deprecation is intended for 2.X and 3.X - is a 2to3 fixer needed? IMO some kind of automated transition tool is needed -- anyone who has the time to convert their codebase by hand (for some definition of by hand that involves sed) doesn't have enough to do. Collin
Re: [Python-Dev] PEP: Consolidating names in the `unittest` module
On Wed, Jul 16, 2008 at 5:21 AM, Michael Foord [EMAIL PROTECTED] wrote: Terry Reedy wrote: Michael Foord wrote: Collin Winter wrote: Is any provision being made for a 2to3 fixer/otherwise-automated transition for the changes you propose here? As the deprecation is intended for 2.X and 3.X - is a 2to3 fixer needed? A fixer will only be needed when it actually is needed, but when it is, it should be a unittest-name fixer since previous 3.x code will also need fixing. Since the duplicates are multiple names for the same objects, the fixer should be a trivial name substitution. Can 2to3 fixers be used for 2to2 and 3to3 translation then? The intention is for the infrastructure behind 2to3 to be generally reusable for other Python source-to-source translation tools, be that 2to2 or 3to3. That hasn't fully materialized yet, but it's getting there. Collin
Re: [Python-Dev] Unittest PEP do's and don'ts (BDFL pronouncement)
On Wed, Jul 16, 2008 at 2:03 PM, Raymond Hettinger [EMAIL PROTECTED] wrote: If some people want to proceed down the path of useful additions, I challenge them to think bigger. Give me some test methods that improve my life. Don't give me thirty ways to spell something I can already do. From: Michael Foord [EMAIL PROTECTED] I assert that... the following changes do meet those conditions: assertRaisesWithMessage . . . Changes to assertEquals so that the failure messages are more useful ... assertIn / assertNotIn I use very regularly for collection membership - self.assert_(func(x) in result_set) + self.assertIn(func(x), result_set) Yawn. The gain is zero. Actually, it's negative because the second doesn't read as nicely as the pure python expression. It's only negative if the method doesn't do anything special. For example, an assertListEqual() method can tell you *how* the lists differ, which the pure Python expression can't -- all the Python expression can say is yes or no. We have methods like this at work and they're very useful. That said, I see no reason why these things have to be methods. The self. method boilerplate is cluttering line-noise in this case. I can easily imagine a module of nothing but comparison functions. Collin Winter Think bigger! No fat APIs. Do something cool! Check out the dynamic test creation in test_decimal to see if it can be generalized. Give me some cool test runners. Maybe find a way to automatically launch pdb or to dump the local variables at the time of failure. Maybe move the test_*.py search into the unittest module. We want *small* and powerful. The api for TestCase instances is already way too fat. See an old discussion on the subject at: http://bugs.python.org/issue2578 The run_tests function for running collections of tests. Almost every project I've worked on has had an ad-hoc implementation of this, collecting test modules and turning them into a suitable collection for use with unittest.
Now, that's more like it. Propose more cool stuff like this and the module really will be improved.

assertIs / assertIsNot also sounds good, but is not something I would miss if they weren't added.

Doh! We're back to replacing clean expressions using pure Python syntax with a method-name equivalent. That's a step backwards. Raymond

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
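To make the assertListEqual() point in the thread above concrete, here is a minimal, hypothetical sketch (not code from unittest or from the internal library Collin mentions) of the kind of comparison function he imagines: unlike a bare ``assert a == b``, it can report *how* two lists differ.

```python
def describe_list_diff(first, second):
    """Explain how two lists differ; return None if they are equal."""
    if first == second:
        return None
    if len(first) != len(second):
        return "length mismatch: %d elements vs. %d" % (len(first), len(second))
    # Same length but unequal: find the first differing position.
    for i, (a, b) in enumerate(zip(first, second)):
        if a != b:
            return "first differing element at index %d: %r != %r" % (i, a, b)

# A plain boolean expression can only say the lists are unequal;
# the helper can say where and how:
print(describe_list_diff([1, 2, 3], [1, 5, 3]))
# prints: first differing element at index 1: 2 != 5
```

This is also a module-level function rather than a TestCase method, in the spirit of Collin's "module of nothing but comparison functions" remark.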
Re: [Python-Dev] PEP: Consolidating names in the `unittest` module
(…)`` Replaces ``countTestCases(…)``

``TestLoader`` attributes
~~~~~~~~~~~~~~~~~~~~~~~~~

``sort_test_methods_using`` Replaces ``sortTestMethodsUsing``
``suite_class`` Replaces ``suiteClass``
``test_method_prefix`` Replaces ``testMethodPrefix``
``get_test_case_names(self, test_case_class)`` Replaces ``getTestCaseNames(self, testCaseClass)``
``load_tests_from_module(…)`` Replaces ``loadTestsFromModule(…)``
``load_tests_from_name(…)`` Replaces ``loadTestsFromName(…)``
``load_tests_from_names(…)`` Replaces ``loadTestsFromNames(…)``
``load_tests_from_test_case(self, test_case_class)`` Replaces ``loadTestsFromTestCase(self, testCaseClass)``

``_TextTestResult`` attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``show_all`` Replaces ``showAll``
``add_error(…)`` Replaces ``addError(…)``
``add_failure(…)`` Replaces ``addFailure(…)``
``add_success(…)`` Replaces ``addSuccess(…)``
``get_description(…)`` Replaces ``getDescription(…)``
``print_error_list(…)`` Replaces ``printErrorList(…)``
``print_errors(…)`` Replaces ``printErrors(…)``
``start_test(…)`` Replaces ``startTest(…)``

``TextTestRunner`` attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``_make_result(…)`` Replaces ``_makeResult(…)``

``TestProgram`` attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~

``__init__(self, module, default_test, argv, test_runner, test_loader)`` Replaces ``__init__(self, module, defaultTest, argv, testRunner, testLoader)``
``create_tests(…)`` Replaces ``createTests(…)``
``parse_args(…)`` Replaces ``parseArgs(…)``
``run_tests(…)`` Replaces ``runTests(…)``
``usage_exit(…)`` Replaces ``usageExit(…)``

Rationale
=========

Redundant names
---------------

The current API, with two or in some cases three different names referencing exactly the same function, leads to an overbroad and redundant API that violates PEP 20 [#PEP-20]_ ("there should be one, and preferably only one, obvious way to do it").

Removal of ``assert*`` names
----------------------------

While there is consensus support to `remove redundant names`_ for the ``TestCase`` test methods, the issue of which set of names should be retained is controversial.
Arguments in favour of retaining only the ``assert*`` names:

* BDFL preference: The BDFL has stated [#vanrossum-1]_ a preference for the ``assert*`` names.

* Precedent: The Python standard library currently uses the ``assert*`` names by a roughly 8:1 majority over the ``fail*`` names (counting unit tests in the py3k tree at 2008-07-15 [#pitrou-1]_). An ad-hoc sampling of other projects that use `unittest` also demonstrates a strong preference for use of the ``assert*`` names [#bennetts-1]_.

* Positive admonition: The ``assert*`` names state the intent of how the code under test *should* behave, while the ``fail*`` names are phrased in terms of how the code *should not* behave.

Arguments in favour of retaining only the ``fail*`` names:

* Explicit is better than implicit: The ``fail*`` names state *what the function will do* explicitly: fail the test. With the ``assert*`` names, the action to be taken is only implicit.

* Avoid false implication: The test methods do not have any necessary connection with the built-in ``assert`` statement. Even the exception raised, while it defaults to ``AssertionError``, is explicitly customisable via the documented ``failure_exception`` attribute. Choosing the ``fail*`` names avoids the false association with either of these. This is exacerbated by the plain-boolean test using a name of ``assert_`` (with a trailing underscore) to avoid a name collision with the built-in ``assert`` statement. The corresponding ``fail_if`` name has no such issue.

PEP 8 names
-----------

Although `unittest` (and its predecessor `PyUnit`) is intended to be familiar to users of other xUnit interfaces, there is no attempt at direct API compatibility, since the only code that Python's `unittest` interfaces with is other Python code. The module is in the standard library and its names should all conform with PEP 8 [#PEP-8]_.
Backwards Compatibility
=======================

The names to be obsoleted should be deprecated and removed according to the schedule for modules in PEP 4 [#PEP-4]_. While deprecated, use of the deprecated attributes should raise a ``DeprecationWarning``, with a message stating which replacement name should be used.

Is any provision being made for a 2to3 fixer/otherwise-automated transition for the changes you propose here? Collin Winter
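The deprecation scheme the PEP describes (old names still work, but raise a ``DeprecationWarning`` naming the replacement) could be implemented with a ``__getattr__`` hook. The following is only an illustrative sketch, not the PEP's actual implementation; the ``_RENAMED`` table is a made-up two-entry subset of the full mapping, and ``_RenamedAttributesMixin``/``Demo`` are hypothetical names.

```python
import warnings

# Hypothetical subset of the old-name -> new-name table from the PEP.
_RENAMED = {
    "countTestCases": "count_test_cases",
    "failUnlessEqual": "assert_equal",
}

class _RenamedAttributesMixin:
    """Resolve obsolete camelCase names, raising DeprecationWarning."""
    def __getattr__(self, name):
        # __getattr__ only runs for attributes not found normally,
        # so new-style names are never intercepted.
        new_name = _RENAMED.get(name)
        if new_name is not None:
            warnings.warn(
                "%s is deprecated; use %s instead" % (name, new_name),
                DeprecationWarning, stacklevel=2)
            return getattr(self, new_name)
        raise AttributeError(name)

class Demo(_RenamedAttributesMixin):
    def count_test_cases(self):
        return 3

demo = Demo()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = demo.countTestCases()  # old name still works...
print(result, caught[0].category.__name__)  # prints: 3 DeprecationWarning
```

A 2to3-style fixer for the same transition would essentially be a mechanical application of the same rename table to attribute accesses in test code.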
Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?
On Tue, Jul 1, 2008 at 7:38 PM, Benjamin Peterson [EMAIL PROTECTED] wrote: On Tue, Jul 1, 2008 at 9:04 PM, Brett Cannon [EMAIL PROTECTED] wrote: I just committed r64651, which is my attempt to add support to fix_imports so that modules that have been split up in 3.0 can be properly fixed. 2to3's test suite passes and all, but I am not sure if I botched it somehow since I did the change slightly blind. Can someone just do a quick check to make sure I did it properly? Also, what order should renames be declared in to give priority to certain renames (e.g., urllib should probably be renamed to urllib.request over urllib.error when not used in a ``from ... import`` statement)? Well, for starters, you know the test for fix_imports is disabled, right? Why was this test disabled, rather than fixed? That seems a rather poor solution to the problem of it taking longer than desired to run. Collin
Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?
On Tue, Jul 1, 2008 at 11:32 PM, Brett Cannon [EMAIL PROTECTED] wrote: On Tue, Jul 1, 2008 at 8:36 PM, Brett Cannon [EMAIL PROTECTED] wrote: On Tue, Jul 1, 2008 at 7:38 PM, Benjamin Peterson [EMAIL PROTECTED] wrote: On Tue, Jul 1, 2008 at 9:04 PM, Brett Cannon [EMAIL PROTECTED] wrote: I just committed r64651, which is my attempt to add support to fix_imports so that modules that have been split up in 3.0 can be properly fixed. 2to3's test suite passes and all, but I am not sure if I botched it somehow since I did the change slightly blind. Can someone just do a quick check to make sure I did it properly? Also, what order should renames be declared in to give priority to certain renames (e.g., urllib should probably be renamed to urllib.request over urllib.error when not used in a ``from ... import`` statement)? Well, for starters, you know the test for fix_imports is disabled, right? Nope, I forgot, and turning it on has it failing under 2.5. And refactor.py cannot be run directly from 2.5 because of a relative import, and in 2.6 (where runpy has extra smarts) it still doesn't work, thanks to main() not being passed an argument it needs (Issue3131). Why are you trying to run refactor.py directly, rather than using 2to3 (http://svn.python.org/view/sandbox/trunk/2to3/2to3) as an entry point? Looks like 2to3 needs some TLC. Agreed. A lot of the pending bugs seem to be related to the version of lib2to3 in the stdlib, rather than the stand-alone product. Neal Norwitz and I have been working to turn parts of 2to3 into a more general refactoring library; once that's done (or even preferably before), lib2to3 should be removed from the stdlib. It's causing far more trouble than it's worth. Collin
Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?
On Wed, Jul 2, 2008 at 12:51 PM, Martin v. Löwis [EMAIL PROTECTED] wrote: Why was this test disabled, rather than fixed? That seems a rather poor solution to the problem of it taking longer than desired to run. I disabled it because I didn't know how to fix it, and created bug reports 2968 and 2969 in return. So you did. I didn't notice them, sorry. It is policy that tests that break get disabled, rather than keeping them broken.
Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?
On Wed, Jul 2, 2008 at 1:09 PM, Martin v. Löwis [EMAIL PROTECTED] wrote: Agreed. A lot of the pending bugs seem to be related to the version of lib2to3 in the stdlib, rather than the stand-alone product. Neal Norwitz and I have been working to turn parts of 2to3 into a more general refactoring library; once that's done (or even preferably before), lib2to3 should be removed from the stdlib. It's causing far more trouble than it's worth. I disagree. I think it is quite useful that distutils is able to invoke it, and other people also asked for that feature at PyCon. But distutils currently *doesn't* invoke it, AFAICT (unless that support is implemented somewhere other than trunk/Lib/distutils/), and no one has stepped up to make that happen in the months since PyCon. Moreover, as I told those people who asked for this at PyCon, 2to3 is not and will never be perfect, meaning that at best, distutils/2to3 integration would look like python setup.py run2to3, where distutils is just a long-hand way of running 2to3 over your code. This strikes me as a waste of time. Why do you think the trouble wouldn't be caused if it wasn't a standard library package? Problems with the current setup:

1) People are currently confused as to where they should commit fixes.
2) Changes to the sandbox version have to be manually merged into the stdlib version, which is more overhead than I think it's worth. In addition, the stdlib version lags the sandbox version.
3) At least one bug report (Issue3131) has mentioned the stdlib 2to3 exhibiting problems that the stand-alone version does not. This is again extra overhead.
4) The test_imports test was commented out because of stdlib test policy. I'd rather not have that policy imposed on 2to3.

Collin
Re: [Python-Dev] lib2to3, need some light on the imports fixer
On Mon, May 12, 2008 at 1:58 PM, Guilherme Polo [EMAIL PROTECTED] wrote: Hello, Would someone tell me how I can add a new entry to the MAPPING dict in lib2to3/fixes/fix_imports.py that does the following: import A gets fixed as import C.D as A? Right now it is fixing it by doing import C.D and changing several other lines in the code to use this new C.D name. I wanted to avoid those changes if possible. I don't believe there's a way to do that, but adding support for it should be fairly straightforward. Assign the patch to me for review. Collin Winter
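To illustrate the transformation being requested, here is a rough, purely string-level sketch (lib2to3's real fixers rewrite a parse tree, not lines of text, and the ``MAPPING`` shown is just an illustrative one-entry subset): rewriting ``import A`` as ``import C.D as A`` binds the new module to the old name, so the rest of the file can keep using ``A`` unchanged.

```python
import re

# Illustrative mapping in the spirit of fix_imports' MAPPING dict.
MAPPING = {
    "urllib2": "urllib.request",
}

def rewrite_import(line):
    """Rewrite 'import A' to 'import C.D as A' for mapped names."""
    match = re.match(r"^import\s+(\w+)\s*$", line)
    if match:
        old = match.group(1)
        new = MAPPING.get(old)
        if new is not None:
            # Aliasing the new name to the old one means no other
            # lines in the module need to be touched.
            return "import %s as %s" % (new, old)
    return line  # anything else is left alone

print(rewrite_import("import urllib2"))  # prints: import urllib.request as urllib2
print(rewrite_import("import os"))       # prints: import os
```

The default fix_imports behaviour described above (rewrite to ``import C.D`` and then update every later use of ``A``) touches many lines; the ``as``-aliasing variant confines the change to the import statement itself.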