Re: [Python-Dev] Unladen swallow status

2010-07-21 Thread Collin Winter
On Wed, Jul 21, 2010 at 2:43 PM, Maciej Fijalkowski fij...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 6:50 PM, Reid Kleckner reid.kleck...@gmail.com 
 wrote:
 On Wed, Jul 21, 2010 at 8:11 AM, Tim Golden m...@timgolden.me.uk wrote:
 Brett suggested that
 the Unladen Swallow merge to trunk was waiting for some work to complete
 on the JIT compiler and Georg, as release manager for 3.2, confirmed that
 Unladen Swallow would not be merged before 3.3.

 Yeah, this has slipped.  I have patches that need review, and Jeff and
 Collin have been distracted with other work.  Hopefully when one of
 them gets around to that, I can proceed with the merge without
 blocking on them.

 Reid

 The merge py3k-jit to trunk?

I believe he's talking about the merger of the Unladen tree into the
py3k-jit branch.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] commit privs

2010-07-12 Thread Collin Winter
On Sun, Jul 11, 2010 at 9:28 AM, Antoine Pitrou solip...@pitrou.net wrote:
 On Sun, 11 Jul 2010 13:23:13 +
 Reid Kleckner reid.kleck...@gmail.com wrote:

 I'm also expecting to be doing more work merging unladen-swallow into
 the py3k-jit branch, so I was wondering if I could get commit
 privileges for that.

 It sounds good to me. Also, thanks for your threading patches!

+1 from me.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Collin Winter
On Mon, Jul 12, 2010 at 8:18 AM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
 On 12/07/2010 15:07, Nick Coghlan wrote:

 On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Apranost...@pearwood.info
  wrote:


 On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote:


 re2 comparison is interesting from the point of if it should be
 included in stdlib.


 Is it re2 or regex? I don't see having 2 regular expression engines
 in the stdlib.


 There's precedence though... the old regex engine and the new re engine
 were side-by-side for many years before regex was deprecated and
 finally removed in 2.5. Hypothetically, re2 could similarly be added to
 the standard library while re is deprecated.


 re2 deliberately omits some features for efficiency reasons, hence is
 not even on the table as a possible replacement for the standard
 library version. If someone is in a position where re2 can solve their
 problems with the re module, they should also be in a position where
 they can track it down for themselves.



 If it has *partial* compatibility, and big enough performance improvements
 for common cases, it could perhaps be used where the regex doesn't use
 unsupported features. This would have some extra cost in the compile phase,
 but would mean Python could ship with two regex engines but only one
 interface exposed to the programmer...

FWIW, this has all been discussed before:
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/3829265. In
particular, I still believe that, it's not obvious that enough Python
regexes would benefit from re2's performance/restrictions tradeoff to
make such a hybrid system worthwhile in the long term. (There is no
representative corpus of real-world Python regexes weighted for
dynamic execution frequency to use in assessing such tradeoffs
empirically like there is for JavaScript.)

Collin

 MRAB's module offers a superset of re's features rather than a subset
 though, so once it has had more of a chance to bake on PyPI it may be
 worth another look.

 Cheers,
 Nick.




 --
 http://www.ironpythoninaction.com/
 http://www.voidspace.org.uk/blog

 READ CAREFULLY. By accepting and reading this email you agree, on behalf of
 your employer, to release me from all obligations and waivers arising from
 any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
 shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
 non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have
 entered into with your employer, its partners, licensors, agents and
 assigns, in perpetuity, without prejudice to my ongoing rights and
 privileges. You further represent that you have the authority to release me
 from any BOGUS AGREEMENTS on behalf of your employer.


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/collinwinter%40google.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread Collin Winter
On Fri, Jul 9, 2010 at 10:28 AM, MRAB pyt...@mrabarnett.plus.com wrote:
 anatoly techtonik wrote:

 On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote:

 Hi all,

 I re-implemented the re module, adding new features and speed
 improvements. It's available at:

   http://pypi.python.org/pypi/regex

 under the name regex so that it can be tried alongside re.

 I'd be interested in any comments or feedback. How does it compare with
 re in terms of speed on real-world data? The benchmarks suggest it
 should be faster, or at worst comparable.

 And where are the benchmarks?
 In particular it would be interesting to see it compared both to re
 from stdlib and re2 from  http://code.google.com/p/re2/

 The benchmarks bm_regex_effbot.py and bm_regex_v8.py both perform
 multiple runs of the tests multiple times, giving just the total times
 for each set. Here are the averages:

 Python26
 BENCHMARK        re         regex      ratio
 bm_regex_effbot  0.135secs  0.083secs  1.63
 bm_regex_v8      0.153secs  0.085secs  1.80


 Python31
 BENCHMARK        re         regex      ratio
 bm_regex_effbot  0.138secs  0.083secs  1.66
 bm_regex_v8      0.170secs  0.091secs  1.87

Out of curiosity, what are the results for the bm_regex_compile benchmark?

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti
 alexan...@peadrop.com wrote:
 Collin Winter wrote a simple optimization pass for cPickle in Unladen
 Swallow [1]. The code reads through the stream and remove all the
 unnecessary PUTs in-place.


 I just noticed the code removes *all* PUT opcodes, regardless if they
 are needed or not. So, this code can only be used if there's no GET in
 the stream (which is unlikely for a large stream). I believe Collin
 made this trade-off for performance reasons. However, it wouldn't be
 hard to make the current code to work like pickletools.optimize().

The optimization pass is only run if you don't use any GETs. The
optimization is also disabled if you're writing to a file-like object.
These tradeoffs were appropriate for the workload I was optimizing
against.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 11:53 AM, Collin Winter collinwin...@google.com wrote:
 On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti
 alexan...@peadrop.com wrote:
 On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti
 alexan...@peadrop.com wrote:
 Collin Winter wrote a simple optimization pass for cPickle in Unladen
 Swallow [1]. The code reads through the stream and remove all the
 unnecessary PUTs in-place.


 I just noticed the code removes *all* PUT opcodes, regardless if they
 are needed or not. So, this code can only be used if there's no GET in
 the stream (which is unlikely for a large stream). I believe Collin
 made this trade-off for performance reasons. However, it wouldn't be
 hard to make the current code to work like pickletools.optimize().

 The optimization pass is only run if you don't use any GETs. The
 optimization is also disabled if you're writing to a file-like object.
 These tradeoffs were appropriate for the workload I was optimizing
 against.

I should add that, adding the necessary bookkeeping to remove only
unused PUTs (instead of the current all-or-nothing scheme) should not
be hard. I'd watch out for a further performance/memory hit; the
pickling benchmarks in the benchmark suite should help assess this.
The current optimization penalizes pickling to speed up unpickling,
which made sense when optimizing pickles that would go into memcache
and be read out 13-15x more often than they were written.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 1:53 PM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin dgindi...@gmail.com wrote:
 This wouldn't help our use case, your code needs the entire pickle
 stream to be in memory, which in our case would be about 475mb, this
 is on top of the 300mb+ data structures that generated the pickle
 stream.


 In that case, the best we could do is a two-pass algorithm to remove
 the unused PUTs. That won't be efficient, but it will satisfy the
 memory constraint. Another solution is to not generate the PUTs at all
 by setting the 'fast' attribute on Pickler. But that won't work if you
 have a recursive structure, or have code that requires that the
 identity of objects to be preserved.

I don't think it's possible in general to remove any PUTs if the
pickle is being written to a file-like object. It is possible to reuse
a single Pickler to pickle multiple objects: this causes the Pickler's
memo dict to be shared between the objects being pickled. If you
pickle foo, bar, and baz, foo may not have any GETs, but bar and baz
may have GETs that reference data added to the memo by foo's PUT
operations. Because you can't know what will be written to the
file-like object later, you can't remove any of the PUT instructions
in this scenario.

This kind of thing is done in real-world code like cvs2svn (which I
broke when I was optimizing cPickle; don't break cvs2svn, it's not fun
to fix :). I added some basic tests for this support in cPython's
Lib/test/pickletester.py.

There might be room for app-specific optimizations that do this, but
I'm not sure it would work for a general-usage cPickle that needs to
stay compatible with the current system.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] interesting article on regex performance

2010-03-12 Thread Collin Winter
On Fri, Mar 12, 2010 at 8:12 AM, Nick Coghlan ncogh...@gmail.com wrote:
[snip]
 To bring this on-topic for python-dev by considering how it could apply
 to Python's default re engine, I think the key issue is that any updates
 to the default engine would need to remain backwards compatible with all
 of the tricks that re2 doesn't support.

 There are major practical problems associated with making such a leap
 directly (Google's re2 engine is in C++ rather than C and we'd have to
 keep the existing implementation around regardless to handle the
 features that re2 doesn't support).

I don't see why C++ would be a deal-breaker in this case, since it
would be restricted to an extension module.

 I would say it is better to let re2 bake for a while and see if anyone
 is motivated to come up with Python bindings for it and release them on
 PyPI.

FWIW, re2 is heavily, heavily used in production at Google.
Stabilizing any proposed Python bindings would be a good idea, but I'm
not sure how much more baking re2's core functionality needs.

 Once that happens (and assuming the bindings earn a good
 reputation), the first step towards integration would be to include a
 See Also in the re module documentation to point people towards the
 more limited (but more robust) regex engine implementation.

 The next step would probably be a hybrid third party library that
 exploits the NFA approach when it can, but resorts to backtracking when
 it has to in order to handle full regex functionality. (Although
 developers would need to be able to disable the backtracking support in
 order to restore re2's guarantees of linear time execution)

We considered such a hybrid approach for Unladen Swallow, but rejected
it on the advice of the V8 team [1]: you end up maintaining two
totally separate, incompatible regex engines; the hybrid system comes
with interesting, possibly unintuitive performance/correctness issues
when bailing from one implementation to another; performance is
unstable as small changes are made to the regexes; and it's not
obvious that enough Python regexes would benefit from re2's
performance/restrictions tradeoff to make such a hybrid system
worthwhile in the long term. (There is no representative corpus of
real-world Python regexes weighted for dynamic execution frequency to
use in assessing such tradeoffs empirically like there is for
JavaScript.)

re2 is very useful when you want to run user-provided regexes and want
to protect your backends against pathological/malicious regex input,
but I'm not sure how applicable it is to Python. I think there are
more promising strategies to improve regex performance, such as
reusing the new JIT infrastructure to JIT-compile regular expressions
to machine code (along the lines of V8's irregexp). Some work has
already been done in this direction, and I'd be thrilled to mentor any
GSoC students interested in working on such a project this summer.

Lastly, anyone interested in working on Python regex performance
should take a look at the regex benchmarks in the standard benchmark
suite [2].

Thanks,
Collin Winter

[1] - 
http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html#c4843826268005492354
[2] - http://hg.python.org/benchmarks/file/5b8fe389710b/performance
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] interesting article on regex performance

2010-03-12 Thread Collin Winter
On Fri, Mar 12, 2010 at 11:29 AM,  s...@pobox.com wrote:

     There are major practical problems associated with making such a leap
     directly (Google's re2 engine is in C++ rather than C and we'd have
     to keep the existing implementation around regardless to handle the
     features that re2 doesn't support).

    Collin I don't see why C++ would be a deal-breaker in this case, since
    Collin it would be restricted to an extension module.

 Traditionally Python has run on some (minority) platforms where C++ was
 unavailable.  While the re module is a dynamically linked extension module
 and thus could be considered optional, I doubt anybody thinks of it as
 optional nowadays.  It's used in the regression test suite anyway.  It would
 be tough to run unit tests on such minority platforms without it.  You'd
 have to maintain both the current sre implementation and the new re2
 implementation for a long while into the future.

re2 is not a full replacement for Python's current regex semantics: it
would only serve as an accelerator for a subset of the current regex
language. Given that, it makes perfect sense that it would be optional
on such minority platforms (much like the incoming JIT).

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching function pointers in type objects

2010-03-03 Thread Collin Winter
Hey Daniel,

On Wed, Mar 3, 2010 at 1:24 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
 On Tue, Mar 2, 2010 at 9:06 PM, Reid Kleckner r...@mit.edu wrote:

 I don't think this will help you solve your problem, but one thing
 we've done in unladen swallow is to hack PyType_Modified to invalidate
 our own descriptor caches.  We may eventually want to extend that into
 a callback interface, but it probably will never be considered an API
 that outside code should depend on.

 Thanks Reid and Benjamin for the information.

 I think I see a way to dramatically speed up PyObject_RichCompareBool when
 comparing immutable, built-in, non-container objects (int, float, str,
 etc.).  It would speed up list.sort when the key is one of those types, as
 well as most operations on the ubiquitous dictionary with str keys.

That definitely sounds worth pursuing.

 Is that a worthwhile avenue to pursue, or is it likely to be redundant with
 Unladen Swallow's optimizations?

I don't believe it will be redundant with the optimizations in Unladen Swallow.

 If I can find time to pursue it, would it be best for me to implement it as
 a patch to Unladen Swallow, CPython trunk, or CPython py3k?

I would recommend patching py3k, with a backport to trunk.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching function pointers in type objects

2010-03-03 Thread Collin Winter
On Wed, Mar 3, 2010 at 2:41 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
 On Wed, Mar 3, 2010 at 4:34 PM, Collin Winter collinwin...@google.com
 wrote:

 I would recommend patching py3k, with a backport to trunk.

 After thinking it over, I'm inclined to patch trunk, so I can run the
 Unladen Swallow macro-benchmarks, then forward-port to py3k.

 I'm correct in understanding that it will be a while before the Unladen
 Swallow benchmarks can support Python 3, right?

That's correct; porting the full benchmark suite to Python 3 will
require projects like Django to support Python 3.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Packaging JIT-less versions of Python

2010-03-02 Thread Collin Winter
Hey David,

On Mon, Mar 1, 2010 at 7:29 PM, David Malcolm dmalc...@redhat.com wrote:
 On Mon, 2010-03-01 at 15:35 -0800, Collin Winter wrote:
[snip]
 - How would you prefer to build the JIT-less package (current options:
 via a ./configure flag; or by deleting _llvmjit.so from the
 JIT-enabled package)?
 - Would the two packages be able to exist side-by-side, or would they
 be mutually exclusive?

 I have a particular interest in ABI compatibility: if turning JIT on and
 off is going to change the ABI of extension modules, that would be a
 major pain, as I hope that we will have dozens of C extension modules
 available via RPM for our Python 3 stack by the time of the great
 unladen merger.

Do you have a good way of testing ABI compatibility, or is it just
build modules, see if they work? Some general way of testing ABI
compatibility would be really useful for PEP 384, too.

 So I'm keen for the ability to toggle the JIT code in the face of bugs
 and have it not affect ABI.  -Xjit will do this at runtime (once
 that's renamed), but I think it would be useful to be able to toggle the
 JIT on/off default during the build, so that I can fix a broken
 architecture for non-technical users, but have individual testers opt
 back in with -Xjit whilst tracking down a major bug.

That's something we can definitely do: you'd just change the default
value for the -Xjit flag from whenhot to never. Those individual
testers would pass -Xjit=whenhot to opt back in. We could make that a
./configure flag if it would be useful to you and the other distros.

 In either case, I don't want to have to recompile 30 extension modules
 to try with/without JIT; that would introduce too much change during
 bug-hunts, and be no fun at all.

That would suck indeed; I want to avoid that. I think that kind of
thing falls under PEP 384, which we will have to obey once it is
accepted.

 (In the blue-sky nirvana future, I'd love to be able to ship
 ahead-of-time compiled versions of the stdlib, pre-optimized based on
 realworld workloads.  Back in my reality, though, I have bugs to fix
 before I can work on _that_ patch :( )

Reid Kleckner may be looking at that for his Master's project. It's
definitely doable.

 My strong preference would be to have the JIT included by default so
 that it receives as much testing as possible.

 Sounds reasonable.  Hope the above made sense and is useful.

Thanks for your perspective,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Packaging JIT-less versions of Python

2010-03-01 Thread Collin Winter
Hey packaging guys,

We recently committed a change to Unladen Swallow [1] that moves all
the JIT infrastructure into a Python extension module. The theory [2]
behind this patch was that this would make it easier for downstream
packagers to ship a JIT-less Python package, with the JIT compiler
available via an optional add-on package.

Some questions for you, so we're not shooting blind here:
- Have you guys thought about how a JIT-enabled Python 3 installation
would be packaged by your respective distros?
- Would you prefer the default python3.x package to have a JIT, or
would you omit the JIT by default?
- How would you prefer to build the JIT-less package (current options:
via a ./configure flag; or by deleting _llvmjit.so from the
JIT-enabled package)?
- Would the two packages be able to exist side-by-side, or would they
be mutually exclusive?

My strong preference would be to have the JIT included by default so
that it receives as much testing as possible.

Thanks,
Collin Winter

[1] - http://code.google.com/p/unladen-swallow/source/detail?r=1110
[2] - http://code.google.com/p/unladen-swallow/issues/detail?id=136
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Mercurial repository for Python benchmarks

2010-02-22 Thread Collin Winter
On Sun, Feb 21, 2010 at 9:43 PM, Collin Winter collinwin...@google.com wrote:
 Hey Daniel,

 On Sun, Feb 21, 2010 at 4:51 PM, Daniel Stutzbach
 dan...@stutzbachenterprises.com wrote:
 On Sun, Feb 21, 2010 at 2:28 PM, Collin Winter collinwin...@google.com
 wrote:

 Would it be possible for us to get a Mercurial repository on
 python.org for the Unladen Swallow benchmarks? Maciej and I would like
 to move the benchmark suite out of Unladen Swallow and into
 python.org, where all implementations can share it and contribute to
 it. PyPy has been adding some benchmarks to their copy of the Unladen
 benchmarks, and we'd like to have as well, and Mercurial seems to be
 an ideal solution to this.

 If and when you have a benchmark repository set up, could you announce it
 via a reply to this thread?  I'd like to check it out.

 Will do.

The benchmarks repository is now available at
http://hg.python.org/benchmarks/. It contains all the benchmarks that
the Unladen Swallow svn repository contains, including the beginnings
of a README.txt that describes the available benchmarks and a
quick-start guide for running perf.py (the main interface to the
benchmarks). This will eventually contain all the information from
http://code.google.com/p/unladen-swallow/wiki/Benchmarks, as well as
guidelines on how to write good benchmarks.

If you have svn commit access, you should be able to run `hg clone
ssh://h...@hg.python.org/repos/benchmarks`. I'm not sure how to get
read-only access; Dirkjan can comment on that.

Still todo:
- Replace the static snapshots of 2to3, Mercurial and other hg-based
projects with clones of the respective repositories.
- Fix the 2to3 and nbody benchmarks to work with Python 2.5 for Jython and PyPy.
- Import some of the benchmarks PyPy has been using.

Any access problems with the hg repo should be directed to Dirkjan.
Thanks so much for getting the repo set up so fast!

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 385 progress report

2010-02-22 Thread Collin Winter
On Sat, Feb 13, 2010 at 2:23 PM, Benjamin Peterson benja...@python.org wrote:
 2010/2/13 Martin v. Löwis mar...@v.loewis.de:
 I still think that the best approach for projects to use 2to3 is to run
 2to3 at install time from a single-source release. For that, projects
 will have to adjust to whatever bugs certain 2to3 releases have, rather
 than requiring users to download a newer version of 2to3 that fixes
 them. For this use case, a tightly-integrated lib2to3 (with that name
 and sole purpose) is the best thing.

 Alright. That is reasonable.

 The other thing is that we will loose some vcs history and some
 history granularity by switching development to the trunk version,
 since just the svnmerged revisions will be converted.

So the consensus is that 2to3 should be pulled out of the main Python
tree? Should the 2to3 hg repository be deleted, then?

Thanks,
Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Mercurial repository for Python benchmarks

2010-02-22 Thread Collin Winter
On Mon, Feb 22, 2010 at 3:17 PM, Collin Winter collinwin...@google.com wrote:
 The benchmarks repository is now available at
 http://hg.python.org/benchmarks/. It contains all the benchmarks that
 the Unladen Swallow svn repository contains, including the beginnings
 of a README.txt that describes the available benchmarks and a
 quick-start guide for running perf.py (the main interface to the
 benchmarks). This will eventually contain all the information from
 http://code.google.com/p/unladen-swallow/wiki/Benchmarks, as well as
 guidelines on how to write good benchmarks.

We now have a Benchmarks component in the bug tracker. Suggestions
for new benchmarks, feature requests for perf.py, and bugs in existing
benchmarks should be reported under that component.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 385 progress report

2010-02-22 Thread Collin Winter
On Mon, Feb 22, 2010 at 4:27 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 The other thing is that we will loose some vcs history and some
 history granularity by switching development to the trunk version,
 since just the svnmerged revisions will be converted.

 So the consensus is that 2to3 should be pulled out of the main Python
 tree?

 Not sure what you mean by pull out; I had expect that the right verb
 should be pull into: 2to3 should be pulled into the main Python tree.

Sorry, I meant pulled out as in: I want an updated version for the
benchmark suite, where should I get that?

 Should the 2to3 hg repository be deleted, then?

 Which one? To my knowledge, there is no official 2to3 repository yet.
 When the switchover happens, 2to3 should not be converted to its own hg
 repository, yes.

This one: http://hg.python.org/2to3

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 385 progress report

2010-02-22 Thread Collin Winter
On Mon, Feb 22, 2010 at 5:03 PM, Nick Coghlan ncogh...@gmail.com wrote:
 Dirkjan Ochtman wrote:
 On Mon, Feb 22, 2010 at 16:09, Collin Winter coll...@gmail.com wrote:
 So the consensus is that 2to3 should be pulled out of the main Python
 tree? Should the 2to3 hg repository be deleted, then?

 Wouldn't the former be reason to officialize the hg repository,
 instead of deleting it?

 I think the difference between pull out and pull from is causing
 confusion here (and no, I'm not sure which of those Collin actually
 meant either).

Sorry, I meant pull from. I want an updated snapshot of 2to3 for the
benchmark suite, and I'm looking for the best place to grab it from.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Mercurial repository for Python benchmarks

2010-02-21 Thread Collin Winter
Hey Dirkjan,

Would it be possible for us to get a Mercurial repository on
python.org for the Unladen Swallow benchmarks? Maciej and I would like
to move the benchmark suite out of Unladen Swallow and into
python.org, where all implementations can share it and contribute to
it. PyPy has been adding some benchmarks to their copy of the Unladen
benchmarks, and we'd like to have as well, and Mercurial seems to be
an ideal solution to this.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Mercurial repository for Python benchmarks

2010-02-21 Thread Collin Winter
On Sun, Feb 21, 2010 at 3:31 PM, Dirkjan Ochtman djc.ocht...@gmail.com wrote:
 Hi Collin (and others),

 On Sun, Feb 21, 2010 at 15:28, Collin Winter collinwin...@google.com wrote:
 Would it be possible for us to get a Mercurial repository on
 python.org for the Unladen Swallow benchmarks? Maciej and I would like
 to move the benchmark suite out of Unladen Swallow and into
 python.org, where all implementations can share it and contribute to
 it. PyPy has been adding some benchmarks to their copy of the Unladen
 benchmarks, and we'd like to have as well, and Mercurial seems to be
 an ideal solution to this.

 Just a repository on hg.python.org?

 Sounds good to me. Are you staying for the sprints? We'll just do it.
 (Might need to figure out some hooks we want to put up with it.)

Yep, that's all we want. I'll be around for the sprints through
Tuesday, sitting at the Unladen Swallow sprint.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Mercurial repository for Python benchmarks

2010-02-21 Thread Collin Winter
Hey Daniel,

On Sun, Feb 21, 2010 at 4:51 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
 On Sun, Feb 21, 2010 at 2:28 PM, Collin Winter collinwin...@google.com
 wrote:

 Would it be possible for us to get a Mercurial repository on
 python.org for the Unladen Swallow benchmarks? Maciej and I would like
 to move the benchmark suite out of Unladen Swallow and into
 python.org, where all implementations can share it and contribute to
 it. PyPy has been adding some benchmarks to their copy of the Unladen
 benchmarks, and we'd like to have as well, and Mercurial seems to be
 an ideal solution to this.

 If and when you have a benchmark repository set up, could you announce it
 via a reply to this thread?  I'd like to check it out.

Will do.

In the meantime, you can read
http://code.google.com/p/unladen-swallow/wiki/Benchmarks to find out
how to check out the current draft of the benchmarks, as well as which
benchmarks are currently included.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-18 Thread Collin Winter
On Sat, Feb 13, 2010 at 12:12 AM, Maciej Fijalkowski fij...@gmail.com wrote:
 I like this wording far more. It's at the very least far more precise.
 Those examples are fair enough (except the fact that PyPy is not 32bit
 x86 only, the JIT is).
[snip]
 slower than US on some workloads is true, while not really telling
 much to a potential reader. For any X and Y implementing the same
 language X is faster than Y on some workloads is usually true.

 To be precise you would need to include the above table in the PEP,
 which is probably a bit too much, given that PEP is not about PyPy at
 all. I'm fine with any wording that is at least correct.

I've updated the language:
http://codereview.appspot.com/186247/diff2/9005:11001/11002. Thanks
for the clarifications.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-12 Thread Collin Winter
Hey Maciej,

On Thu, Feb 11, 2010 at 6:39 AM, Maciej Fijalkowski fij...@gmail.com wrote:
 Snippet from:

 http://codereview.appspot.com/186247/diff2/5014:8003/7002

 *PyPy*: PyPy [#pypy]_ has good performance on numerical code, but is
 slower than Unladen Swallow on non-numerical workloads. PyPy only
 supports 32-bit x86 code generation. It has poor support for CPython
 extension modules, making migration for large applications
 prohibitively expensive.

 That part at the very least has some sort of personal opinion
 prohibitively,

Of course; difficulty is always in the eye of the person doing the
work. Simply put, PyPy is not a drop-in replacement for CPython: there
is no embedding API, much less the same one exported by CPython;
important libraries, such as MySQLdb and pycrypto, do not build
against PyPy; PyPy is 32-bit x86 only.

All of these problems can be overcome with enough time/effort/money,
but I think you'd agree that, if all I'm trying to do is speed up my
application, adding a new x86-64 backend or implementing support for
CPython extension modules is certainly north of prohibitively
expensive. I stand by that wording. I'm willing to enumerate all of
PyPy's deficiencies in this regard in the PEP, rather than the current
vaguer wording, if you'd prefer.

 while the other part is not completely true slower
 than US on non-numerical workloads. Fancy providing a proof for that?
 I'm well aware that there are benchmarks on which PyPy is slower than
 CPython or US, however, I would like a bit more weighted opinion in
 the PEP.

Based on the benchmarks you're running at
http://codespeak.net:8099/plotsummary.html, PyPy is slower than
CPython on many non-numerical workloads, which Unladen Swallow is
faster than CPython at. Looking at the benchmarks there at which PyPy
is faster than CPython, they are primarily numerical; this was the
basis for the wording in the PEP.

My own recent benchmarking of PyPy and Unladen Swallow (both trunk;
PyPy wouldn't run some benchmarks):

| Benchmark| PyPy  | Unladen | Change  |
+==+===+=+=+
| ai   | 0.61  | 0.51|  1.1921x faster |
| django   | 0.68  | 0.8 |  1.1898x slower |
| float| 0.03  | 0.07|  2.7108x slower |
| html5lib | 20.04 | 16.42   |  1.2201x faster |
| pickle   | 17.7  | 1.09| 16.2465x faster |
| rietveld | 1.09  | 0.59|  1.8597x faster |
| slowpickle   | 0.43  | 0.56|  1.2956x slower |
| slowspitfire | 2.5   | 0.63|  3.9853x faster |
| slowunpickle | 0.26  | 0.27|  1.0585x slower |
| unpickle | 28.45 | 0.78| 36.6427x faster |

I'm happy to change the wording to slower than US on some workloads.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-09 Thread Collin Winter
To follow up on some of the open issues:

On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote:
[snip]
 Open Issues
 ===

 - *Code review policy for the ``py3k-jit`` branch.* How does the CPython
  community want us to procede with respect to checkins on the ``py3k-jit``
  branch? Pre-commit reviews? Post-commit reviews?

  Unladen Swallow has enforced pre-commit reviews in our trunk, but we realize
  this may lead to long review/checkin cycles in a purely-volunteer
  organization. We would like a non-Google-affiliated member of the CPython
  development team to review our work for correctness and compatibility, but we
  realize this may not be possible for every commit.

The feedback we've gotten so far is that at most, only larger, more
critical commits should be sent for review, while most commits can
just go into the branch. Is that broadly agreeable to python-dev?

 - *How to link LLVM.* Should we change LLVM to better support shared linking,
  and then use shared linking to link the parts of it we need into CPython?

The consensus has been that we should link shared against LLVM.
Jeffrey Yasskin is now working on this in upstream LLVM. We are
tracking this at
http://code.google.com/p/unladen-swallow/issues/detail?id=130 and
http://llvm.org/PR3201.

 - *Prioritization of remaining issues.* We would like input from the CPython
  development team on how to prioritize the remaining issues in the Unladen
  Swallow codebase. Some issues like memory usage are obviously critical before
  merger with ``py3k``, but others may fall into a nice to have category that
  could be kept for resolution into a future CPython 3.x release.

The big-ticket items here are what we expected: reducing memory usage
and startup time. We also need to improve profiling options, both for
oProfile and cProfile.

 - *Create a C++ style guide.* Should PEP 7 be extended to include C++, or
  should a separate C++ style PEP be created? Unladen Swallow maintains its own
  style guide [#us-styleguide]_, which may serve as a starting point; the
  Unladen Swallow style guide is based on both LLVM's [#llvm-styleguide]_ and
  Google's [#google-styleguide]_ C++ style guides.

Any thoughts on a CPython C++ style guide? My personal preference
would be to extend PEP 7 to cover C++ by taking elements from
http://code.google.com/p/unladen-swallow/wiki/StyleGuide and the LLVM
and Google style guides (which is how we've been developing Unladen
Swallow). If that's broadly agreeable, Jeffrey and I will work on a
patch to PEP 7.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-08 Thread Collin Winter
Hi Craig,

On Tue, Feb 2, 2010 at 4:42 PM, Craig Citro craigci...@gmail.com wrote:
 Done. The diff is at
 http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed
 Cython, Shedskin and a bunch of other alternatives to pure CPython.
 Some of that information is based on conversations I've had with the
 respective developers, and I'd appreciate corrections if I'm out of
 date.


 Well, it's a minor nit, but it might be more fair to say something
 like Cython provides the biggest improvements once type annotations
 are added to the code. After all, Cython is more than happy to take
 arbitrary Python code as input -- it's just much more effective when
 it knows something about types. The code to make Cython handle
 closures has just been merged ... hopefully support for the full
 Python language isn't so far off. (Let me know if you want me to
 actually make a comment on Rietveld ...)

Indeed, you're quite right. I've corrected the description here:
http://codereview.appspot.com/186247/diff2/7005:9001/10001

 Now what's more interesting is whether or not U-S and Cython could
 play off one another -- take a Python program, run it with some
 generic input data under Unladen and record info about which
 functions are hot, and what types they tend to take, then let
 Cython/gcc -O3 have a go at these, and lather, rinse, repeat ... JIT
 compilation and static compilation obviously serve different purposes,
 but I'm curious if there aren't other interesting ways to take
 advantage of both.

Definitely! Someone approached me about possibly reusing the profile
data for a feedback-enhanced code coverage tool, which has interesting
potential, too. I've added a note about this under the Future Work
section: http://codereview.appspot.com/186247/diff2/9001:10002/9003

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] API for VM branches (was: PEP 3146)

2010-02-02 Thread Collin Winter
[Moving to python-ideas; python-dev to bcc]

On Tue, Feb 2, 2010 at 2:02 AM, M.-A. Lemburg m...@egenix.com wrote:
 Collin Winter wrote:
[snip]
 If such a restrictive plugin-based scheme had been available when we
 began Unladen Swallow, I do not doubt that we would have ignored it
 entirely. I do not like the idea of artificially tying the hands of
 people trying to make CPython faster. I do not see any part of Unladen
 Swallow that would have been made easier by such a scheme. If
 anything, it would have made our project more difficult.

 I don't think that it has to be restrictive - much to the contrary,
 it would provide a consistent API to those CPython internals and
 also clarify the separation between the various parts. Something
 which currently does not exist in CPython.

We do not need an API to CPython's internals: we are not interfacing
with them, we are replacing and augmenting them.

 Note that it may be easier for you (and others) to just take
 CPython and patch it as necessary. However, this doesn't relieve
 you from the needed maintenance - which, I presume, is one of the
 reasons why you are suggesting to merge U-S back into CPython ;-)

That is incorrect. In the year we have been working on Unladen
Swallow, we have only updated our vendor branch of CPython 2.6 once,
going from 2.6.1 to 2.6.4. We have occasionally cherrypicked patches
from the 2.6 maintenance branch to fix specific problems. The
maintenance required by upstream CPython changes has been effectively
zero.

We are seeking to merge with CPython for three reasons: 1) verify that
python-dev is interested in this project, and that we are not wasting
our time; 2) expose the codebase to a wider, more heterogenous testing
environment; 3) accelerate development by having more hands on the
code. Upstream maintenance requirements have had zero impact on our
planning.

In any case, I'll be interested in reading your PEP that outlines how
the plugin interface should work, which systems will be pluggable, and
exactly how Unladen Swallow, WPython and Stackless would benefit.
Let's move further discussion of this to python-ideas until there's
something more concrete here. The py3k-jit branch will live long
enough that we could update it to work with a plugin system, assuming
it is demonstrated to be beneficial.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-02 Thread Collin Winter
Hey Dirkjan,

[Circling back to this part of the thread]

On Thu, Jan 21, 2010 at 1:37 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 On Thu, Jan 21, 2010 at 21:14, Collin Winter collinwin...@google.com wrote:
[snip]
 My quick take on Cython and Shedskin is that they are
 useful-but-limited workarounds for CPython's historically-poor
 performance. Shedskin, for example, does not support the entire Python
 language or standard library
 (http://shedskin.googlecode.com/files/shedskin-tutorial-0.3.html).

 Perfect, now put something like this in the PEP, please. ;)

Done. The diff is at
http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed
Cython, Shedskin and a bunch of other alternatives to pure CPython.
Some of that information is based on conversations I've had with the
respective developers, and I'd appreciate corrections if I'm out of
date.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-02 Thread Collin Winter
Hey MA,

On Fri, Jan 29, 2010 at 11:14 AM, M.-A. Lemburg m...@egenix.com wrote:
 Collin Winter wrote:
 I added startup benchmarks for Mercurial and Bazaar yesterday
 (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
 can use them as more macro-ish benchmarks, rather than merely starting
 the CPython binary over and over again. If you have ideas for better
 Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
 hg_startup and bzr_startup benchmarks should give us some more data
 points for measuring improvements in startup time.

 One idea we had for improving startup time for apps like Mercurial was
 to allow the creation of hermetic Python binaries, with all
 necessary modules preloaded. This would be something like Smalltalk
 images. We haven't yet really fleshed out this idea, though.

 In Python you can do the same with the freeze.py utility. See

 http://www.egenix.com/www2002/python/mxCGIPython.html

 for an old project where we basically put the Python
 interpreter and stdlib into a single executable.

 We've recently revisited that project and created something
 we call pyrun. It fits Python 2.5 into a single executable
 and a set of shared modules (which for various reasons cannot
 be linked statically)... 12MB in total.

 If you load lots of modules from the stdlib this does provide
 a significant improvement over standard Python.

Good to know there are options. One feature we had in mind for a
system of this sort would be the ability to take advantage of the
limited/known set of modules in the image to optimize the application
further, similar to link-time optimizations in gcc/LLVM
(http://www.airs.com/blog/archives/100).

 Back to the PEP's proposal:

 Looking at the data you currently have, the negative results
 currently don't really look good in the light of the small
 performance improvements.

The JIT compiler we are offering is more than just its current
performance benefit. An interpreter loop will simply never be as fast
as machine code. An interpreter loop, no matter how well-optimized,
will hit a performance ceiling and before that ceiling will run into
diminishing returns. Machine code is a more versatile optimization
target, and as such, allows many optimizations that would be
impossible or prohibitively difficult in an interpreter.

Unladen Swallow offers a platform to extract increasing performance
for years to come. The current generation of modern, JIT-based
JavaScript engines are instructive in this regard: V8 (which I'm most
familiar with) delivers consistently improving performance
release-over-release (see the graphs at the top of
http://googleblog.blogspot.com/2009/09/google-chrome-after-year-sporting-new.html).
 I'd like to see CPython be able to achieve the same thing, like the
new implementations of JavaScript and Ruby are able to do.

We are aware that Unladen Swallow is not finished; that's why we're
not asking to go into py3k directly. Unladen Swallow's memory usage
will continue to decrease, and its performance will only go up. The
current state is not its permanent state; I'd hate to see the perfect
become the enemy of the good.

 Wouldn't it be possible to have the compiler approach work
 in three phases in order to reduce the memory footprint and
 startup time hit, ie.

  1. run an instrumented Python interpreter to collect all
    the needed compiler information; write this information into
    a .pys file (Python stats)

  2. create compiled versions of the code for various often
    used code paths and type combinations by reading the
    .pys file and generating an .so file as regular
    Python extension module

  3. run an uninstrumented Python interpreter and let it
    use the .so files instead of the .py ones

 In production, you'd then only use step 3 and avoid the
 overhead of steps 1 and 2.

That is certainly a possibility if we are unable to reduce memory
usage to a satisfactory level. I've added a Contingency Plans
section to the PEP, including this option:
http://codereview.appspot.com/186247/diff2/8004:7005/8006.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Collin Winter
Hey MA,

On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg m...@egenix.com wrote:
 BTW: Some years ago we discussed the idea of pluggable VMs for
 Python. Wouldn't U-S be a good motivation to revisit this idea ?

 We could then have a VM based on byte code using a stack
 machines, one based on word code using a register machine
 and perhaps one that uses the Stackless approach.

What is the usecase for having pluggable VMs? Is the idea that, at
runtime, the user would select which virtual machine they want to run
their code under? How would the user make that determination
intelligently?

I think this idea underestimates a) how deeply the current CPython VM
is intertwined with the rest of the implementation, and b) the nature
of the changes required by these separate VMs. For example, Unladen
Swallow adds fields to the C-level structs for dicts, code objects and
frame objects; how would those changes be pluggable? Stackless
requires so many modifications that it is effectively a fork; how
would those changes be pluggable?

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Collin Winter
On Mon, Feb 1, 2010 at 11:17 AM, M.-A. Lemburg m...@egenix.com wrote:
 Collin Winter wrote:
 I think this idea underestimates a) how deeply the current CPython VM
 is intertwined with the rest of the implementation, and b) the nature
 of the changes required by these separate VMs. For example, Unladen
 Swallow adds fields to the C-level structs for dicts, code objects and
 frame objects; how would those changes be pluggable? Stackless
 requires so many modifications that it is effectively a fork; how
 would those changes be pluggable?

 They wouldn't be pluggable. Such changes would have to be made
 in a more general way in order to serve more than just one VM.

I believe these VMs would have little overlap. I cannot imagine that
Unladen Swallow's needs have much in common with Stackless's, or with
those of a hypothetical register machine to replace the current stack
machine.

Let's consider that last example in more detail: a register machine
would require completely different bytecode. This would require
replacing the bytecode compiler, the peephole optimizer, and the
bytecode eval loop. The frame object would need to be changed to hold
the registers and a new blockstack design; the code object would have
to potentially hold a new bytecode layout.

I suppose making all this pluggable would be possible, but I don't see
the point. This kind of experimentation is ideal for a branch: go off,
test your idea, report your findings, merge back. Let the branch be
long-lived, if need be. The Mercurial migration will make all this
easier.

 Getting the right would certainly require a major effort, but it
 would also reduce the need to have several branches of C-based
 Python implementations.

If such a restrictive plugin-based scheme had been available when we
began Unladen Swallow, I do not doubt that we would have ignored it
entirely. I do not like the idea of artificially tying the hands of
people trying to make CPython faster. I do not see any part of Unladen
Swallow that would have been made easier by such a scheme. If
anything, it would have made our project more difficult.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-29 Thread Collin Winter
On Fri, Jan 29, 2010 at 7:22 AM, Nick Coghlan ncogh...@gmail.com wrote:
 Antoine Pitrou wrote:
 Or you could submit patches piecewise on http://bugs.python.org
 I think the first step would be to switch to 16-bit bytecodes. It would be
 uncontroversial (the increase in code size probably has no negative effect) 
 and
 would provide the foundation for all of your optimizations.

 I wouldn't consider changing from bytecode to wordcode uncontroversial -
 the potential to have an effect on cache hit ratios means it needs to be
 benchmarked (the U-S performance tests should be helpful there).

 It's the same basic problem where any changes to the ceval loop can have
 surprising performance effects due to the way they affect the compiled
 switch statements ability to fit into the cache and other low level
 processor weirdness.

Agreed. We originally switched Unladen Swallow to wordcode in our
2009Q1 release, and saw a performance improvement from this across the
board. We switched back to bytecode for the JIT compiler to make
upstream merger easier. The Unladen Swallow benchmark suite should
provided a thorough assessment of the impact of the wordcode -
bytecode switch. This would be complementary to a JIT compiler, rather
than a replacement for it.

I would note that the switch will introduce incompatibilities with
libraries like Twisted. IIRC, Twisted has a traceback prettifier that
removes its trampoline functions from the traceback, parsing CPython's
bytecode in the process. If running under CPython, it assumes that the
bytecode is as it expects. We broke this in Unladen's wordcode switch.
I think parsing bytecode is a bad idea, but any switch to wordcode
should be advertised widely.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-29 Thread Collin Winter
Hey Terry,

On Fri, Jan 29, 2010 at 2:47 PM, Terry Reedy tjre...@udel.edu wrote:
 On 1/29/2010 4:19 PM, Collin Winter wrote:

 On Fri, Jan 29, 2010 at 7:22 AM, Nick Coghlanncogh...@gmail.com  wrote:

 Agreed. We originally switched Unladen Swallow to wordcode in our
 2009Q1 release, and saw a performance improvement from this across the
 board. We switched back to bytecode for the JIT compiler to make
 upstream merger easier. The Unladen Swallow benchmark suite should
 provided a thorough assessment of the impact of the wordcode -
 bytecode switch. This would be complementary to a JIT compiler, rather
 than a replacement for it.

 I would note that the switch will introduce incompatibilities with
 libraries like Twisted. IIRC, Twisted has a traceback prettifier that
 removes its trampoline functions from the traceback, parsing CPython's
 bytecode in the process. If running under CPython, it assumes that the
 bytecode is as it expects. We broke this in Unladen's wordcode switch.
 I think parsing bytecode is a bad idea, but any switch to wordcode
 should be advertised widely.

 Several years, there was serious consideration of switching to a
 registerbased vm, which would have been even more of a change. Since I
 learned 1.4, Guido has consistently insisted that the CPython vm is not part
 of the language definition and, as far as I know, he has rejected any
 byte-code hackery in the stdlib. While he is not one to, say, randomly
 permute the codes just to frustrate such hacks, I believe he has always
 considered vm details private and subject to change and any usage thereof
 'at one's own risk'.

No, I agree entirely: bytecode is an implementation detail that could
be changed at any time. But like reference counting, it's an
implementation detail that people have -- for better or worse -- come
to rely on. My only point was that a switch to wordcode should be
announced prominently in the release notes and not assumed to be
without impact on user code. That people are directly munging CPython
bytecode means that CPython should provide a better, more abstract way
to do the same thing that's more resistant to these kinds of changes.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-27 Thread Collin Winter
Hi William,

On Wed, Jan 27, 2010 at 7:26 AM, William Dode w...@flibuste.net wrote:
 Hi (as a simple user),

 I'd like to know why you didn't followed the same way as V8 Javascript,
 or the opposite, why for V8 they didn't choose llvm ?

 I imagine that startup time and memory was also critical for V8.

Startup time and memory usage are arguably *more* critical for a
Javascript implementation, since if you only spend a few milliseconds
executing Javascript code, but your engine takes 10-20ms to startup,
then you've lost. Also, a minimized memory profile is important if you
plan to embed your JS engine on a mobile platform, for example, or you
need to run in a heavily-multiprocessed browser on low-memory consumer
desktops and netbooks.

Among other reasons we chose LLVM, we didn't want to write code
generators for each platform we were targeting. LLVM has done this for
us. V8, on the other hand, has to implement a new code generator for
each new platform they want to target. This is non-trivial work: it
takes a long time, has a lot of finicky details, and it greatly
increases the maintenance burden on the team. We felt that requiring
python-dev to understand code generation on multiple platforms was a
distraction from what python-dev is trying to do -- develop Python. V8
still doesn't have x86-64 code generation working on Windows
(http://code.google.com/p/v8/issues/detail?id=330), so I wouldn't
underestimate the time required for that kind of project.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-27 Thread Collin Winter
, but will sometimes cherrypick
upstream patches that we want (since doing the full vendor merge can
take a while). Right now, we're using an unmodified snapshot of LLVM.

I've added language to the PEP to clarify some of these points:
- No in-tree copies of LLVM/Clang:
http://codereview.appspot.com/186247/diff2/5004:5006/5007
- Shared linking of LLVM:
http://codereview.appspot.com/186247/diff2/5006:6007/5008

I've filed http://code.google.com/p/unladen-swallow/issues/detail?id=130
so that we have our own issue to track this, in addition to the
upstream LLVM bug (http://llvm.org/PR3201).

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-27 Thread Collin Winter
Hi William,

On Wed, Jan 27, 2010 at 11:02 AM, William Dode w...@flibuste.net wrote:
 The startup time and memory comsumption are a limitation of llvm that
 their developers plan to resolve or is it only specific to the current
 python integration ? I mean the work to correct this is more on U-S or
 on llvm ?

Part of it is LLVM, part of it is Unladen Swallow. LLVM is very
flexible, and there's a price for that. We have also found and fixed
several cases of quadratic memory usage in LLVM optimization passes,
and there may be more of those lurking around. On the Unladen Swallow
side, there are doubtless things we can do to improve our usage of
LLVM; http://code.google.com/p/unladen-swallow/issues/detail?id=68 has
most of our work on this, and there are still more ideas to implement.

Part of the issue is that Unladen Swallow is using LLVM's JIT
infrastructure in ways that it really hasn't been used before, and so
there's a fair amount of low-hanging fruit left in LLVM that no-one
has needed to pick yet.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-26 Thread Collin Winter
Hi Cesare,

On Tue, Jan 26, 2010 at 12:29 AM, Cesare Di Mauro
cesare.di.ma...@gmail.com wrote:
 Hi Collin,

 One more question: is it easy to support more opcodes, or a different opcode
 structure, in Unladen Swallow project?

I assume you're asking about integrating WPython. Yes, adding new
opcodes to Unladen Swallow is still pretty easy. The PEP includes a
section on this,
http://www.python.org/dev/peps/pep-3146/#experimenting-with-changes-to-python-or-cpython-bytecode,
though it doesn't cover something more complex like converting from
bytecode to wordcode, as a purely hypothetical example ;) Let me know
if that section is unclear or needs more data.

Converting from bytecode to wordcode should be relatively
straightforward, assuming that the arrangement of opcode arguments is
the main change. I believe the only real place you would need to
update is the JIT compiler's bytecode iterator (see
http://code.google.com/p/unladen-swallow/source/browse/trunk/Util/PyBytecodeIterator.cc).
Depending on the nature of the changes, the runtime feedback system
might need to be updated, too, but it wouldn't be too difficult, and
the changes should be localized.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-26 Thread Collin Winter
Hey Martin,

On Thu, Jan 21, 2010 at 2:25 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Reid Kleckner wrote:
 On Thu, Jan 21, 2010 at 4:34 PM, Martin v. Löwis mar...@v.loewis.de 
 wrote:
 How large is the LLVM shared library? One surprising data point is that the
 binary is much larger than some of the memory footprint measurements given 
 in
 the PEP.
 Could it be that you need to strip the binary, or otherwise remove
 unneeded debug information?

 Python is always built with debug information (-g), at least it was in
 2.6.1 which unladen is based off of, and we've made sure to build LLVM
 the same way.  We had to muck with the LLVM build system to get it to
 include debugging information.  On my system, stripping the python
 binary takes it from 82 MB to 9.7 MB.  So yes, it contains extra debug
 info, which explains the footprint measurements.  The question is
 whether we want LLVM built with debug info or not.

 Ok, so if 70MB are debug information, I think a lot of the concerns are
 removed:
 - debug information doesn't consume any main memory, as it doesn't get
  mapped when the process is started.
 - debug information also doesn't take up space in the system
  distributions, as they distribute stripped binaries.

 As 10MB is still 10 times as large as a current Python binary,people
 will probably search for ways to reduce that further, or at least split
 it up into pieces.

70MB of the increase was indeed debug information. Since the Linux
distros that I checked ship stripped Python binaries, I've stripped
the Unladen Swallow binaries as well, and while the size increase is
still significant, it's not as large as it once was.

Stripped CPython 2.6.4: 1.3 MB
Stripped CPython 3.1.1: 1.4 MB
Stripped Unladen r1041: 12 MB

A 9x increase is better than a 20x increase, but it's not great,
either. There is still room to trim the set of LLVM libraries used by
Unladen Swallow, and we're continuing to investigate reducing on-disk
binary size (http://code.google.com/p/unladen-swallow/issues/detail?id=118
tracks this).

I've updated the PEP to reflect this configuration, since it's what
most users will pick up via their system package managers. The exact
change to the PEP wording is
http://codereview.appspot.com/186247/diff2/6001:6003/5002.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hi Floris,

On Sun, Jan 24, 2010 at 3:40 AM, Floris Bruynooghe
floris.bruynoo...@gmail.com wrote:
 On Sat, Jan 23, 2010 at 10:09:14PM +0100, Cesare Di Mauro wrote:
 Introducing C++ is a big step, also. Aside the problems it can bring on some
 platforms, it means that C++ can now be used by CPython developers. It
 doesn't make sense to force people use C for everything but the JIT part. In
 the end, CPython could become a mix of C and C++ code, so a bit more
 difficult to understand and manage.

 Introducing C++ is a big step, but I disagree that it means C++ should
 be allowed in the other CPython code.  C++ can be problematic on more
 obscure platforms (certainly when static initialisers are used) and
 being able to build a python without C++ (no JIT/LLVM) would be a huge
 benefit, effectively having the option to build an old-style CPython
 at compile time.  (This is why I ased about --without-llvm being able
 not to link with libstdc++).

I'm working on a patch to completely remove all traces of C++ with
configured with --without-llvm. It's a straightforward change, and
should present no difficulties.

For reference, what are these obscure platforms where static
initializers cause problems?

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hi Cesare,

On Sat, Jan 23, 2010 at 1:09 PM, Cesare Di Mauro
cesare.di.ma...@gmail.com wrote:
 Hi Collin

 IMO it'll be better to make Unladen Swallow project a module, to be
 installed and used if needed, so demanding to users the choice of having it
 or not. The same way psyco does, indeed.
 Nowadays it requires too much memory, longer loading time, and fat binaries
 for not-so-great performances. I know that some issues have being worked on,
 but I don't think that they'll show something comparable to the current
 CPython status.

You're proposing that, even once the issues of memory usage and
startup time are addressed, Unladen Swallow should still be an
extension module? I don't see why. You're assuming that these issues
cannot be fixed, which I disagree with.

I think maintaining something like a JIT compiler out-of-line, as
Psyco is, causes long-term maintainability problems. Such extension
modules are forever playing catchup with the CPython code, depending
on implementation details that the CPython developers are right to
regard as open to change. It also limits what kind of optimizations
you can implement or forces those optimizations to be implemented with
workarounds that might be suboptimal or fragile. I'd recommend reading
the Psyco codebase, if you haven't yet.

As others have requested, we are working hard to minimize the impact
of the JIT so that it can be turned off entirely at runtime. We have
an active issue tracking our progress at
http://code.google.com/p/unladen-swallow/issues/detail?id=123.

 Introducing C++ is a big step, also. Aside the problems it can bring on some
 platforms, it means that C++ can now be used by CPython developers.

Which platforms, specifically? What is it about C++ on those platforms
that is problematic? Can you please provide details?

 It
 doesn't make sense to force people use C for everything but the JIT part. In
 the end, CPython could become a mix of C and C++ code, so a bit more
 difficult to understand and manage.

Whether CPython should allow wider usage of C++ or whether developer
should be force[d] to use C is not our decision, and is not part of
this PEP. With the exception of Python/eval.c, we deliberately have
not converted any CPython code to C++ so that if you're not working on
the JIT, python-dev's workflow remains the same. Even within eval.cc,
the only C++ parts are related to the JIT, and so disappear completely
with configured with --without-llvm (or if you're not working on the
JIT).

In any case, developers can easily tell which language to use based on
file extension. The compiler errors that would result from compiling
C++ with a C compiler would be a good indication as well.

 What I see is that LLVM is a too big project for the goal of having just a
 JIT-ed Python VM. It can be surely easier to use and integrate into CPython,
 but requires too much resources

Which resources do you feel that LLVM would tax, machine resources or
developer resources? Are you referring to the portions of LLVM used by
Unladen Swallow, or the entire wider LLVM project, including the
pieces Unladen Swallow doesn't use at runtime?

 (on the contrary, Psyco demands little
 resources, give very good performances, but seems to be like a mess to
 manage and extend).

This is not my experience. For the workloads I have experience with,
Psyco doubles memory usage while only providing a 15-30% speed
improvement. Psyco's benefits are not uniform.

Unladen Swallow has been designed to be much more maintainable and
easier to extend and modify than Psyco: the compiler and its attendant
optimizations are well-tested (see Lib/test/test_llvm.py, for one) and
well-documented (see Python/llvm_notes.txt for one). I think that the
project is bearing out the success of our design: Google's full-time
engineers are a small minority on the project at this point, and
almost all performance-improving patches are coming from non-Google
developers.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-25 Thread Collin Winter
Hey Floris,

On Mon, Jan 25, 2010 at 1:25 PM, Floris Bruynooghe
floris.bruynoo...@gmail.com wrote:
 On Mon, Jan 25, 2010 at 10:14:35AM -0800, Collin Winter wrote:
 I'm working on a patch to completely remove all traces of C++ with
 configured with --without-llvm. It's a straightforward change, and
 should present no difficulties.

 Great to hear that, thanks for caring.

This has now been resolved. As of
http://code.google.com/p/unladen-swallow/source/detail?r=1036,
./configure --without-llvm has no dependency on libstdc++:

Before: $ otool -L ./python.exe
./python.exe:
/usr/lib/libSystem.B.dylib
/usr/lib/libstdc++.6.dylib
/usr/lib/libgcc_s.1.dylib


After: $ otool -L ./python.exe
./python.exe:
/usr/lib/libSystem.B.dylib
/usr/lib/libgcc_s.1.dylib

I've explicitly noted this in the PEP (see
http://codereview.appspot.com/186247/diff2/2001:4001/5001).

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-22 Thread Collin Winter
Hey Tony,

On Fri, Jan 22, 2010 at 10:11 AM, Tony Nelson
tonynel...@georgeanelson.com wrote:
 On 10-01-22 02:53:21, Collin Winter wrote:
 On Thu, Jan 21, 2010 at 11:37 PM, Glyph Lefkowitz
 gl...@twistedmatrix.com wrote:
 
  On Jan 21, 2010, at 6:48 PM, Collin Winter wrote:
  ...
  There's been a recent thread on our mailing list about a patch that
  dramatically reduces the memory footprint of multiprocess
  concurrency by separating reference counts from objects. ...

 Currently, CPython gets a performance advantage from having reference
 counts hot in the cache when the referenced object is used.  There is
 still the write pressure from updating the counts.  With separate
 reference counts, an extra cache line must be loaded from memory (it is
 unlikely to be in the cache unless the program is trivial).  I see from
 the referenced posting that this is a 10% speed hit (the poster
 attributes the hit to extra instructions).

 Perhaps the speed and memory hits could be minimized by only doing this
 for some objects?  Only objects that are fully shared (such as read-
 only data) benefit from this change.  I don't know but shared objects
 may already be treated separately.

One option that we discussed was to create a ./configure flag to
toggle between inline refcounts and separate refcounts. Advanced users
that care about the memory usage of multiprocess concurrency could
compile their own CPython binary to enable this space optimization at
the cost of some performance.

On the other hand, once we get enough performance out of the JIT that
python-dev is willing to take a 10% hit, then I'd say we should just
turn the space optimization on by default. In the meantime, though, a
configure flag would be a useful intermediate point for a number of
people.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-22 Thread Collin Winter
Hey Jake,

On Thu, Jan 21, 2010 at 10:48 AM, Jake McGuire mcgu...@google.com wrote:
 On Thu, Jan 21, 2010 at 10:19 AM, Reid Kleckner r...@mit.edu wrote:
 On Thu, Jan 21, 2010 at 12:27 PM, Jake McGuire mcgu...@google.com wrote:
 On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com 
 wrote:
 Profiling
 -

 Unladen Swallow integrates with oProfile 0.9.4 and newer [#oprofile]_ to 
 support
 assembly-level profiling on Linux systems. This means that oProfile will
 correctly symbolize JIT-compiled functions in its reports.

 Do the current python profiling tools (profile/cProfile/pstats) still
 work with Unladen Swallow?

 Sort of.  They disable the use of JITed code, so they don't quite work
 the way you would want them to.  Checking tstate-c_tracefunc every
 line generated too much code.  They still give you a rough idea of
 where your application hotspots are, though, which I think is
 acceptable.

 Hmm.  So cProfile doesn't break, but it causes code to run under a
 completely different execution model so the numbers it produces are
 not connected to reality?

 We've found the call graph and associated execution time information
 from cProfile to be extremely useful for understanding performance
 issues and tracking down regressions.  Giving that up would be a huge
 blow.

FWIW, cProfile's call graph information is still perfectly accurate,
but you're right: turning on cProfile does trigger execution under a
different codepath. That's regrettable, but instrumentation-based
profiling is always going to introduce skew into your numbers. That's
why we opted to improve oProfile, since we believe sampling-based
profiling to be a better model.

Profiling was problematic to support in machine code because in
Python, you can turn profiling on from user code at arbitrary points.
To correctly support that, we would need to add lots of hooks to the
generated code to check whether profiling is enabled, and if so, call
out to the profiler. Those is profiling enabled now? checks are
(almost) always going to be false, which means we spend cycles for no
real benefit.

Can YouTube use oProfile for profiling, or is instrumented profiling
critical? oProfile does have its downsides for profiling user code:
you see all the C-language support functions, not just the pure-Python
functions. That extra data might be useful, but it's probably more
information than most people want. YouTube might want it, though.

Assuming YouTube can't use oProfile as-is, there are some options:
- Write a script around oProfile's reporting tool to strip out all C
functions from the report. Enhance oProfile to fix any deficiencies
compared to cProfile's reporting.
- Develop a sampling profiler for Python that only samples pure-Python
functions, ignoring C code (but including JIT-compiled Python code).
- Add the necessary profiling hooks to JITted code to better support
cProfile, but add a command-line flag (something explicit like -O3)
that removes the hooks and activates the current behaviour (or
something even more restrictive, possibly).
- Initially compile Python code without the hooks, but have a
trip-wire set to detect the installation of profiling hooks. When
profiling hooks are installed, purge all machine code from the system
and recompile all hot functions to include the profiling hooks.

Thoughts?

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyCon Keynote

2010-01-21 Thread Collin Winter
On Thu, Jan 21, 2010 at 8:37 AM, Kortatu glorybo...@gmail.com wrote:
 Hi!

 For me, could be very interesting something about Unladen Swallow, and your
 opinion about JIT compilers.

FWIW, there will be a separate talk about Unladen Swallow at PyCon. I
for one would like to hear Guido talk about something else :)

Collin

 2010/1/21 Michael Foord fuzzy...@voidspace.org.uk

 On 21/01/2010 15:03, Thomas Wouters wrote:

 On Wed, Jan 13, 2010 at 19:51, Guido van Rossum gu...@python.org wrote:

 Please mail me topics you'd like to hear me talk about in my keynote
 at PyCon this year.

 How about something completely different... ?

 Your history of Python stuff has been really interesting.


 I'd like to hear you lay to rest that nonsense about you retiring :

 Well, ditto. :-)

 All the best,

 Michael


 --
 Thomas Wouters tho...@python.org

 Hi! I'm a .signature virus! copy me into your .signature file to help me
 spread!

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk


 --
 http://www.ironpythoninaction.com/
 http://www.voidspace.org.uk/blog

 READ CAREFULLY. By accepting and reading this email you agree, on behalf
 of your employer, to release me from all obligations and waivers arising
 from any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
 shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
 non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have
 entered into with your employer, its partners, licensors, agents and
 assigns, in perpetuity, without prejudice to my ongoing rights and
 privileges. You further represent that you have the authority to release me
 from any BOGUS AGREEMENTS on behalf of your employer.


 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/gloryboy84%40gmail.com



 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/collinw%40gmail.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hi Dirkjan,

On Wed, Jan 20, 2010 at 10:55 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 On Thu, Jan 21, 2010 at 02:56, Collin Winter collinwin...@google.com wrote:
 Agreed. We are actively working to improve the startup time penalty.
 We're interested in getting guidance from the CPython community as to
 what kind of a startup slow down would be sufficient in exchange for
 greater runtime performance.

 For some apps (like Mercurial, which I happen to sometimes hack on),
 increased startup time really sucks. We already have our demandimport
 code (I believe bzr has something similar) to try and delay imports,
 to prevent us spending time on imports we don't need. Maybe it would
 be possible to do something like that in u-s? It could possibly also
 keep track of the thorny issues, like imports where there's an except
 ImportError that can do fallbacks.

I added startup benchmarks for Mercurial and Bazaar yesterday
(http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
can use them as more macro-ish benchmarks, rather than merely starting
the CPython binary over and over again. If you have ideas for better
Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
hg_startup and bzr_startup benchmarks should give us some more data
points for measuring improvements in startup time.

One idea we had for improving startup time for apps like Mercurial was
to allow the creation of hermetic Python binaries, with all
necessary modules preloaded. This would be something like Smalltalk
images. We haven't yet really fleshed out this idea, though.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hey Dirkjan,

On Thu, Jan 21, 2010 at 11:16 AM, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 On Thu, Jan 21, 2010 at 18:32, Collin Winter collinwin...@google.com wrote:
 I added startup benchmarks for Mercurial and Bazaar yesterday
 (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
 can use them as more macro-ish benchmarks, rather than merely starting
 the CPython binary over and over again. If you have ideas for better
 Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
 hg_startup and bzr_startup benchmarks should give us some more data
 points for measuring improvements in startup time.

 Sounds good! I seem to remember from a while ago that you included the
 Mercurial test suite in your performance tests, but maybe those were
 the correctness tests rather than the performance tests (or maybe I'm
 just mistaken). I didn't see any mention of that in the proto-PEP, in
 any case.

We used to run the Mercurial correctness tests at every revision, but
they were incredibly slow and a bit flaky under CPython 2.6. Bazaar's
tests were faster, but were flakier, so we ended up disabling them,
too. We only run these tests occasionally.

 One idea we had for improving startup time for apps like Mercurial was
 to allow the creation of hermetic Python binaries, with all
 necessary modules preloaded. This would be something like Smalltalk
 images. We haven't yet really fleshed out this idea, though.

 Yeah, that might be interesting. I think V8 can do something similar, right?

Correct; V8 loads a pre-compiled image of its builtins to reduce startup time.

 What I personally would consider interesting for the PEP is a (not too
 big) section evaluating where other Python-performance efforts are at.
 E.g. does it make sense to propose a u-s merge now when, by the time
 3.3 (or whatever) is released, there'll be a very good PyPy that
 sports memory usage competitive for embedded development (already does
 right now, I think) and a good tracing JIT? Or when we can compile
 Python using Cython, or Shedskin -- probably not as likely; but I
 think it might be worth assessing the landscape a bit before this huge
 change is implemented.

I can definitely work on that.
http://codespeak.net:8099/plotsummary.html should give you a quick
starting point for PyPy's performance. My reading of those graphs is
that it does very well on heavily-numerical workloads, but is much
slower than CPython on more diverse workloads. When I initially
benchmarked PyPy vs CPython last year, PyPy was 3-5x slower on
non-numerical workloads, and 60x slower on one benchmark (./perf.py -b
pickle,unpickle, IIRC).

My quick take on Cython and Shedskin is that they are
useful-but-limited workarounds for CPython's historically-poor
performance. Shedskin, for example, does not support the entire Python
language or standard library
(http://shedskin.googlecode.com/files/shedskin-tutorial-0.3.html).
Cython is a super-set of Python, and files annotated for maximum
Cython performance are no longer valid Python code, and will not run
on any other Python implementation. The advantage of using an
integrated JIT compiler is that we can support Python-as-specified,
without workarounds or changes in workflow. The compiler can observe
which parts of user code are static (or static-ish) and take advantage
of that, without the manual annotations needed by Cython. Cython is
good for writing extension modules without worrying about the details
of reference counting, etc, but I don't see it as an either-or
alternative for a JIT compiler.

 P.S. Is there any chance of LLVM doing something like tracing JITs?
 Those seem somewhat more promising to me (even though I understand
 they're quite hard in the face of Python features like stack frames).

Yes, you could implement a tracing JIT with LLVM. We chose a
function-at-a-time JIT because it would a) be an easy-to-implement
baseline to measure future improvement, and b) create much of the
infrastructure for a future tracing JIT. Implementing a tracing JIT
that crosses the C/Python boundary would be interesting.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hey Greg,

On Wed, Jan 20, 2010 at 10:54 PM, Gregory P. Smith g...@krypto.org wrote:
 +1
 My biggest concern is memory usage but it sounds like addressing that is
 already in your mind.  I don't so much mind an additional up front constant
 and per-line-of-code hit for instrumentation but leaks are unacceptable.
  Any instrumentation data or jit caches should be managed (and tunable at
 run time when possible and it makes sense).

Reducing memory usage is a high priority. One thing being worked on
right now is to avoid collecting runtime data for functions that will
never be considered hot. That's one leak in the current
implementation.

 I think having a run time flag (or environment variable for those who like
 that) to disable the use of JIT at python3 execution time would be a good
 idea.

Yep, we already have a -j flag that supports don't ever use the JIT
(-j never), use the JIT when you think you should (-j whenhot), and
always the use the JIT (-j always) options. I'll mention this in the
PEP (we'll clearly need to make this an -X option before merger).

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hey Barry,

On Thu, Jan 21, 2010 at 3:34 AM, Barry Warsaw ba...@python.org wrote:
 On Jan 20, 2010, at 11:05 PM, Jack Diederich wrote:

Does disabling the LLVM change binary compatibility between modules
targeted at the same version?  At tonight's Boston PIG we had some
binary package maintainers but most people (including myself) only
cared about source compatibility.    I assume linux distros care about
binary compatibility _a lot_.

 A few questions come to mind:

 1. What are the implications for PEP 384 (Stable ABI) if U-S is added?

PEP 384 looks to be incomplete at this writing, but reading the
section Structures, it says


Only the following structures and structure fields are accessible to
applications:

- PyObject (ob_refcnt, ob_type)
- PyVarObject (ob_base, ob_size)
- Py_buffer (buf, obj, len, itemsize, readonly, ndim, shape, strides,
suboffsets, smalltable, internal)
- PyMethodDef (ml_name, ml_meth, ml_flags, ml_doc)
- PyMemberDef (name, type, offset, flags, doc)
- PyGetSetDef (name, get, set, doc, closure)


Of these, the only one we have changed is PyMethodDef, and then to add
two fields to the end of the structure. We have changed other types
(dicts and code come to mind), but I believe we have only appended
fields and not deleted or reordered existing fields. I don't believe
that introducing the Unladen Swallow JIT will make maintaining a
stable ABI per PEP 384 more difficult. We've been careful about not
exporting any C++ symbols via PyAPI_FUNC(), so I don't believe that
will be an issue either, but Jeffrey can comment more deeply on this
issue.

If PEP 384 is accepted, I'd like it to include a testing strategy so
that we can be sure that we haven't accidentally broken ABI
compatibility. That testing should ideally be automated.

 2. What effect does requiring C++ have on the embedded applications across the
   set of platforms that Python is currently compatible on?  In a previous
   life I had to integrate a C++ library with Python as an embedded language
   and had lots of problems on some OSes (IIRC Solaris and Windows) getting
   all the necessary components to link properly.

To be clear, you're talking about embedding Python in a C/C++
application/library?

We have successfully integrated Unladen Swallow into a large C++
application that uses Python as an embedded scripting language. There
were no special issues or restrictions that I had to overcome to do
this. If you have any applications/libraries in particular that you'd
like me to test, I'd be happy to do that.

 3. Will the U-S bits come with a roadmap to the code?  It seems like this is
   dropping a big black box of code on the Python developers, and I would want
   to reduce the learning curve as much as possible.

Yes; there is 
http://code.google.com/p/unladen-swallow/source/browse/trunk/Python/llvm_notes.txt,
which goes into developer-level detail about various optimizations and
subsystems. We have other documentation in the Unladen Swallow wiki
that is being merged into llvm_notes.txt. Simply dropping this code
onto python-dev without a guide to it would be unacceptable.
llvm_notes.txt also details available instrumentation, useful to
CPython developers who are investigating performance changes.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter collinwin...@google.com wrote:
[snip]
 Incremental builds, however, are significantly slower. The table below shows
 incremental rebuild times after touching ``Objects/listobject.c``.

 +-+---+---+--+
 | Incr make   | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r988 |
 +=+===+===+==+
 | Run 1       | 0m1.854s      | 0m1.456s      | 0m24.464s            |
 +-+---+---+--+
 | Run 2       | 0m1.437s      | 0m1.442s      | 0m24.416s            |
 +-+---+---+--+
 | Run 3       | 0m1.440s      | 0m1.425s      | 0m24.352s            |
 +-+---+---+--+

http://code.google.com/p/unladen-swallow/source/detail?r=1015 has
significantly improved this situation. The new table of incremental
build times:

+-+---+---+---+
| Incr make   | CPython 2.6.4 | CPython 3.1.1 | Unladen Swallow r1024 |
+=+===+===+===+
| Run 1   | 0m1.854s  | 0m1.456s  | 0m6.680s  |
+-+---+---+---+
| Run 2   | 0m1.437s  | 0m1.442s  | 0m5.310s  |
+-+---+---+---+
| Run 3   | 0m1.440s  | 0m1.425s  | 0m7.639s  |
+-+---+---+---+

The remaining increase is from statically linking LLVM into libpython.

PEP updated: http://codereview.appspot.com/186247/diff2/1:4/5

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hi Paul,

On Thu, Jan 21, 2010 at 12:56 PM, Paul Moore p.f.mo...@gmail.com wrote:
 I'm concerned about the memory and startup time penalties. It's nice
 that you're working on them - I'd like to see them remain a priority.
 Ultimately a *lot* of people use Python for short-running transient
 commands (not just adhoc scripts, think hg log) and startup time and
 memory penalties can really hurt there.

I think final merger from the proposed py3k-jit branch into py3k
should block on reducing the startup and memory usage penalties.
Improving startup time is high on my list of priorities, and I think
there's a fair amount of low-hanging fruit there. I've just updated
http://code.google.com/p/unladen-swallow/issues/detail?id=64 with some
ideas that have recently come up to improve startup time, as well
results from the recently-added hg_startup and bzr_startup benchmarks.
I'll also update the PEP with these benchmark results, since they're
important to a lot of people.

 Windows compatibility is a big deal to me. And IMHO, it's a great
 strength of Python at the moment that it has solid Windows support. I
 would be strongly *against* this PEP if it was going to be Unix or
 Linux only. As it is, I have concerns that Windows could suffer from
 the common none of the developers use Windows, but we do our best
 problem. I'm hoping that having U-S integrated into the core will mean
 that there will be more Windows developers able to contribute and
 alleviate that problem.

One of our contributors, James Abbatiello (cc'd), has done a bang-up
job of making Unladen Swallow work on Windows. My understanding from
his last update is that Unladen Swallow works well on Windows, but he
can comment further as to the precise state of Windows support and any
remaining challenges faced on that platform, if any.

 One question - once Unladen Swallow is integrated, will Google's
 support (in terms of dedicated developer time) remain? If not, I'd
 rather see more of the potential gains realised before integration, as
 otherwise it could be a long time before it happens. Ideally, I'd like
 to see a commitment from Google - otherwise the cynic in me is
 inclined to say no until the suggested speed benefits have
 materialised and only then accept U-S for integration. Less cynically,
 it's clear that there's quite a way to go before the key advertised
 benefits of U-S are achieved, and I don't want the project to lose
 steam before it gets there.

While this decision is not mine, I don't believe our director would be
open to an open-ended commitment of full-time Google engineering
resources, though we still have another few quarters of engineering
time allocated to the Google team (myself, Jeffrey Yasskin). At this
point, the clear majority of Unladen Swallow's performance patches are
coming from the non-Google developers on the project (Jeffrey and I
are mostly working on infrastructure), and I believe that pattern will
continue once the py3k-jit branch is established.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
On Thu, Jan 21, 2010 at 2:24 PM, Collin Winter collinwin...@google.com wrote:
 I'll also update the PEP with these benchmark results, since they're
 important to a lot of people.

Done; see http://codereview.appspot.com/186247/diff2/4:8/9 for the
wording change and new startup data.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hey Antoine,

On Thu, Jan 21, 2010 at 4:25 AM, Antoine Pitrou solip...@pitrou.net wrote:
 The increased memory usage comes from a) LLVM code generation, analysis
 and optimization libraries; b) native code; c) memory usage issues or
 leaks in LLVM; d) data structures needed to optimize and generate
 machine code; e) as-yet uncategorized other sources.

 Does the increase in memory occupation disappear when the JIT is disabled
 from the command-line?

It does not disappear, but it is significantly reduced. Running our
django benchmark against three different configurations gives me these
max memory usage numbers:

CPython 2.6.4: 8508 kb
Unladen Swallow default: 26768 kb
Unladen Swallow -j never: 15144 kb

-j never is Unladen Swallow's flag to disable JIT compilation.

As it stands right now, -j never gives a 1.76x reduction in memory
usage, but is still 1.77x larger than CPython. It occurs to me that
we're still doing a lot of LLVM-side initialization and setup that we
don't need to do under -j never. We're also collecting runtime
feedback in the eval loop, which is yet more memory usage. Optimizing
this mode has not yet been a priority for us, but it seems to be the
emerging consensus of python-dev that we need to give -j never some
more love. There's a lot of low-hanging fruit there.

I've added this information to
http://code.google.com/p/unladen-swallow/issues/detail?id=123, which
is our issue tracking -j never improvements.

 Do you think LLVM might suffer from a lot of memory leaks?

I don't know that it suffers from a lot of memory leaks, though we
have certainly observed and fixed quadratic memory usage in some of
the optimization passes. We've fixed all the memory leaks that
Google's internal heapchecker has found.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
On Thu, Jan 21, 2010 at 10:14 AM, Reid Kleckner r...@mit.edu wrote:
 On Thu, Jan 21, 2010 at 9:35 AM, Floris Bruynooghe
 floris.bruynoo...@gmail.com wrote:
 I just compiled with the --without-llvm option and see that the
 binary, while only an acceptable 4.1M, still links with libstdc++.  Is
 it possible to completely get rid of the C++ dependency if this option
 is used?  Introducing a C++ dependency on all platforms for no
 additional benefit (with --without-llvm) seems like a bad tradeoff to
 me.

 There isn't (and shouldn't be) any real source-level dependency on
 libstdc++ when LLVM is turned off.  However, the eval loop is now
 compiled as C++, and that may be adding some hidden dependency
 (exception handling code?).  The final binary is linked with $(CXX),
 which adds an implicit -lstdc++, I think.  Someone just has to go and
 track this down.

We've opened http://code.google.com/p/unladen-swallow/issues/detail?id=124
to track this issue. It should be straight-forward to fix.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
Hey Glyph,

On Thu, Jan 21, 2010 at 9:11 AM, Glyph Lefkowitz
gl...@twistedmatrix.com wrote:
 It would be hard for me to put an exact number on what I would find 
 acceptable, but I was really hoping that we could get a *reduced* memory 
 footprint in the long term.

 My real concern here is not absolute memory usage, but usage for each 
 additional Python process on a system; even if Python supported fast, 
 GIL-free multithreading, I'd still prefer the additional isolation of 
 multiprocess concurrency.  As it currently stands, starting cores+1 Python 
 processes can start to really hurt, especially in many-core-low-RAM 
 environments like the Playstation 3.

 So, if memory usage went up by 20%, but per-interpreter overhead were 
 decreased by more than that, I'd personally be happy.

There's been a recent thread on our mailing list about a patch that
dramatically reduces the memory footprint of multiprocess concurrency
by separating reference counts from objects. We're looking at possibly
incorporating this work into Unladen Swallow, though I think it should
really go into upstream CPython first (since it's largely orthogonal
to the JIT work). You can see the thread here:
http://groups.google.com/group/unladen-swallow/browse_thread/thread/21d7248e8279b328/2343816abd1bd669

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
On Thu, Jan 21, 2010 at 12:20 PM, Collin Winter collinwin...@google.com wrote:
 Hey Greg,

 On Wed, Jan 20, 2010 at 10:54 PM, Gregory P. Smith g...@krypto.org wrote:
 I think having a run time flag (or environment variable for those who like
 that) to disable the use of JIT at python3 execution time would be a good
 idea.

 Yep, we already have a -j flag that supports don't ever use the JIT
 (-j never), use the JIT when you think you should (-j whenhot), and
 always the use the JIT (-j always) options. I'll mention this in the
 PEP (we'll clearly need to make this an -X option before merger).

FYI, I just committed
http://code.google.com/p/unladen-swallow/source/detail?r=1027, which
dramatically improves the performance of Unladen Swallow when running
with `-j never`, making disabling the JIT at runtime more viable.
We're continuing to make progress minimizing the impact of the JIT
when running under `-j never`. Progress can be tracked at
http://code.google.com/p/unladen-swallow/issues/detail?id=123.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-21 Thread Collin Winter
On Thu, Jan 21, 2010 at 11:37 PM, Glyph Lefkowitz
gl...@twistedmatrix.com wrote:

 On Jan 21, 2010, at 6:48 PM, Collin Winter wrote:

 Hey Glyph,

 There's been a recent thread on our mailing list about a patch that
 dramatically reduces the memory footprint of multiprocess concurrency
 by separating reference counts from objects. We're looking at possibly
 incorporating this work into Unladen Swallow, though I think it should
 really go into upstream CPython first (since it's largely orthogonal
 to the JIT work). You can see the thread here:
 http://groups.google.com/group/unladen-swallow/browse_thread/thread/21d7248e8279b328/2343816abd1bd669

 AWESOME.
 Thanks for the pointer.  I read through both of the threads but I didn't see
 any numbers on savings-per-multi-process.  Do you have any?

The data I've seen comes from
http://groups.google.com/group/comp.lang.python/msg/c18b671f2c4fef9e:


This test code[1] consumes roughly 2G of RAM on an x86_64 with python
2.6.1, with the patch, it *should* use 2.3G of RAM (as specified by
its output), so you can see the footprint overhead... but better page
sharing makes it consume about 6 times less - roughly 400M... which is
the size of the dataset. Ie: near-optimal data sharing.


Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Collin Winter
 in exchange for
greater runtime performance.

 I guess what I am mainly saying is that there are several possible ways to
 speed up Python 3 execution (including others not mentioned here) and it is
 not at all clear to me that this particular one is in any sense 'best of
 breed'. If it disables other approaches, I think it should be optional for
 the standard PSF distribution.

We considered the three approaches you mentioned (Psyco, changing the
language, using function annotations), but found them unworkable or
inapplicable to Google's needs. Adding a just-in-time compiler to
Python 2.6 while designing our changes for ease of portability to
Python 3 made more sense for our environment, and we believe, is more
applicable to the environments of other Python consumers and
better-suited to the future roadmap of CPython.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3003 - Python Language Moratorium

2009-11-05 Thread Collin Winter
On Thu, Nov 5, 2009 at 10:35 AM, Dino Viehland di...@microsoft.com wrote:
 Stefan wrote:
 It /does/ make some static assumptions in that it considers builtins
 true
 builtins. However, it does not prevent you from replacing them in your
 code, as long as you do it inside the module. Certainly a restriction
 compared to Python, where you can import a module into a changed dict
 environment that redefines 'object', but not a major restriction IMO,
 and certainly not one that impacts much code.

 To me this is a deal breaker which prevents Cython from being a Python
 implementation.  From a talk given by Colin Winter at the LLVM dev meeting
 (http://llvm.org/devmtg/2009-10/) it seems like Unladen Swallow wanted to
 do something like this as well and Guido said no.  In this case the breaking
 change is so subtle that I'd personally hate to run into something like
 this porting code to Cython and having to figure out why it's not working.

To clarify, I was joking when I told that story (or at least I was
joking with Guido when I asked him if we could break that). It clearly
*would* be easier if we could just ignore this point of Python
compatibility, but that's not an option, so we've had to optimize
around it. It's not that hard to do, but it's still extra work.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A wordcode-based Python

2009-11-04 Thread Collin Winter
On Wed, Nov 4, 2009 at 4:20 AM, Mart Sõmermaa mrts.py...@gmail.com wrote:
 On Tue, May 12, 2009 at 8:54 AM, Cesare Di Mauro
 cesare.dima...@a-tono.com wrote:
 Also, I checked out wpython at head to run Unladen Swallow's
 benchmarks against it, but it refuses to compile with either gcc 4.0.1
 or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build
 failures off-list, if you're interested.

 Thanks,
 Collin Winter

 I'm very interested, thanks. That's because I worked only on Windows
 machines, so I definitely need to test and fix it to let it run on any other
 platform.

 Cesare

 Re-animating an old discussion -- Cesare, any news on the wpython front?

 I did a checkout from http://wpython.googlecode.com/svn/trunk and
 was able to ./configure and make successfully on my 64-bit Linux box
 as well as to run the Unladen benchmarks.

 Given svn co http://svn.python.org/projects/python/tags/r261 in py261
 and svn co http://wpython.googlecode.com/svn/trunk in wpy,

 $ python unladen-tests/perf.py -rm --benchmarks=-2to3,all py261/python
 wpy/python

Do note that the --track_memory option to perf.py imposes some
overhead that interferes with the performance figures. I'd recommend
running the benchmarks again without --track_memory. That extra
overhead is almost certainly what's causing some of the variability in
the results.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 2to3, 3to2: official status

2009-11-04 Thread Collin Winter
Hi Ben,

On Wed, Nov 4, 2009 at 6:49 PM, Ben Finney ben+pyt...@benfinney.id.au wrote:
 Martin v. Löwis mar...@v.loewis.de writes:

 Ben Finney wrote:
  Martin v. Löwis mar...@v.loewis.de writes:
 
  Well, 3to2 would then be an option for you: use Python 3 as the
  source language.
 
  I was under the impression that 2to3 was officially supported as
  part of Python, but 3to2 was a third-party tool. […] Is it an
  official part of Python?

 No, the status is exactly as you describe it.

 Okay. It's probably best for anyone with their Python developer hat on
 (which, in this forum, is all the time for any Python developer) to make
 the status of 3to2 clear when recommending it to people concerned about
 future plans.

Are you implying that we shouldn't recommend 3to2 to people wanting to
develop in Py3k and back-translate to 2.x?

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Reworking the GIL

2009-10-26 Thread Collin Winter
On Sun, Oct 25, 2009 at 1:22 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Having other people test it would be fine. Even better if you have an
 actual multi-threaded py3k application. But ccbench results for other
 OSes would be nice too :-)

My results for an 2.4 GHz Intel Core 2 Duo MacBook Pro (OS X 10.5.8):

Control (py3k @ r75723)

--- Throughput ---

Pi calculation (Python)

threads=1: 633 iterations/s.
threads=2: 468 ( 74 %)
threads=3: 443 ( 70 %)
threads=4: 442 ( 69 %)

regular expression (C)

threads=1: 281 iterations/s.
threads=2: 282 ( 100 %)
threads=3: 282 ( 100 %)
threads=4: 282 ( 100 %)

bz2 compression (C)

threads=1: 379 iterations/s.
threads=2: 735 ( 193 %)
threads=3: 733 ( 193 %)
threads=4: 724 ( 190 %)

--- Latency ---

Background CPU task: Pi calculation (Python)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 1 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 3 ms. (std dev: 6 ms.)
CPU threads=4: 2 ms. (std dev: 3 ms.)

Background CPU task: regular expression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 975 ms. (std dev: 577 ms.)
CPU threads=2: 1035 ms. (std dev: 571 ms.)
CPU threads=3: 1098 ms. (std dev: 556 ms.)
CPU threads=4: 1195 ms. (std dev: 557 ms.)

Background CPU task: bz2 compression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 4 ms. (std dev: 5 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 4 ms.)



Experiment (newgil branch @ r75723)

--- Throughput ---

Pi calculation (Python)

threads=1: 651 iterations/s.
threads=2: 643 ( 98 %)
threads=3: 637 ( 97 %)
threads=4: 625 ( 95 %)

regular expression (C)

threads=1: 298 iterations/s.
threads=2: 296 ( 99 %)
threads=3: 288 ( 96 %)
threads=4: 287 ( 96 %)

bz2 compression (C)

threads=1: 378 iterations/s.
threads=2: 720 ( 190 %)
threads=3: 724 ( 191 %)
threads=4: 718 ( 189 %)

--- Latency ---

Background CPU task: Pi calculation (Python)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 1 ms.)
CPU threads=2: 0 ms. (std dev: 1 ms.)
CPU threads=3: 0 ms. (std dev: 0 ms.)
CPU threads=4: 1 ms. (std dev: 5 ms.)

Background CPU task: regular expression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 1 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 1 ms.)
CPU threads=3: 2 ms. (std dev: 2 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)

CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 0 ms.)


I also ran this through Unladen Swallow's threading microbenchmark,
which is a straight copy of what David Beazley was experimenting with
(simply iterating over 100 ints in pure Python) [1].
iterative_count is doing the loops one after the other,
threaded_count is doing the loops in parallel using threads.

The results below are benchmarking py3k as the control, newgil as the
experiment. When it says x% faster, that is a measure of newgil's
performance over py3k's.

With two threads:

iterative_count:
Min: 0.336573 - 0.387782: 13.21% slower  # I've run this
configuration multiple times and gotten the same slowdown.
Avg: 0.338473 - 0.418559: 19.13% slower
Significant (t=-38.434785, a=0.95)

threaded_count:
Min: 0.529859 - 0.397134: 33.42% faster
Avg: 0.581786 - 0.429933: 35.32% faster
Significant (t=70.100445, a=0.95)


With four threads:

iterative_count:
Min: 0.766617 - 0.734354: 4.39% faster
Avg: 0.771954 - 0.751374: 2.74% faster
Significant (t=22.164103, a=0.95)
Stddev: 0.00262 - 0.00891: 70.53% larger

threaded_count:
Min: 1.175750 - 0.829181: 41.80% faster
Avg: 1.224157 - 0.867506: 41.11% faster
Significant (t=161.715477, a=0.95)
Stddev: 0.01900 - 0.01120: 69.65% smaller


With eight threads:

iterative_count:
Min: 1.527794 - 1.447421: 5.55% faster
Avg: 1.536911 - 1.479940: 3.85% faster
Significant (t=35.559595, a=0.95)
Stddev: 0.00394 - 0.01553: 74.61% larger

threaded_count:
Min: 2.424553 - 1.677180: 44.56% faster
Avg: 2.484922 - 1.723093: 44.21% faster
Significant (t=184.766131, a=0.95)
Stddev: 0.02874 - 0.02956: 2.78% larger


I'd be interested in multithreaded benchmarks with less-homogenous workloads.

Collin Winter

[1] - 
http://code.google.com/p/unladen-swallow/source/browse/tests/performance/bm_threading.py
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Reworking the GIL

2009-10-26 Thread Collin Winter
On Mon, Oct 26, 2009 at 2:43 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Collin Winter collinw at gmail.com writes:
 [the Dave Beazley benchmark]
 The results below are benchmarking py3k as the control, newgil as the
 experiment. When it says x% faster, that is a measure of newgil's
 performance over py3k's.

 With two threads:

 iterative_count:
 Min: 0.336573 - 0.387782: 13.21% slower  # I've run this
 configuration multiple times and gotten the same slowdown.
 Avg: 0.338473 - 0.418559: 19.13% slower

 Those numbers are not very in line with the other iterative_count results.
 Since iterative_count just runs the loop N times in a row, results should be
 proportional to the number N (number of threads).

 Besides, there's no reason for single-threaded performance to be degraded 
 since
 the fast path of the eval loop actually got a bit streamlined (there is no
 volatile ticker to decrement).

I agree those numbers are out of line with the others and make no
sense. I've run it with two threads several times and the results are
consistent on this machine. I'm digging into it a bit more.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fast Implementation for ZIP decryption

2009-08-30 Thread Collin Winter
On Sun, Aug 30, 2009 at 7:34 AM, Shashank
Singhshashank.sunny.si...@gmail.com wrote:
 just to give you an idea of the speed up:

 a 3.3 mb zip file extracted using the current all-python implementation on
 my machine (win xp 1.67Ghz 1.5GB)
 takes approximately 38 seconds.

 the same file when extracted using c implementation takes 0.4 seconds.

Are there any applications/frameworks which have zip files on their
critical path, where this kind of (admittedly impressive) speedup
would be beneficial? What was the motivation for writing the C
version?

Collin Winter

 On Sun, Aug 30, 2009 at 6:35 PM, exar...@twistedmatrix.com wrote:

 On 12:59 pm, st...@pearwood.info wrote:

 On Sun, 30 Aug 2009 06:55:33 pm Martin v. Löwis wrote:

  Does it sound worthy enough to create a patch for and integrate
  into python itself?

 Probably not, given that people think that the algorithm itself is
 fairly useless.

 I would think that for most people, the threat model isn't the CIA is
 reading my files but my little brother or nosey co-worker is reading
 my files, and for that, zip encryption with a good password is
 probably perfectly adequate. E.g. OpenOffice uses it for
 password-protected documents.

 Given that Python already supports ZIP decryption (as it should), are
 there any reasons to prefer the current pure-Python implementation over
 a faster version?

 Given that the use case is protect my biology homework from my little
 brother, how fast does the implementation really need to be?  Is speeding
 it up from 0.1 seconds to 0.001 seconds worth the potential new problems
 that come with more C code (more code to maintain, less portability to other
 runtimes, potential for interpreter crashes or even arbitrary code execution
 vulnerabilities from specially crafted files)?

 Jean-Paul

 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/shashank.sunny.singh%40gmail.com



 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/collinw%40gmail.com


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Anyone against having a loop option for regrtest?

2009-06-29 Thread Collin Winter
On Mon, Jun 29, 2009 at 6:59 PM, Jesse Nollerjnol...@gmail.com wrote:
 Something that's been helping me squirrel out wacky and fun bugs
 in multiprocessing is running the tests in a loop - sometimes hundreds
 of times. Right now, I hack this up with a bash script, but I'm
 sitting here wondering if adding a loop for x iterations option to
 regrtest.py would be useful to others as well.

 Any thoughts? Does anyone hate this idea with the power of a thousand suns?

+1 for having this in regrtest. I've wished for this in the past, and
ended up going the bash route, same as you.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] draft pep: backwards compatibility

2009-06-20 Thread Collin Winter
On Thu, Jun 18, 2009 at 7:17 PM, Benjamin Petersonbenja...@python.org wrote:
[snip]
 Backwards Compatibility Rules
 =

 This policy applys to all public APIs.  These include the C-API, the standard
 library, and the core language including syntax and operation as defined by 
 the
 reference manual.

 This is the basic policy for backwards compatibility:

 * The behavior of an API *must* not change between any two consecutive 
 releases.

Is this intended to include performance changes? Clearly no-one will
complain if things simply get faster, but I'm thinking about cases
where, say, a function runs in half the time but uses double the
memory (or vice versa).

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A wordcode-based Python

2009-05-12 Thread Collin Winter
On Tue, May 12, 2009 at 4:45 AM, Cesare Di Mauro
cesare.dima...@a-tono.com wrote:
 Another note. Fredrik Johansson let me note just few minutes ago that I've
 compiled my sources without PGO optimizations enabled.

 That's because I used Visual Studio Express Edition.

 So another gain in performances can be obtained. :)

FWIW, Unladen Swallow experimented with gcc 4.4's FDO and got an
additional 10-30% (depending on the benchmark). The training load is
important, though: some training sets offered better performance than
others. I'd be interested in how MSVC's PGO compares to gcc's FDO in
terms of overall effectiveness. The results for gcc FDO with our
2009Q1 release are at the bottom of
http://code.google.com/p/unladen-swallow/wiki/Releases.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Shorter release schedule?

2009-05-12 Thread Collin Winter
On Tue, May 12, 2009 at 3:06 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Hello,

 Just food for thought here, but seeing how 3.1 is going to be a real 
 featureful
 schedule despite being released shortly after 3.0, wouldn't it make sense to
 tighten future release planning a little? I was thinking something like doing 
 a
 major release every 12 months (rather than 18 to 24 months as has been
 heuristically the case lately). This could also imply switching to some kind 
 of
 loosely time-based release system.

I'd be in favor of a shorter, 12-month release cycle. I think the
limiting resource would be the time and energy of the release managers
and the package builders for Windows, etc. Provided it's not a tax on
the release staff, I think shorter release cycles would be a benefit
to the community. My own experience with time-based releases at work
is that it greatly helps focus energy and attention, knowing that you
can't simply delay the release if you slack off on your features/bugs.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A wordcode-based Python

2009-05-11 Thread Collin Winter
Hi Cesare,

On Mon, May 11, 2009 at 11:00 AM, Cesare Di Mauro
cesare.dima...@a-tono.com wrote:
 At the last PyCon3 at Italy I've presented a new Python implementation,
 which you'll find at http://code.google.com/p/wpython/

Good to see some more attention on Python performance! There's quite a
bit going on in your changes; do you have an
optimization-by-optimization breakdown, to give an idea about how much
performance each optimization gives?

Looking over the slides, I see that you still need to implement
functionality to make test_trace pass, for example; do you have a
notion of how much performance it will cost to implement the rest of
Python's semantics in these areas?

Also, I checked out wpython at head to run Unladen Swallow's
benchmarks against it, but it refuses to compile with either gcc 4.0.1
or 4.3.1 on Linux (fails in Python/ast.c). I can send you the build
failures off-list, if you're interested.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rethinking intern() and its data structure

2009-04-09 Thread Collin Winter
Hi John,

On Thu, Apr 9, 2009 at 8:02 AM, John Arbash Meinel
j...@arbash-meinel.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 I've been doing some memory profiling of my application, and I've found
 some interesting results with how intern() works. I was pretty surprised
 to see that the interned dict was actually consuming a significant
 amount of total memory.
 To give the specific values, after doing:
  bzr branch A B
 of a small project, the total memory consumption is ~21MB

[snip]

 Anyway, I the internals of intern() could be done a bit better. Here are
 some concrete things:

[snip]

Memory usage is definitely something we're interested in improving.
Since you've already looked at this in some detail, could you try
implementing one or two of your ideas and see if it makes a difference
in memory consumption? Changing from a dict to a set looks promising,
and should be a fairly self-contained way of starting on this. If it
works, please post the patch on http://bugs.python.org with your
results and assign it to me for review.

Thanks,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rethinking intern() and its data structure

2009-04-09 Thread Collin Winter
On Thu, Apr 9, 2009 at 9:34 AM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
 ...

 Anyway, I the internals of intern() could be done a bit better. Here are
 some concrete things:


 [snip]

 Memory usage is definitely something we're interested in improving.
 Since you've already looked at this in some detail, could you try
 implementing one or two of your ideas and see if it makes a difference
 in memory consumption? Changing from a dict to a set looks promising,
 and should be a fairly self-contained way of starting on this. If it
 works, please post the patch on http://bugs.python.org with your
 results and assign it to me for review.

 Thanks,
 Collin Winter

 (I did end up subscribing, just with a different email address :)

 What is the best branch to start working from? trunk?

That's a good place to start, yes. If the idea works well, we'll want
to port it to the py3k branch, too, but that can wait.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Rethinking intern() and its data structure

2009-04-09 Thread Collin Winter
On Thu, Apr 9, 2009 at 6:24 PM, John Arbash Meinel
john.arbash.mei...@gmail.com wrote:
 Greg Ewing wrote:
 John Arbash Meinel wrote:
 And the way intern is currently
 written, there is a third cost when the item doesn't exist yet, which is
 another lookup to insert the object.

 That's even rarer still, since it only happens the first
 time you load a piece of code that uses a given variable
 name anywhere in any module.


 Somewhat true, though I know it happens 25k times during startup of
 bzr... And I would be a *lot* happier if startup time was 100ms instead
 of 400ms.

Quite so. We have a number of internal tools, and they find that
frequently just starting up Python takes several times the duration of
the actual work unit itself. I'd be very interested to review any
patches you come up with to improve start-up time; so far on this
thread, there's been a lot of theory and not much practice. I'd
approach this iteratively: first replace the dict with a set, then if
that bears fruit, consider a customized data structure; if that bears
fruit, etc.

Good luck, and be sure to let us know what you find,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] core python tests

2009-04-04 Thread Collin Winter
On Sat, Apr 4, 2009 at 7:33 AM, Michael Foord fuzzy...@voidspace.org.uk wrote:
 Antoine Pitrou wrote:

 Nick Coghlan ncoghlan at gmail.com writes:


 C. Titus Brown wrote:


 I vote for a separate mailing list -- 'python-tests'? -- but I don't
 know exactly how splintered to make the conversation.  It probably
 belongs at python.org but if you want me to host it, I can.


 If too many things get moved off to SIGs there won't be anything left
 for python-dev to talk about ;)


 There is already an stdlib-sig, which has been almost unused.



 stdlib-sig isn't *quite* right (the testing and benchmarking are as much
 about core python as the stdlib) - although we could view the benchmarks and
 tests themselves as part of the standard library...

 Either way we should get it underway. Collin and Jeffrey - happy to use
 stdlib-sig?

Works for me.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyDict_SetItem hook

2009-04-03 Thread Collin Winter
On Fri, Apr 3, 2009 at 2:27 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Thomas Wouters thomas at python.org writes:


 Pystone is pretty much a useless benchmark. If it measures anything, it's the
 speed of the bytecode dispatcher (and it doesn't measure it particularly 
 well.)
 PyBench isn't any better, in my experience.

 I don't think pybench is useless. It gives a lot of performance data about
 crucial internal operations of the interpreter. It is of course very little
 real-world, but conversely makes you know immediately where a performance
 regression has happened. (by contrast, if you witness a regression in a
 high-level benchmark, you still have a lot of investigation to do to find out
 where exactly something bad happened)

 Perhaps someone should start maintaining a suite of benchmarks, high-level and
 low-level; we currently have them all scattered around (pybench, pystone,
 stringbench, richard, iobench, and the various Unladen Swallow benchmarks; not
 to mention other third-party stuff that can be found in e.g. the Computer
 Language Shootout).

Already in the works :)

As part of the common standard library and test suite that we agreed
on at the PyCon language summit last week, we're going to include a
common benchmark suite that all Python implementations can share. This
is still some months off, though, so there'll be plenty of time to
bikeshed^Wrationally discuss which benchmarks should go in there.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyDict_SetItem hook

2009-04-03 Thread Collin Winter
On Fri, Apr 3, 2009 at 9:43 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Thomas Wouters thomas at python.org writes:

 Really? Have you tried it? I get at least 5% noise between runs without any
 changes. I have gotten results that include *negative* run times.

 That's an implementation problem, not an issue with the tests themselves.
 Perhaps a better timing mechanism could be inspired from the timeit module.
 Perhaps the default numbers of iterations should be higher (many subtests run
 in less than 100ms on a modern CPU, which might be too low for accurate
 measurement). Perhaps the so-called calibration should just be disabled.
 etc.

 The tests in PyBench are not micro-benchmarks (they do way too much for
 that),

 Then I wonder what you call a micro-benchmark. Should it involve direct calls
 to
 low-level C API functions?

I agree that a suite of microbenchmarks is supremely useful: I would
very much like to be able to isolate, say, raise statement
performance. PyBench suffers from implementation defects that in its
current incarnation make it unsuitable for this, though:
- It does not effectively isolate component performance as it claims.
When I was working on a change to BINARY_MODULO to make string
formatting faster, PyBench would report that floating point math got
slower, or that generator yields got slower. There is a lot of random
noise in the results.
- We have observed overall performance swings of 10-15% between runs
on the same machine, using the same Python binary. Using the same
binary on the same unloaded machine should give as close an answer to
0% as possible.
- I wish PyBench actually did more isolation.
Call.py:ComplexPythonFunctionCalls is on my mind right now; I wish it
didn't put keyword arguments and **kwargs in the same microbenchmark.
- In experimenting with gcc 4.4's FDO support, I produced a training
load that resulted in a 15-30% performance improvement (depending on
benchmark) across all benchmarks. Using this trained binary, PyBench
slowed down by 10%.
- I would like to see PyBench incorporate better statistics for
indicating the significance of the observed performance difference.

I don't believe that these are insurmountable problems, though. A
great contribution to Python performance work would be an improved
version of PyBench that corrects these problems and offers more
precise measurements. Is that something you might be interested in
contributing to? As performance moves more into the wider
consciousness, having good tools will become increasingly important.

Thanks,
Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyDict_SetItem hook

2009-04-03 Thread Collin Winter
On Fri, Apr 3, 2009 at 10:28 AM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
 Collin Winter wrote:
 As part of the common standard library and test suite that we agreed
 on at the PyCon language summit last week, we're going to include a
 common benchmark suite that all Python implementations can share. This
 is still some months off, though, so there'll be plenty of time to
 bikeshed^Wrationally discuss which benchmarks should go in there.


 Where is the right place for us to discuss this common benchmark and test
 suite?

 As the benchmark is developed I would like to ensure it can run on
 IronPython.

 The test suite changes will need some discussion as well - Jython and
 IronPython (and probably PyPy) have almost identical changes to tests that
 currently rely on deterministic finalisation (reference counting) so it
 makes sense to test changes on both platforms and commit a single solution.

I believe Brett Cannon is the best person to talk to about this kind
of thing. I don't know that any common mailing list has been set up,
though there may be and Brett just hasn't told anyone yet :)

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyDict_SetItem hook

2009-04-03 Thread Collin Winter
On Fri, Apr 3, 2009 at 10:50 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Collin Winter collinw at gmail.com writes:

 - I wish PyBench actually did more isolation.
 Call.py:ComplexPythonFunctionCalls is on my mind right now; I wish it
 didn't put keyword arguments and **kwargs in the same microbenchmark.

 Well, there is a balance to be found between having more subtests and keeping 
 a
 reasonable total running time :-)
 (I have to plead guilty for ComplexPythonFunctionCalls, btw)

Sure, there's definitely a balance to maintain. With perf.py, we're
going down the road of having different tiers of benchmarks: the
default set is the one we pay the most attention to, with other
benchmarks available for benchmarking certain specific subsystems or
workloads (like pickling list-heavy input data). Something similar
could be done for PyBench, giving the user the option of increasing
the level of detail (and run-time) as appropriate.

 - I would like to see PyBench incorporate better statistics for
 indicating the significance of the observed performance difference.

 I see you already have this kind of measurement in your perf.py script, would 
 it
 be easy to port it?

Yes, it should be straightforward to incorporate these statistics into
PyBench. In the same directory as perf.py, you'll find test_perf.py
which includes tests for the stats functions we're using.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PyDict_SetItem hook

2009-04-01 Thread Collin Winter
On Wed, Apr 1, 2009 at 4:29 PM, John Ehresman j...@wingware.com wrote:
 I've written a proof of concept patch to add a hook to PyDict_SetItem at
  http://bugs.python.org/issue5654  My motivation is to enable watchpoints in
 a python debugger that are called when an attribute or global changes.  I
 know that this won't cover function locals and objects with slots (as Martin
 pointed out).

 We talked about this at the sprints and a few issues came up:

 * Is this worth it for debugger watchpoint support?  This is a feature that
 probably wouldn't be used regularly but is extremely useful in some
 situations.

 * Would it be better to create a namespace dict subclass of dict, use it for
 modules, classes,  instances, and only allow watches of the subclass
 instances?

 * To what extent should non-debugger code use the hook?  At one end of the
 spectrum, the hook could be made readily available for non-debug use and at
 the other end, it could be documented as being debug only, disabled in
 python -O,  not exposed in the stdlib to python code.

Have you measured the impact on performance?

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 3to2 Project

2009-03-30 Thread Collin Winter
On Mon, Mar 30, 2009 at 7:44 AM, Jesse Noller jnol...@gmail.com wrote:
 During the Language summit this past Thursday, pretty much everyone
 agreed that a python 3 to python 2 tool would be a very large
 improvement in helping developers be able to write pure python 3
 code. The idea being a large project such as Django could completely
 cut over to Python3, but then run the 3to2 tool on the code based to
 continue to support version 2.x.

 I raised my hand to help move this along, I've spoke to Benjamin
 Peterson, and he's amendable to mentoring a GSoC student for this
 project and he's  already received at least one proposal for this.

 Additionally, there's been a number of developers here at PyCon who
 are more than ready to help contribute.

 So, if people are interested in helping, coordinating work/etc - feel
 free to sync up with Benjamin - he's started a wiki page here:

 http://wiki.python.org/moin/3to2

If anyone is interested in working on this during the PyCon sprints or
otherwise, here are some easy, concrete starter projects that would
really help move this along:
- The core refactoring engine needs to be broken out from 2to3. In
particular, the tests/ and fixes/ need to get pulled up a directory,
out of lib2to3/.
- Once that's done, lib2to3 should then be renamed to something like
librefactor or something else that indicates its more general nature.
This will allow both 2to3 and 3to2 to more easily share the core
components.
- If you're more performance-minded, 2to3 and 3to2 would benefit
heavily from some work on the pattern matching system. The current
pattern matcher is a fairly simple AST interpreter; compiling the
patterns down to pure Python code would be a win, I believe. This is
all pretty heavily tested, so you wouldn't run much risk of breaking
it.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] GPython?

2009-03-27 Thread Collin Winter
On Fri, Mar 27, 2009 at 5:50 AM, Paul Moore p.f.mo...@gmail.com wrote:
 2009/3/27 Collin Winter coll...@gmail.com:
 In particular, Windows support is one of those things we'll need to
 address on our end. LLVM's Windows support may be spotty, or there may
 be other Windows issues we inadvertently introduce. None of the three
 of us have Windows machines, nor do we particularly want to acquire
 them :), and Windows support isn't going to be a big priority. If we
 find that some of our patches have Windows issues, we will certainly
 fix those before proposing their inclusion in CPython.

 On the assumption (sorry, I've done little more than read the press
 releases so far) that you're starting from the CPython base and
 incrementally patching things, you currently have strong Windows
 support. It would be a shame if that got gradually chipped away
 through neglect, until it became a big job to reinstate it.

That's correct, we're starting with CPython 2.6.1.

 If the Unladen Swallow team doesn't include any Windows developers,
 you're a bit stuck, I guess, but could you not at least have a Windows
 buildbot which keeps tabs on the current status? Then you might
 encourage interested Windows bystanders to check in occasionally and
 maybe offer fixes.

We're definitely going to set up buildslaves for Windows and other
platforms (currently we're only running Linux buildslaves). We're
trying to solicit 20% time help from Google Windows developers, but
that experience is relatively rare compared to the vast sea of
Linux-focused engineers (though that's true of the open-source
community in general). Also, it may be that some of the components
we're reusing don't support Windows, or perhaps worse, offer degraded
performance on Windows. We believe we can fix these problems as they
come up -- we certainly don't want Windows issues to prevent patches
from going into mainline -- but it's still a risk that Windows issues
may slow down our development or prevent us from doing something fancy
down the road, and I wanted to be up front about that risk.

I've updated our ProjectPlan in hopes of clarifying this. That section
of the docs was copy/pasted off a slide, and was a bit too terse :)

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Partial 2to3?

2009-03-27 Thread Collin Winter
2009/3/27  s...@pobox.com:
 Following up on yesterday's conversation about 2to3 and 3to2, I wonder if
 it's easily possible to run 2to3 with a specific small subset of its fixers?
 For example, people not wanting to make the 2-3 leap yet might still be
 interersted in the exception handling changes (except Foo as exc)?

Sure, that's easily possible: run 2to3 -f
some_fixer,other_fixer,this_fixer,that_fixer. You can get a full list
of fixers using the --list-fixes option.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] GPython?

2009-03-26 Thread Collin Winter
On Thu, Mar 26, 2009 at 8:05 PM, Terry Reedy tjre...@udel.edu wrote:
 An ars technica articla just linked to in a python-list post

 http://arstechnica.com/open-source/news/2009/03/google-launches-project-to-boost-python-performance-by-5x.ars

 calls the following project Google launched
 http://code.google.com/p/unladen-swallow/wiki/ProjectPlan

 (Though the project page does not really claim that.)

Hi, I'm the tech lead for Unladen Swallow. Jeffrey Yasskin and Thomas
Wouters are also working on this project.

Unladen Swallow is Google-sponsored, but not Google-owned. This is an
open-source branch that we're working on, focused on performance, and
we want to move all of our work upstream as quickly as possible. In
fact, right now I'm adding a last few tests before putting our cPickle
patches up on the tracker for further review.

 I am sure some people here might find this interesting.

 I'd love to have a faster CPython, but this note:
 Will probably kill Python Windows support (for now).
 would kill merger back into mainline (for now) without one opposing being
 'conservative'.

To clarify, when I wrote 'conservative', I wasn't being disparaging. A
resistance to change can certainly be a good thing, and something that
I think is very healthy in these situations. We certainly have to
prove ourselves, especially given some of the fairly radical things
we're thinking of [1]. We believe we can justify these changes, but I
*do* want to be forced to justify them publicly; I don't think
python-dev would be doing its job if some of these things were merely
accepted without discussion.

In particular, Windows support is one of those things we'll need to
address on our end. LLVM's Windows support may be spotty, or there may
be other Windows issues we inadvertently introduce. None of the three
of us have Windows machines, nor do we particularly want to acquire
them :), and Windows support isn't going to be a big priority. If we
find that some of our patches have Windows issues, we will certainly
fix those before proposing their inclusion in CPython.

 If one adds type annotations so that values can be unboxed, would not
 Cython, etc, do even better for speedup?

Possibly, but we want to see how far we can push the current language
before we even start thinking of tinkering with the language spec.
Assigning meaning to function annotations is something that PEP 3107
explicitly avoids, and I'm not sure Unladen Swallow (or anyone else)
would want to take the plunge into coming up with broadly-acceptable
type systems for Python. That would be a bikeshed discussion of such
magnitude, you'd have to invent new colors to paint the thing.

Collin Winter

[1] - http://code.google.com/p/unladen-swallow/wiki/ProjectPlan
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] GPython?

2009-03-26 Thread Collin Winter
On Thu, Mar 26, 2009 at 11:26 PM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 On Thu, Mar 26, 2009 at 11:40 PM, Collin Winter coll...@gmail.com wrote:
 In fact, right now I'm adding a last few tests before putting our cPickle
 patches up on the tracker for further review.


 Put me in the nosy list when you do; and when I get some free time, I
 will give your patches a complete review. I've already taken a quick
 look at cPickle changes you did in Unladen and I think some (i.e., the
 custom memo table) are definitely worthy to be merged in the
 mainlines.

Will do, thanks for volunteering! jyasskin has already reviewed them
internally, but it'll be good to put them through another set of eyes.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] speeding up PyObject_GetItem

2009-03-24 Thread Collin Winter
2009/3/24 Daniel Stutzbach dan...@stutzbachenterprises.com:
 On Tue, Mar 24, 2009 at 10:13 AM, Mark Dickinson dicki...@gmail.com wrote:

 2009/3/24 Daniel Stutzbach dan...@stutzbachenterprises.com:
  [...]
  100 nanoseconds, py3k trunk:
  ceval - PyObject_GetItem (object.c) - list_subscript (listobject.c) -
  PyNumber_AsSsize_t (object.c) - PyLong_AsSsize_t (longobject.c)
  [more timings snipped]

 Does removing the PyLong_Check call in PyLong_AsSsize_t
 make any noticeable difference to these timings?

 Making no other changes from the trunk, removing the PyLong_Check and NULL
 check from PyLong_AsSsize_t shaves off 4 nanoseconds (or around 4% since the
 trunk is around 100 nanoseconds).

 Here's what I'm testing with, by the way:

 ./python.exe Lib/timeit.py -r 10 -s 'x = list(range(10))' 'x[5]'

What difference does it make on real applications? Are you running any
macro-benchmarks against this?

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 3.1 performance

2009-03-08 Thread Collin Winter
On Sun, Mar 8, 2009 at 7:30 AM, Christian Heimes li...@cheimes.de wrote:
 Antoine Pitrou wrote:
 Hi,

 Victor Stinner victor.stinner at haypocalc.com writes:
 Summary (minimum total) on 32 bits CPU:
  * Python 2.6.1: 8762 ms
  * Python 3.0.1: 8977 ms
  * Python 3.1a1: 9228 ms (slower than 3.0)

 Have you compiled with or without --with-computed-gotos?

 Why is the feature still disabled by default?

 Christian

 PS: Holy moly! Computed gotos totally put my Python on fire! The feature
 increases the minimum run-time by approx. 25% and the average run-time
 by approx. 40% on my Ubuntu 8.10 box (AMD64, Intel(R) Core(TM)2 CPU
 T7600  @ 2.33GHz).

Note that of the benchmarks tested, PyBench benefits the most from
threaded eval loop designs. Other systems benefit less; for example,
Django template benchmarks were only sped up by 7-8% when I was
testing it.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pickler/Unpickler API clarification

2009-03-06 Thread Collin Winter
On Fri, Mar 6, 2009 at 10:01 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 Antoine Pitrou wrote:
 Le vendredi 06 mars 2009 à 13:44 +0100, Michael Haggerty a écrit :
 Antoine Pitrou wrote:
 Michael Haggerty mhagger at alum.mit.edu writes:
 It is easy to optimize the pickling of instances by giving them
 __getstate__() and __setstate__() methods.  But the pickler still
 records the type of each object (essentially, the name of its class) in
 each record.  The space for these strings constituted a large fraction
 of the database size.
 If these strings are not interned, then perhaps they should be.
 There is a similar optimization proposal (w/ patch) for attribute names:
 http://bugs.python.org/issue5084
 If I understand correctly, this would not help:

 - on writing, the strings are identical anyway, because they are read
 out of the class's __name__ and __module__ fields.  Therefore the
 Pickler's usual memoizing behavior will prevent the strings from being
 written more than once.

 Then why did you say that the space for these strings constituted a
 large fraction of the database size, if they are already shared? Are
 your objects so tiny that even the space taken by the pointer to the
 type name grows the size of the database significantly?

 Sorry for the confusion.  I thought you were suggesting the change to
 help the more typical use case, when a single Pickler is used for a lot
 of data.  That use case will not be helped by interning the class
 __name__ and __module__ strings, for the reasons given in my previous email.

 In my case, the strings are shared via the Pickler memoizing mechanism
 because I pre-populate the memo (using the API that the OP proposes to
 remove), so your suggestion won't help my current code, either.  It was
 before I implemented the pre-populated memoizer that the space for
 these strings constituted a large fraction of the database size.  But
 your suggestion wouldn't help that case, either.

 Here are the main use cases:

 1. Saving and loading one large record.  A class's __name__ string is
 the same string object every time it is retrieved, so it only needs to
 be stored once and the Pickler memo mechanism works.  Similarly for the
 class's __module__ string.

 2. Saving and loading lots of records sequentially.  Provided a single
 Pickler is used for all records and its memo is never cleared, this
 works just as well as case 1.

 3. Saving and loading lots of records in random order, as for example in
 the shelve module.  It is not possible to reuse a Pickler with retained
 memo, because the Unpickler might not encounter objects in the right
 order.  There are two subcases:

   a. Use a clean Pickler/Unpickler object for each record.  In this
 case the __name__ and __module__ of a class will appear once in each
 record in which the class appears.  (This is the case regardless of
 whether they are interned.)  On reading, the __name__ and __module__ are
 only used to look up the class, so interning them won't help.  It is
 thus impossible to avoid wasting a lot of space in the database.

   b. Use a Pickler/Unpickler with a preset memo for each record (my
 unorthodox technique).  In this case the class __name__ and __module__
 will be memoized in the shared memo, so in other records only their ID
 needs to be stored (in fact, only the ID of the class object itself).
 This allows the database to be smaller, but does not have any effect on
 the RAM usage of the loaded objects.

 If the OP's proposal is accepted, 3b will become impossible.  The
 technique seems not to be well known, so maybe it doesn't need to be
 supported.  It would mean some extra work for me on the cvs2svn project
 though :-(

Talking it over with Guido, support for the memo attribute will have
to stay. I shall add it back to my patches.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Pickler/Unpickler API clarification

2009-03-05 Thread Collin Winter
I'm working on some performance patches for cPickle, and one of the
bigger wins so far has been replacing the Pickler's memo dict with a
custom hashtable (and hence removing memo's getters and setters). In
looking over this, Jeffrey Yasskin commented that this would break
anyone who was accessing the memo attribute.

I've found a few examples of code using the memo attribute ([1], [2],
[3]), and there are probably more out there, but the memo attribute
doesn't look like part of the API to me. It's only documented in
http://docs.python.org/library/pickle.html as you used to need this
before Python 2.3, but don't anymore. However: I don't believe you
should ever need this attribute.

The usages of memo I've seen break down into two camps: clearing the
memo, and wanting to explicitly populate the memo with predefined
values. Clearing the memo is recommended as part of reusing Pickler
objects, but I can't fathom when you would want to reuse a Pickler
*without* clearing the memo. Reusing the Pickler without clearing the
memo will produce pickles that are, as best I can see, invalid -- at
least, pickletools.dis() rejects this, which is the closest thing we
have to a validator. Explicitly setting memo values has the same
problem: an easy, very brittle way to produce invalid data.

So my questions are these:
1) Should Pickler/Unpickler objects automatically clear their memos
when dumping/loading?
2) Is memo an intentionally exposed, supported part of the
Pickler/Unpickler API, despite the lack of documentation and tests?

Thanks,
Collin

[1] - 
http://google.com/codesearch/p?hl=en#Qx8E-7HUBTk/trunk/google/appengine/api/memcache/__init__.pyq=lang:py%20%5C.memo
[2] - 
http://google.com/codesearch/p?hl=en#M-DDI-lCOgE/lib/python2.4/site-packages/cvs2svn_lib/primed_pickle.pyq=lang:py%20%5C.memo
[3] - 
http://google.com/codesearch/p?hl=en#l_w_cA4dKMY/AtlasAnalysis/2.0.3-LST-1/PhysicsAnalysis/PyAnalysis/PyAnalysisUtils/python/root_pickle.pyq=lang:py%20pick.*%5C.memo%5Cb
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A suggestion: Do proto-PEPs in Google Docs

2009-02-19 Thread Collin Winter
On Thu, Feb 19, 2009 at 7:17 PM, Stephen J. Turnbull
turnb...@sk.tsukuba.ac.jp wrote:
 On Python-Ideas, Guido van Rossum writes:

   On Thu, Feb 19, 2009 at 2:12 AM, Greg Ewing wrote:

Fifth draft of the PEP. Re-worded a few things slightly
to hopefully make the proposal a bit clearer up front.
  
   Wow, how I long for the days when we routinely put things like this
   under revision control so its easy to compare versions.

 FWIW, Google Docs is almost there.  Working with Brett et al on early
 drafts of PEP 0374 was easy and pleasant, and Google Docs gives
 control of access to the document to the editor, not the Subversion
 admin.  The ability to make comments that are not visible to
 non-editors was nice.  Now that it's in Subversion it's much less
 convenient for me (a non-committer).  I actually have to *decide* to
 work on it, rather than simply raising a browser window, hitting
 refresh and fixing a typo or two (then back to day job work).

 The main problem with Google Docs is that is records a revision
 automatically every so often (good) but doesn't prune the automatic
 commits (possibly hard to do efficiently) OR mark user saves specially
 (easy to do).  This lack of marking important revisions makes the
 diff functionality kind of tedious.

 I don't know how automatic the conversion to reST was, but the PEP in
 Subversion is a quite accurate conversion of the Google Doc version.

 Overall, I recommend use of Google Docs for Python-Ideas level of
 PEP drafts.

Rietveld would also be a good option: it offers more at-will revision
control (rather than whenever Google Docs decides), allows you to
attach comments to the revisions, and will give you nice diffs between
PEP iterations.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] IO implementation: in C and Python?

2009-02-19 Thread Collin Winter
On Thu, Feb 19, 2009 at 9:07 PM, Guido van Rossum gu...@python.org wrote:
 On Thu, Feb 19, 2009 at 8:38 PM, Brett Cannon br...@python.org wrote:
 On Thu, Feb 19, 2009 at 19:41, Benjamin Peterson benja...@python.org
 wrote:
 As we prepare to merge the io-c branch, the question has come up [1]
 about the original Python implementation. Should it just be deleted in
 favor C version? The wish to maintain the two implementations together
 has been raised on the basis that Python is easier to experiment on
 and read (for other vm implementors).

 Probably not a surprise, but +1 from me for keeping the pure Python version
 around for the benefit of other VMs as well as a reference implementation.

 You have been practice channeling me again, haven't you? I like the
 idea of having two (closely matching) implementations very much.

Agreed. In particular, this helps any projects that are focused on
improving the performance of pure-Python code: they can work on
minimizing the delta between the Python and C versions.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Collin Winter
On Tue, Feb 3, 2009 at 5:44 AM, Ben North b...@redfrontdoor.org wrote:
 Hi,

 Thanks for the further responses.  Again, I'll try to summarise:

 Scott David Daniels pointed out an awkward interaction when chaining
 partial applications, such that it could become very unclear what was
 going to happen when the final function is called:

 If you have:
 def button(root, position, action=None, text='*', color=None):
 ...
 ...
 blue_button = partial(button, my_root, color=(0,0,1))

 Should partial_right(blue_button, 'red') change the color or the text?

 Calvin Spealman mentioned a previous patch of his which took the 'hole'
 approach, i.e.:

 [...] my partial.skip patch, which allows the following usage:

split_one = partial(str.split, partial.skip, 1)

 This would solve my original problems, and, continuing Scott's example,

   def on_clicked(...): ...

   _ = partial.skip
   clickable_blue_button = partial(blue_button, _, on_clicked)

 has a clear enough meaning I think:

   clickable_blue_button('top-left corner')
   = blue_button('top-left corner', on_clicked)
   = button(my_root, 'top-left corner', on_clicked, color=(0,0,1))

 Calvin's idea/patch sounds good to me, then.  Others also liked it.
 Could it be re-considered, instead of the partial_right idea?

Have any of the original objections to Calvin's patch
(http://bugs.python.org/issue1706256) been addressed? If not, I don't
see anything in these threads that justify resurrecting it.

I still haven't seen any real code presented that would benefit from
partial.skip or partial_right.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Collin Winter
On Tue, Feb 3, 2009 at 11:53 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Collin Winter collinw at gmail.com writes:

 Have any of the original objections to Calvin's patch
 (http://bugs.python.org/issue1706256) been addressed? If not, I don't
 see anything in these threads that justify resurrecting it.

 I still haven't seen any real code presented that would benefit from
 partial.skip or partial_right.

 The arguments for and against the patch could be brought against partial()
 itself, so I don't understand the -1's at all.

Quite so, but that doesn't justify adding more capabilities to partial().

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Partial function application 'from the right'

2009-01-29 Thread Collin Winter
On Thu, Jan 29, 2009 at 6:12 AM, Ben North b...@redfrontdoor.org wrote:
 Hi,

 I find 'functools.partial' useful, but occasionally I'm unable to use it
 because it lacks a 'from the right' version.  E.g., to create a function
 which splits a string on commas, you can't say

   # Won't work when called:
   split_comma = partial(str.split, sep = ',')
[snip]
 I've created a patch which adds a 'partial_right' function.  The two
 examples above:

import functools, math

split_comma = functools.partial_right(str.split, ',')
split_comma('a,b,c')
   ['a', 'b', 'c']

log_10 = functools.partial_right(math.log, 10.0)
log_10(100.0)
   2.0

Can you point to real code that this makes more readable?

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Decoder functions accept str in py3k

2009-01-07 Thread Collin Winter
On Wed, Jan 7, 2009 at 2:35 PM, Brett Cannon br...@python.org wrote:
 On Wed, Jan 7, 2009 at 10:57, M.-A. Lemburg m...@egenix.com wrote:
 [SNIP]
 BTW: The _codecsmodule.c file is a 4 spaces indent file as well (just
 like all Unicode support source files). Someone apparently has added
 tabs when adding support for Py_buffers.


 It looks like this formatting mix-up is just going to get worse for
 the next few years while the 2.x series is still being worked on.
 Should we just bite the bullet and start adding modelines for Vim and
 Emacs to .c/.h files that are written in the old 2.x style? For Vim I
 can then update the vimrc in Misc/Vim to then have 4-space indent be
 the default for C files.

Or better yet, really bite the bullet and just reindent everything to
spaces. Not every one uses vim or emacs, nor do all tools understand
their modelines. FYI, there are options to svn blame and git to skip
whitespace-only changes.

Just-spent-an-hour-fixing-screwed-up-indents-in-changes-to-Python/*.c-ly,
Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread Collin Winter
On Tue, Jul 15, 2008 at 6:03 PM, Michael Foord
[EMAIL PROTECTED] wrote:
 Collin Winter wrote:

 On Tue, Jul 15, 2008 at 6:58 AM, Ben Finney [EMAIL PROTECTED]
 wrote:

 Backwards Compatibility
 ===

 The names to be obsoleted should be deprecated and removed according
 to the schedule for modules in PEP 4 [#PEP-4]_.

 While deprecated, use of the deprecated attributes should raise a
 ``DeprecationWarning``, with a message stating which replacement name
 should be used.


 Is any provision being made for a 2to3 fixer/otherwise-automated
 transition for the changes you propose here?


 As the deprecation is intended for 2.X and 3.X - is 2to3 fixer needed?

IMO some kind of automated transition tool is needed -- anyone who has
the time to convert their codebase by hand (for some definition of by
hand that involves sed) doesn't have enough to do.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-16 Thread Collin Winter
On Wed, Jul 16, 2008 at 5:21 AM, Michael Foord
[EMAIL PROTECTED] wrote:
 Terry Reedy wrote:


 Michael Foord wrote:

 Collin Winter wrote:

 Is any provision being made for a 2to3 fixer/otherwise-automated
 transition for the changes you propose here?


 As the deprecation is intended for 2.X and 3.X - is 2to3 fixer needed?

 A fixer will only be needed when it actually is needed, but when it is, it
 should be a unittest-name fixer since previous 3.x code will also need
 fixing.  Since the duplicates are multiples names for the same objects, the
 fixer should be a trivial name substitution.

 Can 2to3 fixers be used for 2to2 and 3to3 translation then?

The intention is for the infrastructure behind 2to3 to be generally
reusable for other Python source-to-source translation tools, be that
2to2 or 3to3. That hasn't fully materialized yet, but it's getting
there.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Unittest PEP do's and don'ts (BDFL pronouncement)

2008-07-16 Thread Collin Winter
On Wed, Jul 16, 2008 at 2:03 PM, Raymond Hettinger [EMAIL PROTECTED] wrote:
 If some people want to proceed down the path of useful additions,
 I challenge them to think bigger.  Give me some test methods that
 improve my life.  Don't give me thirty ways to spell something I can
 already do.

 From: Michael Foord [EMAIL PROTECTED]

 I assert that... the following changes do meet those conditions:

 assertRaisesWithMessage

 . . .

 Changes to assertEquals so that the failure messages are more useful

 ...

 assertIn / assertNotIn I use very regularly for collection membership

 - self.assert_(func(x) in result_set)
 + self.assertIn(func(x), result_set)

 Yawn.  The gain is zero.  Actually, it's negative because the second
 doesn't read as nicely as the pure python expression.

It's only negative if the method doesn't do anything special. For
example, an assertListEqual() method can tell you *how* the lists
differ, which the pure Python expression can't -- all the Python
expression can say is yes or no. We have methods like this at work
and they're very useful.

That said, I see no reason why these things have to be methods. The
self. method boilerplate is cluttering line-noise in this case.  I can
easily imagine a module of nothing but comparison functions.

Collin Winter

 Think bigger!  No fat APIs.  Do something cool!  Checkout the
 dynamic test creation in test_decimal to see if it can be generalized.
 Give me some cool test runners.  Maybe find a way to automatically
 launch pdb or to dump the locals variables at the time of failure.
 Maybe move the test_*.py search into the unittest module.

 We want *small* and powerful.  The api for TestCase instances is
 already way too fat.  See an old discussion on the subject at:
  http://bugs.python.org/issue2578


 The run_tests function for running collections of tests. Almost every
 project I've worked on has had an ad-hoc imnplementation of this, collecting
 test modules and turning them into a suitable collection for use with
 unittest.

 Now, that's more like it.Propose more cool stuff like this and
 the module really will be improved.


 assertIs / assertIsNot also sounds good, but is not something I would miss
 if they weren't added.

 Doh!  We're back to replacing clean expressions using pure python syntax
 with a method name equivalent.  That's a step backwards.



 Raymond
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/collinw%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Consolidating names in the `unittest` module

2008-07-15 Thread Collin Winter
(…)``
  Replaces ``countTestCases(…)``

 ``TestLoader`` attributes
 ~

 ``sort_test_methods_using``
  Replaces ``sortTestMethodsUsing``

 ``suite_class``
  Replaces ``suiteClass``

 ``test_method_prefix``
  Replaces ``testMethodPrefix``

 ``get_test_case_names(self, test_case_class)``
  Replaces ``getTestCaseNames(self, testCaseClass)``

 ``load_tests_from_module(…)``
  Replaces ``loadTestsFromModule(…)``

 ``load_tests_from_name(…)``
  Replaces ``loadTestsFromName(…)``

 ``load_tests_from_names(…)``
  Replaces ``loadTestsFromNames(…)``

 ``load_tests_from_test_case(self, test_case_class)``
  Replaces ``loadTestsFromTestCase(self, testCaseClass)``

 ``_TextTestResult`` attributes
 ~~

 ``show_all``
  Replaces ``showAll``

 ``add_error(…)``
  Replaces ``addError(…)``

 ``add_failure(…)``
  Replaces ``addFailure(…)``

 ``add_success(…)``
  Replaces ``addSuccess(…)``

 ``get_description(…)``
  Replaces ``getDescription(…)``

 ``print_error_list(…)``
  Replaces ``printErrorList(…)``

 ``print_errors(…)``
  Replaces ``printErrors(…)``

 ``start_test(…)``
  Replaces ``startTest(…)``

 ``TextTestRunner`` attributes
 ~

 ``_make_result(…)``
  Replaces ``_makeResult(…)``

 ``TestProgram`` attributes
 ~~

 ``__init__(self, module, default_test, argv, test_runner, test_loader)``
  Replaces ``__init__(self, module, defaultTest, argv, testRunner, 
 testLoader)``

 ``create_tests(…)``
  Replaces ``createTests(…)``

 ``parse_args(…)``
  Replaces ``parseArgs(…)``

 ``run_tests(…)``
  Replaces ``runTests(…)``

 ``usage_exit(…)``
  Replaces ``usageExit(…)``


 Rationale
 =

 Redundant names
 ---

 The current API, with two or in some cases three different names
 referencing exactly the same function, leads to an overbroad and
 redundant API that violates PEP 20 [#PEP-20]_ (there should be one,
 and preferably only one, obvious way to do it).

 Removal of ``assert*`` names
 

 While there is consensus support to `remove redundant names`_ for the
 ``TestCase`` test methods, the issue of which set of names should be
 retained is controversial.

 Arguments in favour of retaining only the ``assert*`` names:

 * BDFL preference: The BDFL has stated [#vanrossum-1]_ a preference
  for the ``assert*`` names.

 * Precedent: The Python standard library currently uses the
  ``assert*`` names by a roughly 8:1 majority over the ``fail*``
  names. (Counting unit tests in the py3k tree at 2008-07-15
  [#pitrou-1]_.)

  An ad-hoc sampling of other projects that use `unittest` also
  demonstrates strong preference for use of the ``assert*`` names
  [#bennetts-1]_.

 * Positive admonition: The ``assert*`` names state the intent of how
  the code under test *should* behave, while the ``fail*`` names are
  phrased in terms of how the code *should not* behave.

 Arguments in favour of retaining only the ``fail*`` names:

 * Explicit is better than implicit: The ``fail*`` names state *what
  the function will do* explicitly: fail the test. With the
  ``assert*`` names, the action to be taken is only implicit.

 * Avoid false implication: The test methods do not have any necessary
  connection with the built-in ``assert`` statement. Even the
  exception raised, while it defaults to ``AssertionException``, is
  explicitly customisable via the documented ``failure_exception``
  attribute. Choosing the ``fail*`` names avoids the false association
  with either of these.

  This is exacerbated by the plain-boolean test using a name of
  ``assert_`` (with a trailing underscore) to avoid a name collision
  with the built-in ``assert`` statement. The corresponding
  ``fail_if`` name has no such issue.

 PEP 8 names
 ---

 Although `unittest` (and its predecessor `PyUnit`) are intended to be
 familiar to users of other xUnit interfaces, there is no attempt at
 direct API compatibility since the only code that Python's `unittest`
 interfaces with is other Python code. The module is in the standard
 library and its names should all conform with PEP 8 [#PEP-8]_.


 Backwards Compatibility
 ===

 The names to be obsoleted should be deprecated and removed according
 to the schedule for modules in PEP 4 [#PEP-4]_.

 While deprecated, use of the deprecated attributes should raise a
 ``DeprecationWarning``, with a message stating which replacement name
 should be used.

Is any provision being made for a 2to3 fixer/otherwise-automated
transition for the changes you propose here?

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?

2008-07-02 Thread Collin Winter
On Tue, Jul 1, 2008 at 7:38 PM, Benjamin Peterson
[EMAIL PROTECTED] wrote:
 On Tue, Jul 1, 2008 at 9:04 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 I just committed r64651 which is my attempt to add support to
 fix_imports so that modules that have been split up in 3.0 can be
 properly fixed. 2to3's test suite passes and all, but I am not sure if
 I botched it somehow since I did the change slightly blind. Can
 someone just do a quick check to make sure I did it properly? Also,
 what order should renames be declared to give priority to certain
 renames (e.g., urllib should probably be renamed to urllib.requeste
 over urllib.error when not used in a ``from ... import`` statement).

 Well for starters, you know the test for fix_imports is disabled, right?

Why was this test disabled, rather than fixed? That seems a rather
poor solution to the problem of it taking longer than desired to run.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?

2008-07-02 Thread Collin Winter
On Tue, Jul 1, 2008 at 11:32 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 On Tue, Jul 1, 2008 at 8:36 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 On Tue, Jul 1, 2008 at 7:38 PM, Benjamin Peterson
 [EMAIL PROTECTED] wrote:
 On Tue, Jul 1, 2008 at 9:04 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 I just committed r64651 which is my attempt to add support to
 fix_imports so that modules that have been split up in 3.0 can be
 properly fixed. 2to3's test suite passes and all, but I am not sure if
 I botched it somehow since I did the change slightly blind. Can
 someone just do a quick check to make sure I did it properly? Also,
 what order should renames be declared to give priority to certain
 renames (e.g., urllib should probably be renamed to urllib.requeste
 over urllib.error when not used in a ``from ... import`` statement).

 Well for starters, you know the test for fix_imports is disabled, right?


 Nope, I forgot and turning it on has it failing running under 2.5.


 And refactor.py cannot be run directly from 2.5 because of a relative
 import and in 2.6 (where runpy has extra smarts) it still doesn't work
 thanks to main() not being passed an argument is needs (Issue3131).

Why are you trying to run refactor.py directly, rather than using 2to3
(http://svn.python.org/view/sandbox/trunk/2to3/2to3) as an entry
point?

 Looks like 2to3 needs some TLC.

Agreed. A lot of the pending bugs seem to be related to the version of
lib2to3 in the stdlib, rather than the stand-alone product. Neal
Norwitz and I have been working to turn parts of 2to3 into a more
general refactoring library; once that's done (or even preferably
before), lib2to3 should be removed from the stdlib. It's causing far
more trouble than it's worth.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?

2008-07-02 Thread Collin Winter
On Wed, Jul 2, 2008 at 12:51 PM, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Why was this test disabled, rather than fixed? That seems a rather
 poor solution to the problem of it taking longer than desired to run.

 I disabled it because I didn't know how to fix it, and created bug
 reports 2968 and 2969 in return.

So you did. I didn't notice them, sorry.

 It is policy that tests that break
 get disabled, rather than keeping them broken.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can someone check my lib2to3 change for fix_imports?

2008-07-02 Thread Collin Winter
On Wed, Jul 2, 2008 at 1:09 PM, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Agreed. A lot of the pending bugs seem to be related to the version of
 lib2to3 in the stdlib, rather than the stand-alone product. Neal
 Norwitz and I have been working to turn parts of 2to3 into a more
 general refactoring library; once that's done (or even preferably
 before), lib2to3 should be removed from the stdlib. It's causing far
 more trouble than it's worth.

 I disagree. I think it is quite useful that distutils is able to
 invoke it, and other people also asked for that feature on PyCon.

But distutils currently *doesn't* invoke it, AFAICT (unless that
support is implemented somewhere other than trunk/Lib/distutils/), and
no-one has stepped up to make that happen in the months since PyCon.
Moreover, as I told those people who asked for this at PyCon, 2to3 is
and will never be perfect, meaning that at best, distutils/2to3
integration would look like python setup.py run2to3, where distutils
is just a long-hand way of running 2to3 over your code.

This strikes me as a waste of time.

 Why do you think the trouble wouldn't be caused if it wasn't
 a standard library package?

Problems with the current setup:
1) People are currently confused as to where they should be commit fixes.
2) Changes to the sandbox version have to be manually merged into the
stdlib version, which is more overhead than I think it's worth. In
addition, the stdlib version lags the sandbox version.
3) At least one bug report (issue3131) has mentioned problems with the
stdlib 2to3 exhibiting problems that the stand-alone version does not.
This is again extra overhead.
4) The test_imports test was commented out because of stdlib test
policy. I'd rather not have that policy imposed on 2to3.

Collin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] lib2to3, need some light on the imports fixer

2008-05-12 Thread Collin Winter
On Mon, May 12, 2008 at 1:58 PM, Guilherme Polo [EMAIL PROTECTED] wrote:
 Hello,

  Would someone tell me how can I add a new entry in the MAPPING dict in
  the lib2to3/fixes/fix_imports.py that does the following:

  import A gets fixed as import C.D as A

  Right now it is fixing by doing import C.D and changing several
  other lines in the code to use this new C.D name. I wanted to avoid
  these changes if possible.

I don't believe there's a way to do that, but adding support for it
should be fairly straight-forward. Assign the patch to me for review.

Collin Winter
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


  1   2   >