date:20110830

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Greg Ewing


Nick Coghlan wrote:


Personally, I *like* CPython fitting into the simple-and-portable
niche in the Python interpreter space.


Me, too! I like that I can read the CPython source and
understand what it's doing most of the time. Please don't
screw that up by attempting to perform heroic optimisations.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 393 review

2011-08-30 Thread Martin v. Löwis

 I don't compare ASCII and ISO-8859-1 decoders. I was asking if decoding
 b'abc' from ISO-8859-1 is faster than decoding b'ab\xff' from ISO-8859-1,
 and if yes: why?

No, that makes no difference.

 
 Your patch replaces PyUnicode_New(size, 255) ... memcpy() by
 PyUnicode_FromUCS1().

You compared to the wrong revision. PyUnicode_New is already a PEP 393
function, and this version you have been comparing to is indeed faster
than the current version. However, it is also incorrect, as it fails
to compute the maxchar, and hence fails to detect pure-ASCII strings.

See below for the actual diff. It should be obvious why the 393 version
is faster: 3.3 currently needs to widen each char (to 16 or 32 bits).
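
To illustrate the maxchar point: under PEP 393 the widest code point in a string decides whether it is stored with 1, 2 or 4 bytes per character, so a decoder that skips this scan cannot tell pure-ASCII input from general Latin-1. A minimal Python sketch of that selection rule follows. The name storage_kind is hypothetical and exists only for illustration; the real logic is C code inside CPython.

```python
def storage_kind(text):
    """Return (maxchar, bytes_per_char, is_ascii) for a string.

    Hypothetical helper for illustration only; CPython does this in C.
    """
    maxchar = max(map(ord, text), default=0)
    if maxchar < 128:
        return maxchar, 1, True    # pure ASCII, 1 byte per character
    if maxchar < 256:
        return maxchar, 1, False   # Latin-1, still 1 byte per character
    if maxchar < 65536:
        return maxchar, 2, False   # fits in UCS-2
    return maxchar, 4, False       # needs UCS-4


if __name__ == "__main__":
    for sample in ("abc", "ab\xff", "ab\u20ac", "ab\U0001f600"):
        print(ascii(sample), storage_kind(sample))
```

Both b'abc' and b'ab\xff' decode to 1-byte storage here, which is consistent with the answer that the two inputs make no difference to decoding speed.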

Regards,
Martin

@@ -5569,41 +5569,8 @@
   Py_ssize_t size,
   const char *errors)
 {
-    PyUnicodeObject *v;
-    Py_UNICODE *p;
-    const char *e, *unrolled_end;
-
     /* Latin-1 is equivalent to the first 256 ordinals in Unicode. */
-    if (size == 1) {
-        Py_UNICODE r = *(unsigned char*)s;
-        return PyUnicode_FromUnicode(&r, 1);
-    }
-
-    v = _PyUnicode_New(size);
-    if (v == NULL)
-        goto onError;
-    if (size == 0)
-        return (PyObject *)v;
-    p = PyUnicode_AS_UNICODE(v);
-    e = s + size;
-    /* Unrolling the copy makes it much faster by reducing the looping
-       overhead. This is similar to what many memcpy() implementations do. */
-    unrolled_end = e - 4;
-    while (s < unrolled_end) {
-        p[0] = (unsigned char) s[0];
-        p[1] = (unsigned char) s[1];
-        p[2] = (unsigned char) s[2];
-        p[3] = (unsigned char) s[3];
-        s += 4;
-        p += 4;
-    }
-    while (s < e)
-        *p++ = (unsigned char) *s++;
-    return (PyObject *)v;
-
-  onError:
-    Py_XDECREF(v);
-    return NULL;
+    return PyUnicode_FromUCS1((unsigned char*)s, size);
 }

 /* create or adjust a UnicodeEncodeError */

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Eli Bendersky

On Tue, Aug 30, 2011 at 08:57, Greg Ewing greg.ew...@canterbury.ac.nz wrote:

 Nick Coghlan wrote:

  Personally, I *like* CPython fitting into the simple-and-portable
 niche in the Python interpreter space.


 Me, too! I like that I can read the CPython source and
 understand what it's doing most of the time. Please don't
 screw that up by attempting to perform heroic optimisations.

 --


Following this argument to the extreme, the bytecode evaluation code of
CPython can be simplified quite a bit. Lose 2x performance but gain a lot of
readability. Does that sound like a good deal? I don't intend to sound
sarcastic, just show that IMHO this argument isn't a good one. I think that
even clever optimized code can be properly written and *documented* to make
the task of understanding it feasible. Personally, I'd love CPython to be a
bit faster and see no reason to give up optimization opportunities for the
sake of code readability.

Eli

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Nick Coghlan

On Tue, Aug 30, 2011 at 4:22 PM, Eli Bendersky eli...@gmail.com wrote:
 On Tue, Aug 30, 2011 at 08:57, Greg Ewing greg.ew...@canterbury.ac.nz
 wrote:
 Following this argument to the extreme, the bytecode evaluation code of
 CPython can be simplified quite a bit. Lose 2x performance but gain a lot of
 readability. Does that sound like a good deal? I don't intend to sound
 sarcastic, just show that IMHO this argument isn't a good one. I think that
 even clever optimized code can be properly written and *documented* to make
 the task of understanding it feasible. Personally, I'd love CPython to be a
 bit faster and see no reason to give up optimization opportunities for the
 sake of code readability.

Yeah, it's definitely a trade-off - the point I was trying to make is
that there *is* a trade-off being made between complexity and speed.

I think the computed-gotos stuff struck a nice balance - the macro-fu
involved means that you can still understand what the main eval loop
is *doing*, even if you don't know exactly what's hidden behind the
target macros. Ditto for the older opcode prediction feature and the
peephole optimiser - separation of concerns means that you can
understand the overall flow of events without needing to understand
every little detail.

This is where the request to extract individual orthogonal changes and
submit separate patches comes from - it makes it clear that the
independent changes *can* be separated cleanly, and aren't a giant
ball of incomprehensible mud. It's the difference between complex
(lots of moving parts, that can each be understood on their own and
are then composed into a meaningful whole) and complicated (massive
patches that don't work at all if any one component is delayed).

Eugene Toder's AST optimiser work that I still hope to get into 3.3
will have to undergo a similar process - the current patch covers a
bit too much ground and needs to be broken up into smaller steps
before we can seriously consider pushing it into the core.

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Python-Dev] PEP 393 review

2011-08-30 Thread Martin v. Löwis

 This looks very nice. Is 3.3 a wide build? (how about a narrow build?)

It's a wide build. For reference, I also attach 64-bit narrow build
results, and 32-bit results (wide, narrow, and PEP 393). Savings are
much smaller in narrow builds (larger on 32-bit systems than on
64-bit systems).

 (is it with your own port of Django to py3k, or is there an official
 branch for it?)

It's https://bitbucket.org/loewis/django-3k

Regards,
Martin
3.3.0a0 (default:45b63a8a76c9, Aug 30 2011, 09:30:21) 
[GCC 4.6.1]
Strings: 36070
Chars: 1306295
Bytes: 3694690
Other objects: 1085060

By Length (length: numstrings)
Up to 4: 5709
Up to 8: 8989
Up to 16: 11654
Up to 32: 4184
Up to 64: 2360
Up to 128: 1422
Up to 256: 828
Up to 512: 558
Up to 1024: 233
Up to 2048: 104
Up to 4096: 23
Up to 8192: 5
Up to 16384: 0
Up to 32768: 1

By Size (size: numstrings)
Up to 40: 7913
Up to 80: 21782
Up to 160: 3272
Up to 320: 1506
Up to 640: 847
Up to 1280: 482
Up to 2560: 183
Up to 5120: 65
Up to 10240: 18
Up to 20480: 1
Up to 40960: 1
3.3.0a0 (pep-393:6ffa3b569228, Aug 29 2011, 22:00:31) 
[GCC 4.6.1 20110526 (prerelease)]
Strings: 36091
Chars: 1304098
Bytes: 4417522
Other objects: 1866616

By Length (length: numstrings)
Up to 4: 5728
Up to 8: 8997
Up to 16: 11658
Up to 32: 4239
Up to 64: 2335
Up to 128: 1382
Up to 256: 828
Up to 512: 558
Up to 1024: 233
Up to 2048: 104
Up to 4096: 23
Up to 8192: 5
Up to 16384: 0
Up to 32768: 1

By Size (size: numstrings)
Up to 40: 0
Up to 80: 0
Up to 160: 33247
Up to 320: 1500
Up to 640: 1007
Up to 1280: 226
Up to 2560: 86
Up to 5120: 21
Up to 10240: 3
Up to 20480: 1
3.3.0a0 (default:45b63a8a76c9, Aug 30 2011, 07:51:15) 
[GCC 4.6.1 20110526 (prerelease)]
Strings: 36428
Chars: 1318840
Bytes: 4750504
Other objects: 1954760

By Length (length: numstrings)
Up to 4: 5732
Up to 8: 9036
Up to 16: 11797
Up to 32: 4263
Up to 64: 2378
Up to 128: 1453
Up to 256: 839
Up to 512: 561
Up to 1024: 236
Up to 2048: 104
Up to 4096: 23
Up to 8192: 5
Up to 16384: 0
Up to 32768: 1

By Size (size: numstrings)
Up to 40: 0
Up to 80: 20608
Up to 160: 11763
Up to 320: 2326
Up to 640: 932
Up to 1280: 521
Up to 2560: 187
Up to 5120: 71
Up to 10240: 18
Up to 20480: 1
Up to 40960: 1
3.3.0a0 (pep-393:e227d65c9e53, Aug 30 2011, 09:52:42) 
[GCC 4.6.1]
Strings: 36091
Chars: 1305472
Bytes: 2976880
Other objects: 1061876

By Length (length: numstrings)
Up to 4: 5728
Up to 8: 8991
Up to 16: 11657
Up to 32: 4200
Up to 64: 2351
Up to 128: 1412
Up to 256: 828
Up to 512: 558
Up to 1024: 233
Up to 2048: 104
Up to 4096: 23
Up to 8192: 5
Up to 16384: 0
Up to 32768: 1

By Size (size: numstrings)
Up to 40: 0
Up to 80: 30774
Up to 160: 3047
Up to 320: 1118
Up to 640: 847
Up to 1280: 198
Up to 2560: 83
Up to 5120: 21
Up to 10240: 2
Up to 20480: 1
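
The script that produced these figures is not shown in the thread. As a rough sketch of how such a report could be assembled, the following buckets strings into doubling bins by length and by in-memory size. The names bucket and string_stats are hypothetical, and the starting bounds of 4 and 40 are taken from the first rows of the tables above; sys.getsizeof is assumed to be a reasonable stand-in for the "Bytes" measure.

```python
import sys
from collections import Counter


def bucket(value, first):
    """Smallest bound in first, 2*first, 4*first, ... that is >= value."""
    bound = first
    while bound < value:
        bound *= 2
    return bound


def string_stats(strings):
    """Summarise strings by length and by in-memory size (hypothetical)."""
    strings = list(strings)
    by_length = Counter(bucket(len(s), 4) for s in strings)
    by_size = Counter(bucket(sys.getsizeof(s), 40) for s in strings)
    return {
        "strings": len(strings),
        "chars": sum(len(s) for s in strings),
        "bytes": sum(sys.getsizeof(s) for s in strings),
        "by_length": dict(sorted(by_length.items())),
        "by_size": dict(sorted(by_size.items())),
    }


if __name__ == "__main__":
    report = string_stats(["a", "abcde", "x" * 20])
    for bound, count in report["by_length"].items():
        print("Up to %d: %d" % (bound, count))
```

Collecting the strings themselves (for example by walking live objects after a Django request) is left out, since the thread does not say how that was done.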

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Stefan Behnel


Nick Coghlan, 30.08.2011 02:00:

On Tue, Aug 30, 2011 at 7:14 AM, Antoine Pitrou wrote:

On Mon, 29 Aug 2011 11:33:14 -0700 stefan brunthaler wrote:

* The optimized dispatch routine has a changed instruction format
(word-sized instead of bytecodes) that allows for regular instruction
decoding (without the HAS_ARG-check) and inlining of some objects in
the instruction format on 64bit architectures.


Having a word-sized bytecode format would probably be acceptable in
itself, so if you want to submit a patch for that, go ahead.


Although any such patch should discuss how it compares with Cesare's
work on wpython.

Personally, I *like* CPython fitting into the simple-and-portable
niche in the Python interpreter space. Armin Rigo made the judgment
years ago that CPython was a poor platform for serious optimisation
when he stopped working on Psyco and started PyPy instead, and I think
the contrasting fates of PyPy and Unladen Swallow have borne out that
opinion. Significantly increasing the complexity of CPython for
speed-ups that are dwarfed by those available through PyPy seems like
a poor trade-off to me.


If Stefan can cut down his changes into smaller feature chunks, thus making 
their benefit reproducible and verifiable by others, it's well worth 
reconsidering if even a visible increase of complexity isn't worth the 
improved performance, one patch at a time. Even if PyPy's performance tops 
the improvements, it's worth remembering that that's also a very different 
kind of system than CPython, with different resource requirements and a 
different level of maturity, compatibility, portability, etc. There are 
many reasons to continue using CPython, not only in corners, and there are 
many people who would be happy about a faster CPython. Raising the bar has 
its virtues.


That being said, I also second Nick's reference to wpython. If CPython 
grows its byte code size anyway (which, as I understand, is one part of the 
proposed changes), it's worth looking at wpython first, given that it has 
been around and working for a while. The other proposed changes sound like 
at least some of them are independent from this one.


Stefan


Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Mark Shannon


Nick Coghlan wrote:

On Tue, Aug 30, 2011 at 7:14 AM, Antoine Pitrou solip...@pitrou.net wrote:

On Mon, 29 Aug 2011 11:33:14 -0700
stefan brunthaler s.bruntha...@uci.edu wrote:

* The optimized dispatch routine has a changed instruction format
(word-sized instead of bytecodes) that allows for regular instruction
decoding (without the HAS_ARG-check) and inlining of some objects in
the instruction format on 64bit architectures.

Having a word-sized bytecode format would probably be acceptable in
itself, so if you want to submit a patch for that, go ahead.


Although any such patch should discuss how it compares with Cesare's
work on wpython.

Personally, I *like* CPython fitting into the simple-and-portable
niche in the Python interpreter space.


CPython has a large number of micro-optimisations, scattered all over
the code base. By removing these and adding large-scale optimisations,
like Stefan's, the code base *might* actually get smaller overall (and
thus simpler) *and* faster.

Of course, CPython must remain portable.

[snip]


At a bare minimum, I don't think any significant changes should be
made under the "it will be faster" justification until the bulk of the
real-world benchmark suite used for speed.pypy.org is available for
Python 3. (Wasn't there a GSoC project about that?)


+1

Cheers,
Mark.


Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Mark Shannon


Martin v. Löwis wrote:

So, the two big issues aside, is there any interest in incorporating
these optimizations in Python 3?


The question really is whether this is an all-or-nothing deal. If you
could identify smaller parts that can be applied independently, interest
would be higher.

Also, I'd be curious whether your techniques help or hinder a potential
integration of a JIT generator.


A JIT compiler is not a silver bullet; translation to machine code is
just one of many optimisations performed by PyPy.
A compiler merely removes interpretative overhead, at the cost of
significantly increased code size, whereas Stefan's work attacks both
interpreter overhead and some of the inefficiencies due to dynamic typing.

If Unladen Swallow achieved anything it was to demonstrate that a JIT
alone does not work well.

My (experimental) HotPy VM has similar base-line speed to CPython, yet
is able to outperform Unladen Swallow using interpreter-only optimisations.
(It goes even faster with the compiler turned on :) )

Cheers,
Mark.




Regards,
Martin

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Martin v. Löwis

 Although any such patch should discuss how it compares with Cesare's
 work on wpython.
 Personally, I *like* CPython fitting into the simple-and-portable
 niche in the Python interpreter space.
 
 Changing the bytecode width wouldn't make the interpreter more complex.

No, but I think Stefan is proposing to add a *second* byte code format,
in addition to the one that remains there. That would certainly be an
increase in complexity.

 Some years ago we were waiting for Unladen Swallow to improve itself
 and be ported to Python 3. Now it seems we are waiting for PyPy to be
 ported to Python 3. I'm not sure how "let's just wait" is a good
 trade-off if someone proposes interesting patches (which, of course,
 remains to be seen).

I completely agree. Let's not put unmet preconditions to such projects.

For example, I still plan to write a JIT for Python at some point. This
may happen in two months, or in two years. I wouldn't try to stop
anybody from contributing improvements that may become obsolete with the
JIT. The only recent case where I *did* try to stop people is with
PEP-393, where I do believe that some of the changes that had been
made over the last year become redundant.

Regards,
Martin

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Martin v. Löwis

 You might be reading more into that statement than I meant.
 You have to supply Pyrex/Cython versions of the C declarations,
 either hand-written or generated by a tool. But you write them
 based on the advertised C API -- you don't have to manually
 expand macros, work out the low-level layout of structs, or
 anything like that (as you often have to do when using ctypes).

I can understand how that works when building a CPython extension.
But what about creating Jython/IronPython modules with Cython?
At what point do the header files get considered there?

Regards,
Martin

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Stefan Behnel


Martin v. Löwis, 30.08.2011 10:46:

You might be reading more into that statement than I meant.
You have to supply Pyrex/Cython versions of the C declarations,
either hand-written or generated by a tool. But you write them
based on the advertised C API -- you don't have to manually
expand macros, work out the low-level layout of structs, or
anything like that (as you often have to do when using ctypes).


I can understand how that works when building a CPython extension.
But what about creating Jython/IronPython modules with Cython?
At what point do the header files get considered there?


I had written a bit about this here:

http://thread.gmane.org/gmane.comp.python.devel/126340/focus=126419

Stefan


Re: [Python-Dev] Planned PEP status changes

2011-08-30 Thread Nick Coghlan

On Sat, Aug 27, 2011 at 2:35 AM, Brett Cannon br...@python.org wrote:
 On Tue, Aug 23, 2011 at 19:42, Nick Coghlan ncogh...@gmail.com wrote:
 Unless I hear any objections, I plan to adjust the current PEP
 statuses as follows some time this weekend:

 Move from Accepted to Finished:

    389  argparse - New Command Line Parsing Module              Bethard
    391  Dictionary-Based Configuration For Logging              Sajip
    3108  Standard Library Reorganization                         Cannon

 *sigh* I had always hoped to get profile/cProfile taken care of, but
 obviously that just didn't ever happen. So no objection, just a slight
 sting from the reminder of why the PEP was left open.

After starting to write a justification for marking the PEP as Final
despite the outstanding TODO items, I realised that didn't make a lot
of sense, so I left it at Accepted instead. So your call if you want
to say "not gonna happen" and close it out anyway.

I made the other 4 changes though (argparse, logging.dictConfig, new
super -> Final, Unladen Swallow -> Withdrawn).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Python-Dev] PEP 393 review

2011-08-30 Thread Antoine Pitrou


By the way, I don't know if you're working on it, but StringIO seems a
bit broken right now. test_memoryio crashes here:

test_newline_cr (test.test_memoryio.CStringIOTest) ... Fatal Python error: Segmentation fault

Current thread 0x7f3f6353b700:
  File "/home/antoine/cpython/pep-393/Lib/test/test_memoryio.py", line 583 in test_newline_cr
  File "/home/antoine/cpython/pep-393/Lib/unittest/case.py", line 386 in _executeTestPart
  File "/home/antoine/cpython/pep-393/Lib/unittest/case.py", line 441 in run
  File "/home/antoine/cpython/pep-393/Lib/unittest/case.py", line 493 in __call__
  File "/home/antoine/cpython/pep-393/Lib/unittest/suite.py", line 105 in run
  File "/home/antoine/cpython/pep-393/Lib/unittest/suite.py", line 67 in __call__
  File "/home/antoine/cpython/pep-393/Lib/unittest/suite.py", line 105 in run
  File "/home/antoine/cpython/pep-393/Lib/unittest/suite.py", line 67 in __call__
  File "/home/antoine/cpython/pep-393/Lib/unittest/runner.py", line 168 in run
  File "/home/antoine/cpython/pep-393/Lib/test/support.py", line 1293 in _run_suite
  File "/home/antoine/cpython/pep-393/Lib/test/support.py", line 1327 in run_unittest
  File "/home/antoine/cpython/pep-393/Lib/test/test_memoryio.py", line 718 in test_main
  File "/home/antoine/cpython/pep-393/Lib/test/regrtest.py", line 1139 in runtest_inner
  File "/home/antoine/cpython/pep-393/Lib/test/regrtest.py", line 915 in runtest
  File "/home/antoine/cpython/pep-393/Lib/test/regrtest.py", line 707 in main
  File "/home/antoine/cpython/pep-393/Lib/test/__main__.py", line 13 in <module>
  File "/home/antoine/cpython/pep-393/Lib/runpy.py", line 73 in _run_code
  File "/home/antoine/cpython/pep-393/Lib/runpy.py", line 160 in _run_module_as_main
Segmentation fault (core dumped)


And here's an excerpt of the C stack:

#0  find_control_char (translated=0, universal=0, readnl=<value optimized out>, kind=4, start=0xa75cf4 c, end=0xa75d00 , consumed=0x7fffab38) at ./Modules/_io/textio.c:1617
#1  _PyIO_find_line_ending (translated=0, universal=0, readnl=<value optimized out>, kind=4, start=0xa75cf4 c, end=0xa75d00 , consumed=0x7fffab38) at ./Modules/_io/textio.c:1678
#2  0x004ed3be in _stringio_readline (self=0x7291a250) at ./Modules/_io/stringio.c:271
#3  stringio_iternext (self=0x7291a250) at ./Modules/_io/stringio.c:322
#4  0x0052aa19 in listextend (self=0x72900ab8, b=<value optimized out>) at Objects/listobject.c:844
#5  0x0052afe8 in list_init (self=0x72900ab8, args=<value optimized out>, kw=<value optimized out>) at Objects/listobject.c:2312
#6  0x004283c7 in type_call (type=<value optimized out>, args=(<_io.StringIO at remote 0x7291a250>,), kwds=0x0) at Objects/typeobject.c:692
#7  0x004fdf17 in PyObject_Call (func=<type at remote 0x7f95c0>, arg=<value optimized out>, kw=<value optimized out>) at Objects/abstract.c:2147


Regards

Antoine.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Antoine Pitrou

On Tue, 30 Aug 2011 13:29:59 +1000
Nick Coghlan ncogh...@gmail.com wrote:
 
 Anecdotal, non-reproducible performance figures are *not* the way to
 go about serious optimisation efforts.

What about anecdotal *and* reproducible performance figures? :)
I may be half-joking, but we already have a set of py3k-compatible
benchmarks and, besides, sometimes a timeit invocation gives a good
idea of whether an approach is fruitful or not.
While a permanent public reference with historical tracking of
performance figures is even better, let's not freeze everything until
it's ready.
(for example, do we need to wait for speed.python.org before PEP 393 is
accepted?)
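
As an example of the kind of quick timeit check meant here, the sketch below times the two Latin-1 decodes discussed in the PEP 393 review thread. The helper name time_decode and the iteration counts are assumptions chosen for illustration; results depend on the build and machine, so none are quoted.

```python
import timeit


def time_decode(data, encoding="latin-1", number=100000, repeat=5):
    """Best-of-`repeat` time in seconds for `number` decode calls."""
    timer = timeit.Timer(lambda: data.decode(encoding))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    for payload in (b"abc", b"ab\xff"):
        print(payload, "%.4f s" % time_decode(payload))
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; running the same script before and after a patch gives a first indication of whether the change helps.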

Regards

Antoine.



Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Nick Coghlan

On Tue, Aug 30, 2011 at 9:38 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 30 Aug 2011 13:29:59 +1000
 Nick Coghlan ncogh...@gmail.com wrote:

 Anecdotal, non-reproducible performance figures are *not* the way to
 go about serious optimisation efforts.

 What about anecdotal *and* reproducible performance figures? :)
 I may be half-joking, but we already have a set of py3k-compatible
 benchmarks and, besides, sometimes a timeit invocation gives a good
 idea of whether an approach is fruitful or not.
 While a permanent public reference with historical tracking of
 performance figures is even better, let's not freeze everything until
 it's ready.
 (for example, do we need to wait for speed.python.org before PEP 393 is
 accepted?)

Yeah, I'd neglected the idea of just running perf.py for pre- and
post-patch performance comparisons. You're right that that can
generate sufficient info to make a well-informed decision.

I'd still really like it if some of the people advocating that we care
about CPython performance actually volunteered to spearhead the effort
to get speed.python.org up and running, though. As far as I know, the
hardware's spinning idly waiting to be given work to do :P

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Vinay Sajip

Meador Inge meadori at gmail.com writes:

 
 1. http://bugs.python.org/issue9041

I raised a question about this patch (in the issue tracker).

 2. http://bugs.python.org/issue9651
 3. http://bugs.python.org/issue11241

I presume, since Amaury has commit rights, that he could commit these.

Regards,

Vinay Sajip


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Vlad Riscutia

I also have some patches sitting on the tracker for some time:

http://bugs.python.org/issue12764
http://bugs.python.org/issue11835
http://bugs.python.org/issue12528 which also fixes
http://bugs.python.org/issue6069 and http://bugs.python.org/issue11920
http://bugs.python.org/issue6068 which also fixes
http://bugs.python.org/issue6493

Thank you,
Vlad

On Tue, Aug 30, 2011 at 6:09 AM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:

 Meador Inge meadori at gmail.com writes:


  1. http://bugs.python.org/issue9041

 I raised a question about this patch (in the issue tracker).

  2. http://bugs.python.org/issue9651
  3. http://bugs.python.org/issue11241

 I presume, since Amaury has commit rights, that he could commit these.

 Regards,

 Vinay Sajip


Re: [Python-Dev] cpython: Remove display options (--name, etc.) from the Distribution class.

2011-08-30 Thread Antoine Pitrou

On Tue, 30 Aug 2011 16:22:14 +0200
eric.araujo python-check...@python.org wrote:

 http://hg.python.org/cpython/rev/af0bcccb935b
 changeset:   72127:af0bcccb935b
 user:        Éric Araujo mer...@netwok.org
 date:        Tue Aug 30 00:55:02 2011 +0200
 summary:
   Remove display options (--name, etc.) from the Distribution class.
 
 These options were used to implement “setup.py --name”,
 “setup.py --version”, etc. which are now handled by the pysetup metadata
 action or direct parsing of the setup.cfg file.
 
 As a side effect, the Distribution class no longer accepts a 'url' key
 in its *attrs* argument: it has to be 'home-page' to be recognized as a
 valid metadata field and passed down to the dist.metadata object.

I don't want to sound nitpicky, but it's the first time I see
"home-page" hyphenated. How about "homepage"?

Regards

Antoine.



Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

 Changing the bytecode width wouldn't make the interpreter more complex.

 No, but I think Stefan is proposing to add a *second* byte code format,
 in addition to the one that remains there. That would certainly be an
 increase in complexity.

Yes, indeed I have a more straightforward instruction format to allow
for more efficient decoding. Just going from bytecode size to
word-code size without changing the instruction format is going to
require 8 (or word-size) times more memory on a 64bit system. From an
optimization perspective, the irregular instruction format was the
biggest problem, because checking for HAS_ARG is always on the fast
path and mostly unpredictable. Hence, I chose to extend the
instruction format to have word-size and use the additional space to
have the upper half be used for the argument and the lower half for
the actual opcode. Encoding is more efficient, and *not* more complex.
Using profiling to indicate what code is hot, I don't waste too much
memory on encoding this regular instruction format.
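
The regular format described above can be sketched in a few lines of Python. This is only an illustration of the layout (argument in the upper half of a 64-bit word, opcode in the lower half); the names encode and decode and the opcode numbers used are hypothetical, and the actual implementation is C code that has not been posted in this thread.

```python
WORD_BITS = 64                      # assumed word size (64-bit system)
HALF_BITS = WORD_BITS // 2
HALF_MASK = (1 << HALF_BITS) - 1


def encode(opcode, arg=0):
    """Pack opcode (lower half) and argument (upper half) into one word."""
    if not 0 <= opcode <= HALF_MASK or not 0 <= arg <= HALF_MASK:
        raise ValueError("opcode and arg must each fit in half a word")
    return (arg << HALF_BITS) | opcode


def decode(word):
    """Unpack a word into (opcode, arg); no HAS_ARG-style branch needed."""
    return word & HALF_MASK, word >> HALF_BITS


if __name__ == "__main__":
    word = encode(100, 3)           # 100 is an arbitrary opcode number
    print(hex(word), decode(word))
```

Because every instruction carries an argument slot, decoding is the same two operations for all opcodes, which is the point made about removing the HAS_ARG check from the fast path.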


 For example, I still plan to write a JIT for Python at some point. This
 may happen in two months, or in two years. I wouldn't try to stop
 anybody from contributing improvements that may become obsolete with the
 JIT.

I would not necessarily argue that my optimizations would become
obsolete; if you still think about writing a JIT, it might make
sense to re-use what I've got and not start from scratch, e.g.,
building a simple JIT compiler that just inlines the operation
implementations as templates to eliminate the interpretative overhead
(in a similar vein to Piumarta and Riccardi's paper from 1998) might be
a good start. Though I don't want to pre-influence your JIT design; I'm
just thinking out loud...

Regards,
--stefan

Re: [Python-Dev] PyPI went down

2011-08-30 Thread Oleg Broytman

On Tue, Aug 30, 2011 at 07:30:01PM +0400, Oleg Broytman wrote:
PyPI went down

   More information: ports 80 and 443 are open, the server performs the SSL
handshake but times out on HTTP requests (with or without SSL).

Oleg.
-- 
 Oleg Broytman            http://phdru.name/            p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.

Re: [Python-Dev] PyPI went down

2011-08-30 Thread Oleg Broytman

It is back up. I am very sorry for the fuss.

Oleg.
-- 
 Oleg Broytman            http://phdru.name/            p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.

Re: [Python-Dev] cpython: Remove display options (--name, etc.) from the Distribution class.

2011-08-30 Thread Éric Araujo

Hi,

Le 30/08/2011 17:20, Antoine Pitrou a écrit :
 On Tue, 30 Aug 2011 16:22:14 +0200
 eric.araujo python-check...@python.org wrote:
 As a side effect, the Distribution class no longer accepts a 'url' key
 in its *attrs* argument: it has to be 'home-page' to be recognized as a
 valid metadata field and passed down to the dist.metadata object.
 
 I don't want to sound nitpicky, but it's the first time I see
 "home-page" hyphenated. How about "homepage"?

This value is defined in the accepted Metadata PEPs, which use home-page.

Regards

[Python-Dev] PyPI went down

2011-08-30 Thread Oleg Broytman

Hello!

   I released the first package of two and PyPI went down while I was
preparing to release the second. I hope it wasn't me?

Oleg.
-- 
 Oleg Broytman    http://phdru.name/    p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Guido van Rossum

Stefan, have you shared a pointer to your code yet? Is it open source?
It sounds like people are definitely interested and it would make
sense to let them experiment with your code and review it.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyPI went down

2011-08-30 Thread Martin v. Löwis

> I released the first package of two and PyPI went down while I was
> preparing to release the second. I hope it wasn't me?

A few minutes ago, it was responding very slowly, and I found out that
Postgres consumes all time. I haven't put energy into investigating what
was causing this - apparently, somebody was throwing odd queries at it.
Restarting Apache reduced the load. If they continue to do so, I
will investigate further.

Regards,
Martin

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Martin v. Löwis

>> I can understand how that works when building a CPython extension.
>> But what about creating Jython/IronPython modules with Cython?
>> At what point get the header files considered there?
>
> I had written a bit about this here:
>
> http://thread.gmane.org/gmane.comp.python.devel/126340/focus=126419

I see. So there is potential for error there.

Regards,
Martin

Re: [Python-Dev] PyPI went down

2011-08-30 Thread Thomas Wouters

On Tue, Aug 30, 2011 at 18:46, Martin v. Löwis mar...@v.loewis.de wrote:

 I released the first package of two and PyPI went down while I was
  preparing to release the second. I hope it wasn't me?

 A few minutes ago, it was responding very slowly, and I found out that
 Postgres consumes all time. I haven't put energy into investigating what
 was causing this - apparently, somebody was throwing odd queries at it.
 Restarting Apache reduced the load. If they continue to do so, I
 investigate further.


Looks like the issue keeps popping up. It was slow to respond earlier today,
and I keep getting complaints about it (including now.)

-- 
Thomas Wouters tho...@python.org

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Guido van Rossum

On Tue, Aug 30, 2011 at 9:49 AM, Martin v. Löwis mar...@v.loewis.de wrote:
 I can understand how that works when building a CPython extension.
 But what about creating Jython/IronPython modules with Cython?
 At what point get the header files considered there?

 I had written a bit about this here:

 http://thread.gmane.org/gmane.comp.python.devel/126340/focus=126419

 I see. So there is potential for error there.

To elaborate, with CPython it looks pretty solid, at least for
functions and constants (does it do structs?). You must manually
declare the name and signature of a function, and Pyrex/Cython emits C
code that includes the header and calls the function with the
appropriate types. If the signature you declare doesn't match what's
in the .h file you'll get a compiler error when the C code is
compiled. If (perhaps on some platforms) the function is really a
macro, the macro in the .h file will be invoked and the right thing
will happen. So far so good.

The problem lies with the PyPy backend -- there it generates ctypes
code, which means that the signature you declare to Cython/Pyrex must
match the *linker* level API, not the C compiler level API. Thus, if
in a system header a certain function is really a macro that invokes
another function with a permuted or augmented argument list, you'd
have to know what that macro does. I also don't see how this would
work for #defined constants: where does Cython/Pyrex get their value?
ctypes doesn't have their values.

So, for PyPy, a solution based on Cython/Pyrex has many of the same
downsides as one based on ctypes where it comes to complying with an
API defined by a .h file.
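
A minimal ctypes sketch of that limitation, assuming a Unix-like system where the C library can be loaded by name. The choice of strlen and BUFSIZ is only an illustration picked here, not something from the thread:

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like platform. If the C library cannot be located by
# name, fall back to the symbols already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# strlen is an exported symbol, so ctypes can look it up, but the signature
# is whatever is declared here by hand; no header file is consulted.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen(b"python-dev")

# BUFSIZ is a #define in <stdio.h>. It is not a symbol in the shared
# library, so it cannot be seen at the linker level.
has_bufsiz = hasattr(libc, "BUFSIZ")

print(length, has_bufsiz)
```

The value of a #defined constant would have to come from somewhere else, for example from a small piece of compiled glue code.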

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-30 Thread Stephen J. Turnbull

Antoine Pitrou writes:
  On Mon, 29 Aug 2011 12:43:24 +0900
  Stephen J. Turnbull step...@xemacs.org wrote:
   
   Since when can s[0] represent a code point outside the BMP, for s a
   Unicode string in a narrow build?
   
   Remember, the UCS-2/narrow vs. UCS-4/wide distinction is *not* about
   what Python supports vs. the outside world.  It's about what the str/
   unicode type is an array of.
  
  Why would that be?

Because what the outside world sees is produced by codecs, not by
str.  The outside world can't see whether you have narrow or wide
unless it uses indexing ... i.e., experiments to determine what the str
type is an array of.

The problem with a narrow build (whether for space efficiency in
CPython or for platform compatibility in Jython and IronPython) is not
that we have no UTF-16 codecs.  It's that array ops aren't UTF-16
conformant.
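
To make "what the str type is an array of" concrete, here is a small sketch that simulates the code-unit view of a narrow build by encoding to UTF-16. This is only a simulation written for this note; it assumes Python 3.3 or later, where len() counts code points, and it says nothing about how any particular build stores strings internally:

```python
import struct

s = "a\U00010000b"      # U+10000 lies outside the BMP

# The narrow-build view: a sequence of 16-bit UTF-16 code units.
raw = s.encode("utf-16-le")
units = struct.unpack("<%dH" % (len(raw) // 2), raw)

def is_surrogate(unit):
    """True if a 16-bit unit is one half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF

code_points = len(s)    # number of characters
code_units = len(units) # number of array elements in the UTF-16 view
second_unit = units[1]  # indexing lands on half of a character

print(code_points, code_units, hex(second_unit), is_surrogate(second_unit))
```

Indexing and len() on the code-unit array disagree with the code-point view as soon as a non-BMP character is present, even though the codec round-trips the text correctly.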

Re: [Python-Dev] PyPI went down

2011-08-30 Thread Martin v. Löwis

> Looks like the issue keeps popping up. It was slow to respond earlier
> today, and I keep getting complaints about it (including now.)

Somebody is mirroring the site with wget. I have null-routed them.

Regards,
Martin

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-30 Thread Antoine Pitrou


> The problem with a narrow build (whether for space efficiency in
> CPython or for platform compatibility in Jython and IronPython) is not
> that we have no UTF-16 codecs.  It's that array ops aren't UTF-16
> conformant.

Sorry, what is a conformant UTF-16 array op?

Thanks

Antoine.



Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

On Tue, Aug 30, 2011 at 09:42, Guido van Rossum gu...@python.org wrote:
 Stefan, have you shared a pointer to your code yet? Is it open source?

I have no shared code repository, but could create one (is there any
pydev preferred provider?). I have all the copyrights on the code, and
I would like to open-source it.

 It sounds like people are definitely interested and it would make
 sense to let them experiment with your code and review it.

That sounds fine. I need to do some cleanup work (the code contains most of my
comments to remind me of issues), and it currently does not pass all
regression tests. But if people want to take a look first to decide if
they want it, then that's good enough for me. (I just wanted to know if
there is substantial interest so that it eventually pays off to find
and fix the remaining bugs.)

--stefan

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Antoine Pitrou

On Tue, 30 Aug 2011 08:27:13 -0700
stefan brunthaler ste...@brunthaler.net wrote:
  Changing the bytecode width wouldn't make the interpreter more complex.
 
  No, but I think Stefan is proposing to add a *second* byte code format,
  in addition to the one that remains there. That would certainly be an
  increase in complexity.
 
 Yes, indeed I have a more straightforward instruction format to allow
 for more efficient decoding. Just going from bytecode size to
 word-code size without changing the instruction format is going to
 require 8 (or word-size) times more memory on a 64bit system.

Do you really need it to match a machine word? Or is, say, a 16-bit
format sufficient?

Regards

Antoine.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

> Do you really need it to match a machine word? Or is, say, a 16-bit
> format sufficient?

Hm, technically no, but practically it makes more sense, as (at least
for x86 architectures) having opargs and opcodes in half-words can be
efficiently expressed in assembly. On 64bit architectures, I could
also inline data object references that fit into the 32bit upper half.
It turns out that most constant objects fit nicely into this, and I
have used this for a special cache region (again below 2^32) for
global objects, too. So, technically it's not necessary, but
practically it makes a lot of sense. (Most of these things work on
32bit systems, too. For architectures with a smaller size, we can
adapt or disable the optimizations.)
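
A rough sketch of what "opargs and opcodes in half-words" could look like, written in Python for readability. The layout (opcode in the low 16 bits, oparg in the high 16 bits) and the names encode/decode are assumptions made for this illustration; the message above does not specify the actual format:

```python
OPCODE_BITS = 16
OPCODE_MASK = (1 << OPCODE_BITS) - 1

def encode(opcode, oparg):
    """Pack a 16-bit opcode and a 16-bit oparg into one 32-bit word."""
    if not (0 <= opcode <= OPCODE_MASK and 0 <= oparg <= OPCODE_MASK):
        raise ValueError("opcode and oparg must each fit in 16 bits")
    return (oparg << OPCODE_BITS) | opcode

def decode(word):
    """Recover (opcode, oparg) with one mask and one shift."""
    return word & OPCODE_MASK, word >> OPCODE_BITS

# Hypothetical opcode number 100 with argument 3.
word = encode(100, 3)
print(hex(word), decode(word))
```

Decoding then needs no variable-length handling: every instruction is one word, and both fields come out with a single mask or shift.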

Cheers,
--stefan

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Guido van Rossum

On Tue, Aug 30, 2011 at 10:50 AM, stefan brunthaler
ste...@brunthaler.net wrote:
 Do you really need it to match a machine word? Or is, say, a 16-bit
 format sufficient.

 Hm, technically no, but practically it makes more sense, as (at least
 for x86 architectures) having opargs and opcodes in half-words can be
 efficiently expressed in assembly. On 64bit architectures, I could
 also inline data object references that fit into the 32bit upper half.
 It turns out that most constant objects fit nicely into this, and I
 have used this for a special cache region (again below 2^32) for
 global objects, too. So, technically it's not necessary, but
 practically it makes a lot of sense. (Most of these things work on
 32bit systems, too. For architectures with a smaller size, we can
 adapt or disable the optimizations.)

Do I sense that the bytecode format is no longer platform-independent?
That will need a bit of discussion. I bet there are some things around
that depend on that.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Terry Reedy


On 8/30/2011 1:05 PM, Guido van Rossum wrote:


>> I see. So there is potential for error there.
>
> To elaborate, with CPython it looks pretty solid, at least for
> functions and constants (does it do structs?). You must manually
> declare the name and signature of a function, and Pyrex/Cython emits C
> code that includes the header and calls the function with the
> appropriate types. If the signature you declare doesn't match what's
> in the .h file you'll get a compiler error when the C code is
> compiled. If (perhaps on some platforms) the function is really a
> macro, the macro in the .h file will be invoked and the right thing
> will happen. So far so good.
>
> The problem lies with the PyPy backend -- there it generates ctypes
> code, which means that the signature you declare to Cython/Pyrex must
> match the *linker* level API, not the C compiler level API. Thus, if
> in a system header a certain function is really a macro that invokes
> another function with a permuted or augmented argument list, you'd
> have to know what that macro does. I also don't see how this would
> work for #defined constants: where does Cython/Pyrex get their value?
> ctypes doesn't have their values.
>
> So, for PyPy, a solution based on Cython/Pyrex has many of the same
> downsides as one based on ctypes where it comes to complying with an
> API defined by a .h file.


Thank you for this elaboration. My earlier comment that ctypes seems to 
be hard to use was based on observation of posts to python-list 
presenting failed attempts (which have included somehow getting function 
signatures wrong) and a sense that ctypes was somehow bypassing the 
public compiler API to make a more direct access via some private api. 
You have explained and named that as the 'linker API', so I understand 
much better now.


Nothing like 'linker API' or 'signature' appears in the ctypes doc. All
I could find about discovering specific function calling conventions is
"To find out the correct calling convention you have to look into the C
header file or the documentation for the function you want to call."
Perhaps that should be elaborated to explain, as you did above, the need
to trace macro definitions to find the actual calling convention and the
need to be aware that macro definitions can change to accommodate
implementation detail changes even as the surface calling conventions
seem to remain the same.
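
As an illustration of declaring a signature by hand, here is a short sketch that assumes a Unix-like system; strchr is simply an example chosen here. The prototype in the comment is copied from the C standard library by the programmer, and ctypes has no way to check it against the header:

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like platform with a loadable C library.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Transcribed by hand from: char *strchr(const char *s, int c);
libc.strchr.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.strchr.restype = ctypes.c_char_p

tail = libc.strchr(b"home-page", ord("-"))

# Once argtypes is declared, ctypes can at least reject a call whose
# arguments cannot be converted to the declared C types.
try:
    libc.strchr(b"home-page", "not an int")
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(tail, rejected)
```

A wrong but convertible declaration (say, the arguments in the wrong order with compatible types) would not be caught this way, which is the failure mode discussed above.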


--
Terry Jan Reedy


Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

> Do I sense that the bytecode format is no longer platform-independent?
> That will need a bit of discussion. I bet there are some things around
> that depend on that.

Hm, I haven't really thought about that in detail or for long. I
ran it on PowerPC 970 and Intel Atom & i7 without problems (the latter
ones are a non-issue) and think that it can be portable. I just stuff
argument and opcode into one word for regular instruction decoding,
like a RISC CPU, and I realize there might be little/big endian
issues, but they surely can be conditionally compiled...
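
The little/big endian point can be shown with the struct module. The 32-bit word below uses an assumed layout (opcode 100 in the low half, argument 3 in the high half) chosen only for this sketch:

```python
import struct
import sys

# Assumed layout for illustration: oparg in the high 16 bits, opcode in the low.
word = (3 << 16) | 100

little = struct.pack("<I", word)   # byte order used by x86
big = struct.pack(">I", word)      # byte order used by big-endian PowerPC

# The host's own in-memory layout matches exactly one of the two.
native = struct.pack("=I", word)
matches_host = native == (little if sys.byteorder == "little" else big)

print(little.hex(), big.hex(), matches_host)
```

The same word has two different byte layouts, so a raw memory dump of such instructions would not be portable between the two kinds of machine unless a fixed byte order is chosen when writing it out.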

--stefan

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Guido van Rossum

On Tue, Aug 30, 2011 at 11:23 AM, stefan brunthaler
ste...@brunthaler.net wrote:
 Do I sense that the bytecode format is no longer platform-independent?
 That will need a bit of discussion. I bet there are some things around
 that depend on that.

 Hm, I haven't really thought about that in detail and for longer, I
 ran it on PowerPC 970 and Intel Atom & i7 without problems (the latter
 ones are a non-issue) and think that it can be portable. I just stuff
 argument and opcode into one word for regular instruction decoding
 like a RISC CPU, and I realize there might be little/big endian
 issues, but they surely can be conditionally compiled...

Um, I'm sorry, but that reply sounds incredibly naive, like you're not
really sure what the on-disk format for .pyc files is or why it would
matter. You're not even answering the question, except indirectly --
it seems that you've never even thought about the possibility of
generating a .pyc file on one platform and copying it to a computer
using a different one.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

> Um, I'm sorry, but that reply sounds incredibly naive, like you're not
> really sure what the on-disk format for .pyc files is or why it would
> matter. You're not even answering the question, except indirectly --
> it seems that you've never even thought about the possibility of
> generating a .pyc file on one platform and copying it to a computer
> using a different one.

Well, it may sound incredibly naive, but the truth is: I am never
storing the optimized representation to disk, it's done purely at
runtime when profiling tells me it makes sense to make the switch.
Thus I circumvent many of the problems outlined by you. So I am
positive that a full fledged change of the representation has many
more intricacies to it, but my approach is only tangentially
related...

--stefan

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Terry Reedy


On 8/30/2011 1:23 PM, stefan brunthaler wrote:

> (I just wanted to know if there is substantial interest so that
> it eventually pays off to find and fix the remaining bugs)

It is the nature of our development process that there usually can be no 
guarantee of acceptance of future code. The rather early acceptance of 
Unladen Swallow was to me something of an anomaly. I also think it was 
something of a mistake insofar as it discouraged other efforts, like yours.


I think the answer you have gotten is that there is a) substantial 
interest and b) a willingness to consider a major change such as 
switching from bytecode to something else. There also seem to be two main
concerns: 1) that the increase in complexity be 'less' than the increase 
in speed, and 2) that the changes be presented in small enough chunks 
that they can be reviewed.


Whether this is good enough for you to proceed is for you to decide.

--
Terry Jan Reedy


Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Guido van Rossum

On Tue, Aug 30, 2011 at 11:34 AM, stefan brunthaler
ste...@brunthaler.net wrote:
 Um, I'm sorry, but that reply sounds incredibly naive, like you're not
 really sure what the on-disk format for .pyc files is or why it would
 matter. You're not even answering the question, except indirectly --
 it seems that you've never even thought about the possibility of
 generating a .pyc file on one platform and copying it to a computer
 using a different one.

 Well, it may sound incredibly naive, but the truth is: I am never
 storing the optimized representation to disk, it's done purely at
 runtime when profiling tells me it makes sense to make the switch.
 Thus I circumvent many of the problems outlined by you. So I am
 positive that a full fledged change of the representation has many
 more intricacies to it, but my approach is only tangentially
 related...

Ok, then there's something else you haven't told us. Are you saying
that the original (old) bytecode is still used (and hence written to
and read from .pyc files)?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Georg Brandl

On 30.08.2011 20:34, stefan brunthaler wrote:
 Um, I'm sorry, but that reply sounds incredibly naive, like you're not
 really sure what the on-disk format for .pyc files is or why it would
 matter. You're not even answering the question, except indirectly --
 it seems that you've never even thought about the possibility of
 generating a .pyc file on one platform and copying it to a computer
 using a different one.

 Well, it may sound incredibly naive, but the truth is: I am never
 storing the optimized representation to disk, it's done purely at
 runtime when profiling tells me it makes sense to make the switch.
 Thus I circumvent many of the problems outlined by you. So I am
 positive that a full fledged change of the representation has many
 more intricacies to it, but my approach is only tangentially
 related...

You know, instead of all these half-explanations, giving us access to
the code would shut us up much more effectively.  Don't worry about not
passing tests, this is what the official trunk does half of the time ;)

Georg


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-30 Thread Stefan Behnel


Guido van Rossum, 30.08.2011 19:05:

> On Tue, Aug 30, 2011 at 9:49 AM, Martin v. Löwis wrote:
>>>> I can understand how that works when building a CPython extension.
>>>> But what about creating Jython/IronPython modules with Cython?
>>>> At what point get the header files considered there?
>>>
>>> I had written a bit about this here:
>>>
>>> http://thread.gmane.org/gmane.comp.python.devel/126340/focus=126419
>>
>> I see. So there is potential for error there.
>
> To elaborate, with CPython it looks pretty solid, at least for
> functions and constants (does it do structs?).


Sure. They even coerce from Python dicts and accept keyword arguments in 
Cython.




> You must manually
> declare the name and signature of a function, and Pyrex/Cython emits C
> code that includes the header and calls the function with the
> appropriate types. If the signature you declare doesn't match what's
> in the .h file you'll get a compiler error when the C code is
> compiled. If (perhaps on some platforms) the function is really a
> macro, the macro in the .h file will be invoked and the right thing
> will happen. So far so good.


Right.



> The problem lies with the PyPy backend -- there it generates ctypes
> code, which means that the signature you declare to Cython/Pyrex must
> match the *linker* level API, not the C compiler level API. Thus, if
> in a system header a certain function is really a macro that invokes
> another function with a permuted or augmented argument list, you'd
> have to know what that macro does. I also don't see how this would
> work for #defined constants: where does Cython/Pyrex get their value?
> ctypes doesn't have their values.
>
> So, for PyPy, a solution based on Cython/Pyrex has many of the same
> downsides as one based on ctypes where it comes to complying with an
> API defined by a .h file.


Right again. The declarations that Cython uses describe the API at the C or 
C++ level. They do not describe the ABI. So the situation is the same as 
with ctypes, and the same solutions (or work-arounds) apply, such as 
generating additional glue code that calls macros or reads compile time 
constants, for example. That's the approach that the IronPython backend has 
taken. It's a lot more complex, but also a lot more versatile in the long run.


Stefan


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-30 Thread Armin Rigo

Re-hi,

2011/8/29 Armin Rigo ar...@tunes.org:
 The problem is that many locks are actually acquired implicitly.
 For example, `print` to a buffered stream will acquire the fileobject's 
 mutex.

 Indeed.
 (...)
 I suspect that I need to do a more thorough review of the stdlib (...)

I found a solution not involving any change in CPython, and updated
the patch.  The solution is to say that a "with atomic" block doesn't
completely prevent other threads from re-acquiring the GIL, but only
prevents them from proceeding to the following bytecode.  So if
another thread is currently suspended in a place that releases the GIL
for other reasons, then this other thread can still be switched to as
normal, and continue running until the end of the current bytecode.  I
think it's sane enough for the original purpose, and avoids most
deadlock cases.


See you soon,

Armin.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

> Ok, then there's something else you haven't told us. Are you saying
> that the original (old) bytecode is still used (and hence written to
> and read from .pyc files)?

Short answer: yes.
Long answer: I added an invocation counter to the code object and keep
interpreting in the usual Python interpreter until this counter
reaches a configurable threshold. When it reaches this threshold, I
create the new instruction format and interpret with this optimized
representation. All the macros look exactly the same in the source
code, they are just redefined to use the different instruction format.
I am at no point serializing this representation or the runtime
information gathered by me, as any subsequent invocation might have
different characteristics.
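
The counter-and-threshold scheme described above can be sketched at the Python level. This is only an analogy: the class name HotSwitch, the make_optimized callback, and the threshold value are all invented for this sketch, and the real work happens inside the interpreter on code objects, not on Python callables:

```python
class HotSwitch:
    """Run `baseline` until it has been called `threshold` times, then
    build an optimized variant once and use it from then on."""

    def __init__(self, baseline, make_optimized, threshold=3):
        self.baseline = baseline
        self.make_optimized = make_optimized
        self.threshold = threshold
        self.calls = 0
        self.optimized = None   # built lazily in memory, never serialized

    def __call__(self, *args):
        if self.optimized is not None:
            return self.optimized(*args)
        self.calls += 1
        if self.calls >= self.threshold:
            # Threshold reached: derive the optimized form for later calls.
            self.optimized = self.make_optimized(self.baseline)
        return self.baseline(*args)


def square(x):
    return x * x

built = []

def make_optimized(fn):
    built.append(fn.__name__)
    return lambda x: x * x      # stand-in for the optimized representation

hot = HotSwitch(square, make_optimized, threshold=3)
results = [hot(n) for n in range(5)]
print(results, built, hot.calls)
```

Because the optimized form exists only in memory and is rebuilt per process, the on-disk .pyc format is untouched in this scheme.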

I will remove my development commentaries and create a private
repository at bitbucket for you* to take an early look like Georg (and
more or less Terry, too) suggested. Is that a good way for most of
you? (I would then give access to whomever wants to take a look.)

Best,
--stefan

*: not personally targeted at Guido (who is naturally very much
welcome to take a look, too) but addressed to python-dev in general.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Benjamin Peterson

2011/8/30 stefan brunthaler ste...@brunthaler.net:
 I will remove my development commentaries and create a private
 repository at bitbucket for you* to take an early look like Georg (and
 more or less Terry, too) suggested. Is that a good way for most of
 you? (I would then give access to whomever wants to take a look.)

And what is wrong with a public one?



-- 
Regards,
Benjamin

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread stefan brunthaler

On Tue, Aug 30, 2011 at 13:42, Benjamin Peterson benja...@python.org wrote:
 2011/8/30 stefan brunthaler ste...@brunthaler.net:
 I will remove my development commentaries and create a private
 repository at bitbucket for you* to take an early look like Georg (and
 more or less Terry, too) suggested. Is that a good way for most of
 you? (I would then give access to whomever wants to take a look.)

 And what is wrong with a public one?

Well, since it does not fully pass all regression tests and is just
meant for people to take a first look to find out if it's interesting,
I think I might take it offline after you had a look. It seems to me
that that is easier to be done with a private repository, but in
general, I don't have a problem with a public one...

Regards,
--stefan

PS: If you want to, I can also just put a tarball on my home page and
post a link here. It's not that I would like to have control/influence
about who is allowed to look and who doesn't.

Re: [Python-Dev] Software Transactional Memory for Python

2011-08-30 Thread Yury Selivanov

Maybe it'd be better to put 'atomic' in the threading module?

On 2011-08-30, at 4:02 PM, Armin Rigo wrote:

 Re-hi,
 
 2011/8/29 Armin Rigo ar...@tunes.org:
 The problem is that many locks are actually acquired implicitly.
 For example, `print` to a buffered stream will acquire the fileobject's 
 mutex.
 
 Indeed.
 (...)
 I suspect that I need to do a more thorough review of the stdlib (...)
 
 I found a solution not involving any change in CPython, and updated
 the patch.  The solution is to say that a "with atomic" block doesn't
 completely prevent other threads from re-acquiring the GIL, but only
 prevents them from proceeding to the following bytecode.  So if
 another thread is currently suspended in a place that releases the GIL
 for other reasons, then this other thread can still be switched to as
 normal, and continue running until the end of the current bytecode.  I
 think it's sane enough for the original purpose, and avoids most
 deadlock cases.
 
 
 A bientôt,
 
 Armin.


Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Gregory P. Smith

On Tue, Aug 30, 2011 at 1:54 PM, Benjamin Peterson benja...@python.org wrote:

 2011/8/30 stefan brunthaler ste...@brunthaler.net:
  On Tue, Aug 30, 2011 at 13:42, Benjamin Peterson benja...@python.org
 wrote:
  2011/8/30 stefan brunthaler ste...@brunthaler.net:
  I will remove my development commentaries and create a private
  repository at bitbucket for you* to take an early look like Georg (and
  more or less Terry, too) suggested. Is that a good way for most of
  you? (I would then give access to whomever wants to take a look.)
 
  And what is wrong with a public one?
 
  Well, since it does not fully pass all regression tests and is just
  meant for people to take a first look to find out if it's interesting,
  I think I might take it offline after you had a look. It seems to me
  that that is easier to be done with a private repository, but in
  general, I don't have a problem with a public one...

 Well, if your intention is for people to look at it, public seems to
 be the best solution.


+1

The point of open source is more eyeballs and the ability for anyone else to
pick up code and run in whatever direction they want (license permitting)
with it. :)

[Python-Dev] Coding guidelines for os.walk filter

2011-08-30 Thread Jacek Pliszka

Hi!

I would like to get some opinion on possible os.walk improvement.
For the sake of simplicity let's assume I would like to skip all .svn
and tmp directories.

Current solution looks like this:

for t in os.walk(somedir):
    t[1][:]=set(t[1])-{'.svn','tmp'}
    ... do something

This is a very clever hack but... it relies on internal implementation
of os.walk

Alternative is adding os.walk parameter e.g. like this:

def walk(top, topdown=True, onerror=None, followlinks=False, walkfilter=None)

    if walkfilter is not None:
        dirs,nondirs=walkfilter(top,dirs,nondirs)
.
and remove .svn and tmp in the walkfilter definition.

What I do not like here is that followlinks is redundant - easily
implementable through walkfilter


A simpler option, though one that breaks backward compatibility, would be:

def walk(top, topdown=True, onerror=None, skipdirs=islink)
...
-if followlinks or not islink(new_path):
-    for x in walk(new_path, topdown, onerror, followlinks):
+if not skipdirs(new_path):
+    for x in walk(new_path, topdown, onerror, skipdirs):

A user-supplied skipdirs function should then return true for any new_path
ending in .svn or tmp.

Nothing is redundant, and it works fine with topdown=False!

What do you think?  Shall we:
a) do nothing and use the implicit hack
b) make the option explicit with backward compatibility but with
redundancy and topdown=False incompatibility
c) make the option explicit, breaking backward compatibility but without redundancy

Best Regards,

Jacek Pliszka
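
For illustration, option (c) can be approximated without changing os.walk itself. The sketch below is a hypothetical helper, not part of the os module: the names walk_skipdirs and skip_vcs_and_tmp are assumptions, the onerror handling is omitted for brevity, and skipped directories still appear in the yielded dirs list, as in the diff above.

```python
import os


def walk_skipdirs(top, topdown=True, skipdirs=os.path.islink):
    """Hypothetical os.walk variant: never descend where skipdirs(path) is true."""
    dirs, nondirs = [], []
    for name in sorted(os.listdir(top)):
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs
    for name in dirs:
        new_path = os.path.join(top, name)
        # One predicate takes over the role of the followlinks flag.
        if not skipdirs(new_path):
            yield from walk_skipdirs(new_path, topdown, skipdirs)
    if not topdown:
        yield top, dirs, nondirs


def skip_vcs_and_tmp(path):
    """Example predicate: skip symlinks plus .svn and tmp directories."""
    return os.path.islink(path) or os.path.basename(path) in {".svn", "tmp"}
```

Because the predicate is consulted before recursing rather than after a tuple has been yielded, the same filter applies whether topdown is True or False.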

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Jesse Noller



On Aug 30, 2011, at 9:05 AM, Nick Coghlan ncogh...@gmail.com wrote:

 On Tue, Aug 30, 2011 at 9:38 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Tue, 30 Aug 2011 13:29:59 +1000
 Nick Coghlan ncogh...@gmail.com wrote:
 
 Anecdotal, non-reproducible performance figures are *not* the way to
 go about serious optimisation efforts.
 
 What about anecdotal *and* reproducible performance figures? :)
 I may be half-joking, but we already have a set of py3k-compatible
 benchmarks and, besides, sometimes a timeit invocation gives a good
 idea of whether an approach is fruitful or not.
 While a permanent public reference with historical tracking of
 performance figures is even better, let's not freeze everything until
 it's ready.
 (for example, do we need to wait for speed.python.org before PEP 393 is
 accepted?)
 
 Yeah, I'd neglected the idea of just running perf.py for pre- and
 post-patch performance comparisons. You're right that that can
 generate sufficient info to make a well-informed decision.
 
 I'd still really like it if some of the people advocating that we care
 about CPython performance actually volunteered to spearhead the effort
 to get speed.python.org up and running, though. As far as I know, the
 hardware's spinning idly waiting to be given work to do :P
 
 Cheers,
 Nick.
 

Discussion of speed.python.org should happen on the mailing list for that 
project if possible.

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Nick Coghlan

On Wed, Aug 31, 2011 at 3:23 AM, stefan brunthaler
ste...@brunthaler.net wrote:
 On Tue, Aug 30, 2011 at 09:42, Guido van Rossum gu...@python.org wrote:
 Stefan, have you shared a pointer to your code yet? Is it open source?

 I have no shared code repository, but could create one (is there any
 pydev preferred provider?). I have all the copyrights on the code, and
 I would like to open-source it.

Currently, the easiest way to create shared repositories for CPython
variants is to start with bitbucket's mirror of the main CPython repo:
https://bitbucket.org/mirror/cpython/overview

Use the website to create your own public CPython fork, then edit the
configuration of your local copy of the CPython repo to point to
your new bitbucket repo rather than the main one on hg.python.org. hg
push/pull can then be used as normal to publish in-development
material to the world. 'hg pull' from hg.python.org makes it fairly
easy to track the trunk.

One key thing is to avoid making any changes of your own on the
official CPython branches (i.e. default, 3.2, 2.7). Instead, use a
named branch for anything you're working on. This makes it much easier
to generate standalone patches later on.

My own public sandbox
(https://bitbucket.org/ncoghlan/cpython_sandbox/overview) is set up
that way, and you can see plenty of other examples on bitbucket.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Nick Coghlan

On Wed, Aug 31, 2011 at 9:21 AM, Jesse Noller jnol...@gmail.com wrote:
 Discussion of speed.python.org should happen on the mailing list for that 
 project if possible.

Hah, that's how out of the loop I am on that front - I didn't even
know there *was* a mailing list for it :)

Subscribed!

Cheers,
Nick.

P.S. For anyone else that is interested:
http://mail.python.org/mailman/listinfo/speed

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia

Re: [Python-Dev] Coding guidelines for os.walk filter

2011-08-30 Thread Michael Urman

 for t in os.walk(somedir):
    t[1][:]=set(t[1])-{'.svn','tmp'}
    ... do something

 This is a very clever hack but... it relies on internal implementation
 of os.walk

This doesn't appear to be an internal implementation detail; this is
documented behavior.
http://docs.python.org/dev/library/os.html#os.walk shows a similar example:

for root, dirs, files in os.walk('python/Lib/email'):
    # ...
    dirs.remove('CVS')  # don't visit CVS directories

-- 
Michael Urman
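
A minimal runnable sketch of the documented pruning idiom follows. The os.walk documentation states that with topdown=True the caller may modify the dirnames list in place to control which subdirectories are visited, and that modifying it with topdown=False has no effect on the traversal. The helper name walk_pruned and its default skip list are assumptions made for this example.

```python
import os


def walk_pruned(top, skip_names=(".svn", "tmp")):
    """Yield (root, dirs, files) from os.walk, never entering skip_names."""
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the very list os.walk recurses into;
        # rebinding the name (dirs = [...]) would not affect the traversal.
        dirs[:] = [d for d in dirs if d not in skip_names]
        yield root, dirs, files
```

Unlike dirs.remove('CVS'), the list comprehension does not raise ValueError when a directory contains none of the skipped names.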

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-30 Thread Stephen J. Turnbull

Antoine Pitrou writes:

  Sorry, what is a conformant UTF-16 array op?

For starters, one that doesn't ever return lone surrogates, but rather
interprets surrogate pairs as Unicode code points as in UTF-16.  (This
is not a Unicode standard definition, it's intended to be suggestive
of why many app writers will be distressed if they must use Python
unicode/str in a narrow build without a fairly comprehensive library
that wraps the arrays in operations that treat unicode/str as an array
of code points.)
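
To make the concern concrete, the sketch below emulates what a narrow build exposes: a sequence of 16-bit code units in which plain slicing can cut a surrogate pair in half. It works on an explicit list of integers rather than on str, and the helper names utf16_code_units and is_well_formed_utf16 are assumptions made for this example.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def is_well_formed_utf16(units):
    """True if every surrogate in units is part of a high-low pair."""
    expecting_low = False
    for unit in units:
        if 0xD800 <= unit <= 0xDBFF:        # high (lead) surrogate
            if expecting_low:
                return False
            expecting_low = True
        elif 0xDC00 <= unit <= 0xDFFF:      # low (trail) surrogate
            if not expecting_low:
                return False
            expecting_low = False
        elif expecting_low:                 # ordinary unit after a lone high
            return False
    return not expecting_low


units = utf16_code_units("a\U0001F600b")    # one non-BMP character
left, right = units[:2], units[2:]          # naive slice through the pair
```

Here units is well formed, but neither left nor right is: each half of the slice ends or begins with a lone surrogate.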

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Terry Reedy


On 8/30/2011 2:12 PM, Guido van Rossum wrote:

 On Tue, Aug 30, 2011 at 10:50 AM, stefan brunthaler
 ste...@brunthaler.net  wrote:

 Do you really need it to match a machine word? Or is, say, a 16-bit
 format sufficient.


 Hm, technically no, but practically it makes more sense, as (at least
 for x86 architectures) having opargs and opcodes in half-words can be
 efficiently expressed in assembly. On 64bit architectures, I could
 also inline data object references that fit into the 32bit upper half.
 It turns out that most constant objects fit nicely into this, and I
 have used this for a special cache region (again below 2^32) for
 global objects, too. So, technically it's not necessary, but
 practically it makes a lot of sense. (Most of these things work on
 32bit systems, too. For architectures with a smaller size, we can
 adapt or disable the optimizations.)


 Do I sense that the bytecode format is no longer platform-independent?
 That will need a bit of discussion. I bet there are some things around
 that depend on that.


I find myself more comfortable with Cesare Di Mauro's idea of 
expanding to 16 bits as the code unit. His basic idea was using 2, 4, or 
6 bytes instead of 1, 3, or 6. It actually tended to save space because 
many ops with small ints (which are very common) contract from 3 bytes 
to 2 bytes or from 9(?) (two instructions) to 6. I am sorry he was not 
able to follow up on the initial promising results. The dis output was 
probably easier to read than the current output.


Perhaps he made a mistake in combining the above idea with a shift from 
stack to hybrid stack+register design.


--
Terry Jan Reedy


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-30 Thread Guido van Rossum

On Tue, Aug 30, 2011 at 7:55 PM, Stephen J. Turnbull step...@xemacs.org wrote:
 Antoine Pitrou writes:

   Sorry, what is a conformant UTF-16 array op?

 For starters, one that doesn't ever return lone surrogates, but rather
 interprets surrogate pairs as Unicode code points as in UTF-16.  (This
 is not a Unicode standard definition, it's intended to be suggestive
 of why many app writers will be distressed if they must use Python
 unicode/str in a narrow build without a fairly comprehensive library
 that wraps the arrays in operations that treat unicode/str as an array
 of code points.)

That sounds like a contradiction -- it wouldn't be a UTF-16 array if
you couldn't tell that it was using UTF-16.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Cesare Di Mauro

2011/8/30 Antoine Pitrou solip...@pitrou.net


 Changing the bytecode width wouldn't make the interpreter more complex.


It depends on the kind of changes. :)

WPython introduced a very different intermediate code representation, which
required a big change to the peephole optimizer in the 1.0 alpha version.
In 1.1 final I decided to move that code entirely into ast.c (mostly for
constant folding) and compiler.c (for the usual peephole work: looking for
patterns to replace with better ones), because I found that simpler
and more convenient.

In the end, leaving aside some new optimizations that I implemented along the
way, the interpreter code is a bit more complex.


 Some years ago we were waiting for Unladen Swallow to improve itself
 and be ported to Python 3. Now it seems we are waiting for PyPy to be
 ported to Python 3. I'm not sure how "let's just wait" is a good
 trade-off if someone proposes interesting patches (which, of course,
 remains to be seen).

 Regards

 Antoine.

It isn't, because the motivation to do something new with CPython vanishes, at
least in some areas (virtual machine / ceval.c), even when you have ideas to
experiment with. That's why in my last talk at EuroPython I decided to move
on to other areas (Python objects).

Regards

Cesare

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Cesare Di Mauro

2011/8/30 Nick Coghlan ncogh...@gmail.com


 Yeah, it's definitely a trade-off - the point I was trying to make is
 that there *is* a trade-off being made between complexity and speed.

 I think the computed-gotos stuff struck a nice balance - the macro-fu
 involved means that you can still understand what the main eval loop
 is *doing*, even if you don't know exactly what's hidden behind the
 target macros. Ditto for the older opcode prediction feature and the
 peephole optimiser - separation of concerns means that you can
 understand the overall flow of events without needing to understand
 every little detail.

 This is where the request to extract individual orthogonal changes and
 submit separate patches comes from - it makes it clear that the
 independent changes *can* be separated cleanly, and aren't a giant
 ball of incomprehensible mud. It's the difference between complex
 (lots of moving parts, that can each be understood on their own and
 are then composed into a meaningful whole) and complicated (massive
 patches that don't work at all if any one component is delayed)

 Eugene Toder's AST optimiser work that I still hope to get into 3.3
 will have to undergo a similar process - the current patch covers a
 bit too much ground and needs to be broken up into smaller steps
 before we can seriously consider pushing it into the core.

 Regards,
 Nick.

Sometimes it cannot be done, because big changes produce big patches as
well.

I don't see a problem here if the code is well written (as required by
the Python community :) and the developer is available to talk about his
work to clear up any doubts.

Regards

Cesare

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Cesare Di Mauro

2011/8/30 stefan brunthaler ste...@brunthaler.net

 Yes, indeed I have a more straightforward instruction format to allow
 for more efficient decoding. Just going from bytecode size to
 word-code size without changing the instruction format is going to
 require 8 (or word-size) times more memory on a 64bit system. From an
 optimization perspective, the irregular instruction format was the
 biggest problem, because checking for HAS_ARG is always on the fast
 path and mostly unpredictable. Hence, I chose to extend the
 instruction format to have word-size and use the additional space to
 have the upper half be used for the argument and the lower half for
 the actual opcode. Encoding is more efficient, and *not* more complex.
 Using profiling to indicate what code is hot, I don't waste too much
 memory on encoding this regular instruction format.

 Regards,
 --stefan

That seems to be exactly the WPython approach, although I used the new wordcode in
place of the old bytecode. Take a look at it. ;)

Regards

Cesare
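
As a rough illustration of the regular format described above (argument in the upper half of a word, opcode in the lower half), here is a small sketch. It assumes a 64-bit word split into two 32-bit halves; the helper names and the opcode number 100 are illustrative only and are not taken from either implementation.

```python
HALF_BITS = 32                       # assumption: 64-bit word, 32-bit halves
HALF_MASK = (1 << HALF_BITS) - 1


def encode(opcode, arg=0):
    """Pack opcode (lower half) and arg (upper half) into one word."""
    if not (0 <= opcode <= HALF_MASK and 0 <= arg <= HALF_MASK):
        raise ValueError("opcode and arg must each fit in one half-word")
    return (arg << HALF_BITS) | opcode


def decode(word):
    """Return (opcode, arg); every instruction decodes the same way."""
    return word & HALF_MASK, word >> HALF_BITS


word = encode(100, 7)                # 100 is an illustrative opcode number
opcode, arg = decode(word)
```

Since every instruction carries an argument slot, the decoder needs no HAS_ARG-style branch: a mask and a shift are enough for all opcodes, at the cost of a larger encoding per instruction.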

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Cesare Di Mauro

2011/8/30 stefan brunthaler ste...@brunthaler.net

  Do I sense that the bytecode format is no longer platform-independent?
  That will need a bit of discussion. I bet there are some things around
  that depend on that.
 
 Hm, I haven't really thought about that in detail and for longer, I
 ran it on PowerPC 970 and Intel Atom & i7 without problems (the latter
 ones are a non-issue) and think that it can be portable. I just stuff
 argument and opcode into one word for regular instruction decoding
 like a RISC CPU, and I realize there might be little/big endian
 issues, but they surely can be conditionally compiled...

 --stefan

I think that you must deal with big-endianness, because some RISC CPUs cannot
handle little-endian data at all.

In WPython I wrote some macros that handle both byte orders, but lacking
big-endian machines I never had the opportunity to verify whether something was
wrong.

Regards

Cesare
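
The byte-order issue can be seen without a big-endian machine by serializing a word explicitly in both orders. This is only a sketch: it assumes a 32-bit word with a 16-bit argument in the upper half and a 16-bit opcode in the lower half, and the helper names are made up for the example.

```python
def word_to_bytes(word, byteorder):
    """Serialize a 32-bit word with an explicit byte order."""
    return word.to_bytes(4, byteorder)


def word_from_bytes(data, byteorder):
    """Read a 32-bit word back using an explicit byte order."""
    return int.from_bytes(data, byteorder)


word = (7 << 16) | 100                    # arg 7, illustrative opcode 100
little = word_to_bytes(word, "little")    # opcode byte comes first
big = word_to_bytes(word, "big")          # opcode byte comes last

# Reading with the matching order round-trips; the wrong order does not.
misread = word_from_bytes(little, "big")
```

Code that fetches the opcode by reading the first byte or half-word from memory therefore works on only one kind of host, which is the sort of access conditional compilation or macros would have to cover.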

Re: [Python-Dev] Python 3 optimizations continued...

2011-08-30 Thread Cesare Di Mauro

2011/8/31 Terry Reedy tjre...@udel.edu

 I find myself more comfortable with Cesare Di Mauro's idea of expanding
 to 16 bits as the code unit. His basic idea was using 2, 4, or 6 bytes
 instead of 1, 3, or 6.


It can be expanded to opcodes longer than 6 bytes, if needed. The format is
designed to be flexible enough to accommodate such changes without pain.


 It actually tended to save space because many ops with small ints (which
 are very common) contract from 3 bytes to 2 bytes or from 9(?) (two
 instructions) to 6.


It can pack up to 4 (old) opcodes into one wordcode (superinstruction).
Wordcodes are designed to favor instruction grouping.


 I am sorry he was not able to follow up on the initial promising results.


In a few words: lack of interest. Why spend (so much) time on a project
when you see that the community is oriented in other directions
(Unladen Swallow at first, PyPy more recently, now that the former has
largely been dropped)?

Also, Guido seems to dislike what he sees as hacks, and never showed
interest.

In WPython 1.1 I rolled back the hack that I had introduced in PyObject
types (a couple of fields) in 1.0 alpha, to make the code more polished
(but with a noticeable drop in performance). But again, I saw no interest
in WPython, so I decided to put a stop to it and to shelve my initial plan
to move to Python 3.


 The dis output was probably easier to read than the current output.

 Perhaps he made a mistake in combining the above idea with a shift from
 stack to hybrid stack+register design.

 --
 Terry Jan Reedy

As I already said, wordcodes are designed to favor grouping, so it was
quite natural for it to become a hybrid VM. Anyway, both space and performance
gained from this property of wordcodes. ;)

Regards

Cesare

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-30 Thread Stephen J. Turnbull

Guido van Rossum writes:
  On Tue, Aug 30, 2011 at 7:55 PM, Stephen J. Turnbull step...@xemacs.org 
  wrote:

   For starters, one that doesn't ever return lone surrogates, but rather
   interprets surrogate pairs as Unicode code points as in UTF-16.  (This
   is not a Unicode standard definition, it's intended to be suggestive
   of why many app writers will be distressed if they must use Python
   unicode/str in a narrow build without a fairly comprehensive library
   that wraps the arrays in operations that treat unicode/str as an array
   of code points.)
  
  That sounds like a contradiction -- it wouldn't be a UTF-16 array if
  you couldn't tell that it was using UTF-16.

Well, that's why I wrote "intended to be suggestive".  The Unicode
Standard does not specify at all what the internal representation of
characters may be, it only specifies what their external behavior must
be when two processes communicate.  (For "process" as used in the
standard, think "Python modules" here, since we are concerned with the
problems of folks who develop in Python.)  When observing the behavior
of a Unicode process, there are no UTF-16 arrays or UTF-8 arrays or
even UTF-32 arrays; only arrays of characters.

Thus, according to the rules of handling a UTF-16 stream, it is an
error to observe a lone surrogate or a surrogate pair that isn't a
high-low pair (Unicode 6.0, Ch. 3 Conformance, requirements C1 and
C8-C10).  That's what I mean by "can't tell it's UTF-16".  And I
understand those requirements to mean that operations on UTF-16
streams should produce UTF-16 streams, or raise an error.  Without
that closure property for basic operations on str, I think it's a bad
idea to say that the representation of text in a str in a pre-PEP-393
narrow build is "UTF-16".  For many users and app developers, it
creates expectations that are not fulfilled.

It's true that common usage is that an array of code units that
usually conforms to UTF-16 may be called "UTF-16" without the closure
properties.  I just disagree with that usage, because there are two
camps that interpret "UTF-16" differently.  One side says, "we have an
array representation in UTF-16 that can handle all Unicode code points
efficiently, and if you think you need more, think again", while the
other says "it's too painful to have to check every result for valid
UTF-16, and we need a UTF-16 type that supports the usual array
operations on *characters* via the usual operators; if you think
otherwise, think again."

Note that despite the (presumed) resolution of the UTF-16 issue for
CPython by PEP 393, at some point a very similar discussion will take
place over "characters" anyway, because users and app developers are
going to want a type that handles composition sequences and/or
grapheme clusters for them, as well as comparison that respects
canonical equivalence, even if it is inefficient compared to str.
That's why I insisted on use of "array of code points" to describe the
PEP 393 str type, rather than "array of characters".
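
The gap between code points and characters can be shown with the standard unicodedata module. The sketch compares two spellings of the same character; the helper name canonically_equal is an assumption made for this example, and grapheme-cluster segmentation is not shown because the standard library does not provide it.

```python
import unicodedata

composed = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT


def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


same_as_str = composed == decomposed                      # code point comparison
same_canonically = canonically_equal(composed, decomposed)
```

As arrays of code points the two strings differ in both length and content, yet they are canonically equivalent, which is the kind of comparison a character-level type would be expected to provide.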
