Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Martin v. Löwis
 I just want to talk about it - for now.

python-ideas is a better place to just talk than python-dev.

Regards,
Martin


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Stefan Behnel

Dan Stromberg, 27.08.2011 21:58:

On Sat, Aug 27, 2011 at 9:04 AM, Nick Coghlan wrote:

On Sun, Aug 28, 2011 at 1:58 AM, Nadeem Vawda wrote:

On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan wrote:

It's acceptable for the Python version to use ctypes in the case of
wrapping an existing library, but the Python version should still
exist.


I'm not too sure about that - PEP 399 explicitly says that using ctypes is
frowned upon, and doesn't mention anywhere that it should be used in this
sort of situation.


Note to self: do not comment on python-dev at 2 am, as one's ability
to read PEPs correctly apparently suffers :)

Consider my comment withdrawn; you're quite right that PEP 399
actually says this is precisely the case where an exemption is a
reasonable idea. Although I believe it's likely that PyPy will wrap it
with ctypes anyway :)


I'd like to better understand why ctypes is (sometimes) frowned upon.

Is it the brittleness?  Tendency to segfault?


Maybe unwieldy code and slow execution on CPython?
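
(Illustrative aside, not part of the original message: a minimal ctypes
sketch of the brittleness being discussed. ctypes makes the caller restate
each C function's signature by hand; if the declarations below are omitted
or wrong, failures show up at runtime, on 64-bit platforms sometimes as a
segfault rather than an exception. Assumes a Unix-like libc.)

    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    # Without these declarations ctypes guesses: return values default to a
    # C int and arguments are converted ad hoc, which can truncate pointers
    # or sizes on 64-bit platforms and crash the process.
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t

    print(libc.strlen(b"hello"))   # 5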

Note that there's a ctypes backend for Cython being written as part of a 
GSoC, so it should eventually become possible to write C library wrappers 
in Cython and have it generate a ctypes version to run on PyPy. That, 
together with the IronPython backend that is on its way, would give you a 
way to write fast wrappers for at least three of the four major Python
implementations, without sacrificing readability or speed in any of them.


Stefan



Re: [Python-Dev] Should we move to replace re with regex?

2011-08-28 Thread Nick Coghlan
On Sun, Aug 28, 2011 at 2:28 PM, Guido van Rossum gu...@python.org wrote:
 On Sat, Aug 27, 2011 at 8:59 PM, Ezio Melotti ezio.melo...@gmail.com wrote:
 I think it would be good to:
   1) have some document that explains the general design and main (internal)
 functions of the module (e.g. a PEP);

 I don't think that such a document needs to be a PEP; PEPs are usually
 intended for cases where significant discussion is expected, not just to
 explain things. A README file or a Wiki page would be fine, as long as
 it's sufficiently comprehensive.

timsort.txt and dictnotes.txt may be useful precedents for the kind of
thing that is useful on that front. IIRC, the pymalloc stuff has a
massive embedded comment, which can also work.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Simon Cross
On Sun, Aug 28, 2011 at 6:58 AM, Terry Reedy tjre...@udel.edu wrote:
 2) It is not trivial to use it correctly. I think it needs a SWIG-like
 companion script that can write at least first-pass ctypes code from the .h
 header files. Or maybe it could/should use header info at runtime (with the
 .h bundled with a module).

This is sort of already available:

-- http://starship.python.net/crew/theller/ctypes/old/codegen.html
-- http://svn.python.org/projects/ctypes/trunk/ctypeslib/

It just appears to have never made it into CPython. I've used it
successfully on a small project.

Schiavo
Simon


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-28 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 6:08 AM, Armin Rigo ar...@tunes.org wrote:
 Hi Nick,

 On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan ncogh...@gmail.com wrote:
 1. How does the patch interact with C code that explicitly releases
 the GIL? (e.g. I/O commands inside a "with atomic:" block)

 As implemented, any code inside a "with atomic" block is prevented from
 explicitly releasing and reacquiring the GIL: the GIL remains acquired
 until the end of the "with" block.  In other words,
 Py_BEGIN_ALLOW_THREADS has no effect inside such a block.  This gives
 semantics that, in a full multi-core STM world, would be implementable
 by saying that if, in the middle of a transaction, you need to do I/O,
 then from that point onwards the transaction is not allowed to abort
 any more.  Such "inevitable" transactions are already supported, e.g.
 by RSTM, the C++ framework I used to prototype a C version
 (https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c ).

 2. Whether or not Jython and IronPython could implement something like
 that, since they're free threaded with fine-grained locks. If they
 can't then I don't see how we could justify making it part of the
 standard library.

 Yes, I can imagine some solutions.  I am no Jython or IronPython
 expert, but let us assume that they have a way to check synchronously
 for external events from time to time (i.e. if there is some
 equivalent to sys.setcheckinterval()).  If they do, then all you need
 is the right synchronization: the thread that wants to start a "with
 atomic" block has to wait until all other threads are paused in the
 external check code.  (Again, like CPython's, this is not a properly
 multi-core STM-ish solution, but it would give the right semantics.
 (And if it turns out that STM is successful in the future, Java will
 grow more direct support for it *wink*.))


 A bientôt,

 Armin.

This sounds like a very interesting idea to pursue, even if it's late,
and even if it's experimental, and even if it's possible to cause
deadlocks (no news there). I propose that we offer a C API in Python
3.3, as well as an extension module that offers the proposed decorator.
The C API could then be used to implement alternative APIs purely as
extension modules (e.g. would a deadlock-detecting API be possible?).

I don't think this needs a PEP; it's not a very pervasive change. We
can even document the API as experimental. But (if I may trust Armin's
reasoning) it's important to add support directly to CPython, as
currently it cannot be done as a pure extension module.
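
(Illustrative aside, not part of the original message: a heavily hedged,
pure-Python sketch of what user code for such an extension module might
look like. The name "atomic" is a placeholder; the real patch keeps the
GIL held for the whole block, whereas this fallback only serializes
threads that opt in.)

    import threading
    from contextlib import contextmanager

    _lock = threading.RLock()

    @contextmanager
    def atomic():
        # Approximation only: this excludes other threads that also use
        # atomic(); Armin's patch instead refuses to release the GIL
        # inside the block.
        with _lock:
            yield

    def transfer(accounts, src, dst, amount):
        with atomic():
            accounts[src] -= amount
            accounts[dst] += amount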

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Should we move to replace re with regex?

2011-08-28 Thread Guido van Rossum
Someone asked me off-line what I wanted besides talk. Here's the list
I came up with:

You could, for instance, volunteer to do a thorough code review of
the regex code, trying to think of ways to break it (e.g. bad syntax
or extreme use of nesting etc., or bad data). Or you could volunteer
to maintain it in the future. Or you could try to port it to PEP 393.
Or you could systematically go over the given list of differences
between re and regex and decide whether they are likely to be
backwards incompatibilities that will break existing code. Or you
could try to add some of the functionality requested by Tom C in one
of his several bugs.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 10:36 PM, Dan Stromberg drsali...@gmail.com wrote:

 On Sat, Aug 27, 2011 at 8:57 PM, Guido van Rossum gu...@python.org wrote:

 On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg drsali...@gmail.com
 wrote:
  IMO, we really, really need some common way of accessing C libraries
  that
  works for all major Python variants.

 We have one. It's called writing an extension module.

 And yet C extensions are full of CPython-isms.

I have to apologize: I somehow misread your "all Python variants" as
"a mixture of all CPython versions and all platforms where CPython
runs".

While I have no desire to continue this discussion, you are most
welcome to do so.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
Am 26.08.2011 16:56, schrieb Guido van Rossum:
 Also, please add the table (and the reasoning that led to it) to the PEP.

Done!

Martin


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Stefan Behnel

Hi,

sorry for chiming in here with my usual Cython bias and promotion. When the
question comes up of what a good FFI for Python should look like, it's an
obvious reaction on my part to throw Cython into the game.


Terry Reedy, 28.08.2011 06:58:

Dan, I once had more or less the same opinion/question as you with
regard to ctypes, but I now see at least 3 problems.

1) It seems hard to write it correctly. There are currently 47 open ctypes
issues, with 9 being feature requests, leaving 38 behavior-related issues.
Tom Heller has not been able to work on it since the beginning of 2010 and
has formally withdrawn as maintainer. No one else that I know of has taken
his place.


Cython has an active set of developers and a rather large and growing user 
base.


It certainly has lots of open issues in its bug tracker, but most of them 
are there because we *know* where the development needs to go, not so much 
because we don't know how to get there. After all, the semantics of Python 
and C/C++, between which Cython sits, are pretty much established.


Cython compiles to C code for CPython, (hopefully soon [1]) to
Python+ctypes for PyPy, and (mostly [2]) to C++/CLI code for IronPython,
which boils down to the same kind of build-time and runtime dependencies
that the supported Python runtimes have anyway. It does not add
dependencies on any external libraries by itself, such as the libffi in
CPython's ctypes implementation.


For the CPython backend, the generated code is very portable and is 
self-contained when compiled against the CPython runtime (plus, obviously, 
libraries that the user code explicitly uses). It generates efficient code 
for all existing CPython versions starting with Python 2.4, with several 
optimisations also for recent CPython versions (including the upcoming 3.3).




2) It is not trivial to use it correctly.


Cython is basically Python, so Python developers with some C or C++ 
knowledge tend to get along with it quickly.


I can't say yet how easy it is (or will be) to write code that is portable 
across independent Python implementations, but given that that field is 
still young, there's certainly a lot that can be done to aid this.




I think it needs a SWIG-like
companion script that can write at least first-pass ctypes code from the .h
header files. Or maybe it could/should use header info at runtime (with the
.h bundled with a module).


From my experience, this is a "nice to have" more than a requirement. It 
has been requested for Cython a couple of times, especially by new users, 
and there are a couple of scripts out there that do this to some extent. 
But the usual problem is that Cython users (and, similarly, ctypes users) 
do not want a 1:1 mapping of a library API to a Python API (there's SWIG 
for that), and you can't easily get more than a trivial mapping out of a 
script. But, yes, a one-shot generator for the necessary declarations would 
at least help in cases where the API to be wrapped is somewhat large.




3) It seems to be slower than compiled C extension wrappers. That, at
least, was the discovery of someone who re-wrote pygame using ctypes. (The
hope was that using ctypes would aid porting to 3.x, but the time penalty
was apparently too much for time-critical code.)


Cython code can be as fast as C code, and in some cases, especially when 
developer time is limited, even faster than hand written C extensions. It 
allows for a straightforward optimisation path from regular Python code 
down to the speed of C, and trivial interaction with C code itself, if the 
need arises.


Stefan


[1] The PyPy port of Cython is currently being written as a GSoC project.

[2] The IronPython port of Cython was written to facilitate a NumPy port to 
the .NET environment. It's currently not a complete port of all Cython 
features.





Re: [Python-Dev] peps: Add memory consumption table.

2011-08-28 Thread Antoine Pitrou
On Sun, 28 Aug 2011 20:13:11 +0200
martin.v.loewis python-check...@python.org wrote:
  
 +Performance
 +---
 +
 +Performance of this patch must be considered for both memory
 +consumption and runtime efficiency. For memory consumption, the
 +expectation is that applications that have many large strings will see
 +a reduction in memory usage. For small strings, the effects depend on
 +the pointer size of the system, and the size of the Py_UNICODE/wchar_t
 +type. The following table demonstrates this for various small string
 +sizes and platforms.

The table is for ASCII-only strings, right? Perhaps that should be
mentioned somewhere.

Regards

Antoine.
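
(Illustrative aside, not part of the original message: readers who want to
check the small-string numbers on their own interpreter can use
sys.getsizeof; the exact values depend on pointer size, narrow/wide build
and, with PEP 393, on the widest character in the string.)

    import sys

    # Per-object footprint of a few small strings on the current build.
    for s in ["", "a", "abc" * 10, "\u20ac"]:
        print(repr(s), sys.getsizeof(s), "bytes")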




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
 I would say no more than a 15% slowdown on each of the following
 benchmarks:
 
 - stringbench.py -u
   (http://svn.python.org/view/sandbox/trunk/stringbench/)
 - iobench.py -t
   (in Tools/iobench/)
 - the json_dump, json_load and regex_v8 tests from
   http://hg.python.org/benchmarks/

I now have benchmark results for these; numbers are for revision
c10bcab2aac7, compared to 1ea72da11724 (wide unicode), on 64-bit
Linux with gcc 4.6.1, running on a 2.8 GHz Core i7.

- stringbench gives a 10% slowdown on total time; individual tests
  range between 78% and 220% of the original run time. The cost is
  typically not in performing the string operations themselves, but
  in the creation of the result strings. In PEP 393, a buffer must be
  scanned for the highest code point, which means that each byte must
  be inspected twice (a second time when the copying occurs).
- the iobench results range from a 2% speedup (seek operations) to a
  16% slowdown for small-sized reads (4.31 MB/s vs. 5.22 MB/s) and a
  37% slowdown for large-sized reads (154 MB/s vs. 235 MB/s). The speed
  difference is probably in the UTF-8 decoder; I have already
  restored the "runs of ASCII" optimization and am out of ideas for
  further speedups. Again, having to scan the UTF-8 string twice
  is probably one cause of slowdown.
- the json and regex_v8 tests see a slowdown of below 1%.

The slowdown is larger when compared with a narrow Unicode build.

 Additionally, it would be nice if you could run at least some of the
 test_bigmem tests, according to your system's available RAM.

Running only StrTest with 4.5 GB allows me to run two tests
(test_encode_raw_unicode_escape and test_encode_utf7); this sees
a slowdown of 37% in Linux user time.
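
(Illustrative aside, not part of the original message: a rough way to
spot-check decoder throughput on a given build with timeit. This is not
the methodology behind the iobench numbers above, which exercise the full
io stack.)

    import timeit

    setup = "data = ('abcdefgh' * 512).encode('utf-8')"   # 4 KiB of ASCII
    number = 100000
    t = timeit.timeit("data.decode('utf-8')", setup=setup, number=number)
    print("UTF-8 decode: %.0f MB/s" % (4096 * number / t / 1e6))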

Regards,
Martin


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Antoine Pitrou

 - the iobench results are between 2% acceleration (seek operations),
   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
   difference is probably in the UTF-8 decoder; I have already
   restored the runs of ASCII optimization and am out of ideas for
   further speedups. Again, having to scan the UTF-8 string twice
   is probably one cause of slowdown.

I don't think it's the UTF-8 decoder because I see an even larger
slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le).

Thanks

Antoine.




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
Am 28.08.2011 22:01, schrieb Antoine Pitrou:
 
 - the iobench results are between 2% acceleration (seek operations),
   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
   difference is probably in the UTF-8 decoder; I have already
   restored the runs of ASCII optimization and am out of ideas for
   further speedups. Again, having to scan the UTF-8 string twice
   is probably one cause of slowdown.
 
 I don't think it's the UTF-8 decoder because I see an even larger
 slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le).

But those aren't used in iobench, are they?

Regards,
Martin


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Antoine Pitrou
On Sunday, 28 August 2011, at 22:23 +0200, Martin v. Löwis wrote:
 Am 28.08.2011 22:01, schrieb Antoine Pitrou:
  
  - the iobench results are between 2% acceleration (seek operations),
16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
difference is probably in the UTF-8 decoder; I have already
restored the runs of ASCII optimization and am out of ideas for
further speedups. Again, having to scan the UTF-8 string twice
is probably one cause of slowdown.
  
  I don't think it's the UTF-8 decoder because I see an even larger
  slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le).
 
 But those aren't used in iobench, are they?

I was not very clear, but you can change the encoding used in iobench by
using the -E command-line option (while UTF-8 is the default if you
don't specify anything).

For example:

$ ./python Tools/iobench/iobench.py -t -E latin1
Preparing files...
Text unit = one character (latin1-decoded)

** Text input **

[ 400KB ] read one unit at a time...   5.17 MB/s
[ 400KB ] read 20 units at a time...   77.6 MB/s
[ 400KB ] read one line at a time...209 MB/s
[ 400KB ] read 4096 units at a time...  509 MB/s

[  20KB ] read whole contents at once...885 MB/s
[ 400KB ] read whole contents at once...730 MB/s
[  10MB ] read whole contents at once...726 MB/s

(etc.)

Regards

Antoine.




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
Am 28.08.2011 22:01, schrieb Antoine Pitrou:
 
 - the iobench results are between 2% acceleration (seek operations),
   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
   difference is probably in the UTF-8 decoder; I have already
   restored the runs of ASCII optimization and am out of ideas for
   further speedups. Again, having to scan the UTF-8 string twice
   is probably one cause of slowdown.
 
 I don't think it's the UTF-8 decoder because I see an even larger
 slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le).

Those haven't been ported to the new API, yet. Consider, for example,
d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test;
with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this
is a 25% speedup for PEP 393.

Regards,
Martin




Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Greg Ewing

Guido van Rossum wrote:

On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg drsali...@gmail.com wrote:


IMO, we really, really need some common way of accessing C libraries that
works for all major Python variants.


We have one. It's called writing an extension module.


I think Dan means some way of doing this without having
to hand-craft a different one for each Python implementation.

If we're really serious about the idea that Python is not
CPython, this seems like a reasonable thing to want. Currently
the Python universe is very much centred around CPython, with
the other implementations perpetually in catch-up mode.

My suggestion on how to address this would be something akin
to Pyrex or Cython. I gather that there has been some work
recently on adding different back-ends to Cython to generate
code for different Python implementations.

--
Greg


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Paul Moore writes:

  IronPython and Jython can retain UTF-16 as their native form if that
  makes interop cleaner, but in doing so they need to ensure that basic
  operations like indexing and len work in terms of code points, not
  code units, if they are to conform.

[...]

  They lose the O(1) guarantee, but that's easily defensible as a
  tradeoff to conform to underlying runtime semantics.

Unfortunately, I don't think it's all that easy to defend.  Absent PEP
393 or a restriction to the characters in the BMP, this is a very
expensive change, easily visible to interactive users, let alone
performance-hungry applications.

I personally do advocate the "array of code points" definition, but I
don't use IronPython or Jython, so PEP 393 is as close to heaven as I
expect to get.  OTOH, I also use Emacsen with Mule, and I have to
admit that there is a perceptible performance hit in any large (1 MB)
buffer containing non-ASCII characters vs. pure ASCII (the code unit
in Mule is 1 byte).  I expect that if IronPython and Jython really
want to retain native, code-unit-based representations, it's going to
be painful to conform to an "array of code points" specification.

There may need to be a compromise of the form "Implementations SHOULD
provide an implementation of str that is both O(1) in indexing and an
array of code points."  Code that is Unicode-ly correct on a Python
implementing PEP 393 will need to be ported with some effort to
implementations that do not satisfy this requirement, perhaps using
different algorithms or extra libraries.
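
(Illustrative aside, not part of the original message: the code point vs.
code unit difference is easy to demonstrate with a single non-BMP
character; the narrow-build results below apply to pre-PEP-393 CPython and
to code-unit-based implementations.)

    s = '\U00010400'      # one character outside the BMP

    print(len(s))         # 1 on wide/PEP 393 builds, 2 on narrow (UTF-16) builds
    print(s[0] == s)      # True on wide builds; False on narrow builds,
                          # where s[0] is a lone high surrogate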


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Guido van Rossum
On Sun, Aug 28, 2011 at 11:23 AM, Stefan Behnel stefan...@behnel.de wrote:
 Hi,

 sorry for hooking in here with my usual Cython bias and promotion. When the
 question comes up what a good FFI for Python should look like, it's an
 obvious reaction from my part to throw Cython into the game.

 Terry Reedy, 28.08.2011 06:58:

 Dan, I once had more or less the same opinion/question as you with
 regard to ctypes, but I now see at least 3 problems.

 1) It seems hard to write it correctly. There are currently 47 open ctypes
 issues, with 9 being feature requests, leaving 38 behavior-related issues.
 Tom Heller has not been able to work on it since the beginning of 2010 and
 has formally withdrawn as maintainer. No one else that I know of has taken
 his place.

 Cython has an active set of developers and a rather large and growing user
 base.

 It certainly has lots of open issues in its bug tracker, but most of them
 are there because we *know* where the development needs to go, not so much
 because we don't know how to get there. After all, the semantics of Python
 and C/C++, between which Cython sits, are pretty much established.

 Cython compiles to C code for CPython, (hopefully soon [1]) to Python+ctypes
 for PyPy and (mostly [2]) C++/CLI code for IronPython, which boils down to
 the same build time and runtime kind of dependencies that the supported
 Python runtimes have anyway. It does not add dependencies on any external
 libraries by itself, such as the libffi in CPython's ctypes implementation.

 For the CPython backend, the generated code is very portable and is
 self-contained when compiled against the CPython runtime (plus, obviously,
 libraries that the user code explicitly uses). It generates efficient code
 for all existing CPython versions starting with Python 2.4, with several
 optimisations also for recent CPython versions (including the upcoming 3.3).


 2) It is not trivial to use it correctly.

 Cython is basically Python, so Python developers with some C or C++
 knowledge tend to get along with it quickly.

 I can't say yet how easy it is (or will be) to write code that is portable
 across independent Python implementations, but given that that field is
 still young, there's certainly a lot that can be done to aid this.

Cython does sound attractive for cross-Python-implementation use. This
is exciting.

 I think it needs a SWIG-like
 companion script that can write at least first-pass ctypes code from the .h
 header files. Or maybe it could/should use header info at runtime (with the
 .h bundled with a module).

 From my experience, this is a nice to have more than a requirement. It has
 been requested for Cython a couple of times, especially by new users, and
 there are a couple of scripts out there that do this to some extent. But the
 usual problem is that Cython users (and, similarly, ctypes users) do not
 want a 1:1 mapping of a library API to a Python API (there's SWIG for that),
 and you can't easily get more than a trivial mapping out of a script. But,
 yes, a one-shot generator for the necessary declarations would at least help
 in cases where the API to be wrapped is somewhat large.

Hm, the main use that was proposed here for ctypes is to wrap existing
libraries (not to create nicer APIs, that can be done in pure Python
on top of this). In general, an existing library cannot be called
without access to its .h files -- there are probably struct and
constant definitions, platform-specific #ifdefs and #defines, and
other things in there that affect the linker-level calling conventions
for the functions in the library. (Just like Python's own .h files --
e.g. the extensive renaming of the Unicode APIs depending on
narrow/wide build.) How does Cython deal with these? I wonder if for
this particular purpose SWIG isn't the better match. (If SWIG weren't
universally hated, even by its original author. :-)
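
(Illustrative aside, not part of the original message: a hedged ctypes
sketch of the .h problem described above. Every struct layout and constant
has to be restated in Python and must match the platform's headers
exactly; ctypes cannot verify this, and a mismatch silently corrupts data.
Unix-only illustration.)

    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    class timeval(ctypes.Structure):
        # Must mirror struct timeval from <sys/time.h>; on some platforms
        # tv_usec is not a plain long, which is exactly the portability trap.
        _fields_ = [("tv_sec", ctypes.c_long),
                    ("tv_usec", ctypes.c_long)]

    libc.gettimeofday.argtypes = [ctypes.POINTER(timeval), ctypes.c_void_p]
    libc.gettimeofday.restype = ctypes.c_int

    tv = timeval()
    libc.gettimeofday(ctypes.byref(tv), None)
    print(tv.tv_sec, tv.tv_usec)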

 3) It seems to be slower than compiled C extension wrappers. That, at
 least, was the discovery of someone who re-wrote pygame using ctypes. (The
 hope was that using ctypes would aid porting to 3.x, but the time penalty
 was apparently too much for time-critical code.)

 Cython code can be as fast as C code, and in some cases, especially when
 developer time is limited, even faster than hand written C extensions. It
 allows for a straight forward optimisation path from regular Python code
 down to the speed of C, and trivial interaction with C code itself, if the
 need arises.

 Stefan


 [1] The PyPy port of Cython is currently being written as a GSoC project.

 [2] The IronPython port of Cython was written to facilitate a NumPy port to
 the .NET environment. It's currently not a complete port of all Cython
 features.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Guido van Rossum writes:

  I don't think anyone else has that impression. Please cite chapter and
  verse if you really think this is important. IIUC, UCS-2 does not
  allow surrogate pairs,

In the original definition of UCS-2 in draft ISO 10646 (1990),
everything in the BMP except for 0xFFFF and 0xFFFE was a character,
and there was no concept of surrogate at all.  Later in ISO 10646
(1993)[1], the Surrogate Area was carved out of the Private Area, but
UCS-2 implementations simply treat them as (single) characters with
special properties.  This was more or less backward compatible, as all
corporate uses of the private area used the lower code points and
didn't conflict with the surrogates.  Finally (in 2000 or 2003) the
definition of UCS-2 in ISO 10646 was revised in a backward-
incompatible way to exclude surrogates entirely, i.e., nowadays it is a
range-restricted version of UTF-16.

Footnotes: 
[1]  IIRC, strictly speaking this was done slightly later (1993 or
1994) in an official Amendment to ISO 10646; the Amendment was
incorporated into the standard in 2000.



Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Nick Coghlan
On Mon, Aug 29, 2011 at 12:27 PM, Guido van Rossum gu...@python.org wrote:
 I wonder if for
 this particular purpose SWIG isn't the better match. (If SWIG weren't
 universally hated, even by its original author. :-)

SWIG is nice when you control the C/C++ side of the API as well and
can tweak it to be SWIG-friendly. I shudder at the idea of using it to
wrap arbitrary C++ code, though.

That said, the idea of using SWIG to emit Cython code rather than
C/API code may be one well worth exploring.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Raymond Hettinger writes:

  The naming convention for codecs is that the UTF prefix is used for
  lossless encodings that cover the entire range of Unicode.

Sure.  The operative word here is "codec", not "str", though.

  The first amendment to the original edition of the UCS defined
  UTF-16, an extension of UCS-2, to represent code points outside the
  BMP.

Since when can s[0] represent a code point outside the BMP, for s a
Unicode string in a narrow build?

Remember, the UCS-2/narrow vs. UCS-4/wide distinction is *not* about
what Python supports vs. the outside world.  It's about what the str/
unicode type is an array of.




Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Glyph Lefkowitz

On Aug 28, 2011, at 7:27 PM, Guido van Rossum wrote:

 In general, an existing library cannot be called
 without access to its .h files -- there are probably struct and
 constant definitions, platform-specific #ifdefs and #defines, and
 other things in there that affect the linker-level calling conventions
 for the functions in the library.

Unfortunately I don't know a lot about this, but I keep hearing about something 
called rffi that PyPy uses to call C from RPython: 
http://readthedocs.org/docs/pypy/en/latest/rffi.html.  This has some 
shortcomings currently, most notably the fact that it needs those .h files (and 
therefore a C compiler) at runtime, so it's currently a non-starter for code 
distributed to users.  Not to mention the fact that, as you can see, it's not 
terribly thoroughly documented.  But, that ExternalCompilationInfo object 
looks very promising, since it has fields like includes, libraries, etc.

Nevertheless, it seems like it's a bit more type-safe than ctypes or Cython, and
it seems to me that it could cache some of the information that it extracts
from header files and store it for later, when a compiler might not be around.

Perhaps someone with more PyPy knowledge than I could explain whether this is a 
realistic contender for other Python runtimes?
