Re: [Python-Dev] LZMA compression support in 3.3
I just want to talk about it - for now. python-ideas is a better place to just talk than python-dev. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] LZMA compression support in 3.3
Dan Stromberg, 27.08.2011 21:58: On Sat, Aug 27, 2011 at 9:04 AM, Nick Coghlan wrote: On Sun, Aug 28, 2011 at 1:58 AM, Nadeem Vawda wrote: On Sat, Aug 27, 2011 at 5:52 PM, Nick Coghlan wrote: It's acceptable for the Python version to use ctypes in the case of wrapping an existing library, but the Python version should still exist. I'm not too sure about that - PEP 399 explicitly says that using ctypes is frowned upon, and doesn't mention anywhere that it should be used in this sort of situation. Note to self: do not comment on python-dev at 2 am, as one's ability to read PEPs correctly apparently suffers :) Consider my comment withdrawn, you're quite right that PEP 399 actually says this is precisely the case where an exemption is a reasonable idea. Although I believe it's likely that PyPy will wrap it with ctypes anyway :) I'd like to better understand why ctypes is (sometimes) frowned upon. Is it the brittleness? Tendency to segfault? Maybe unwieldy code and slow execution on CPython? Note that there's a ctypes backend for Cython being written as part of a GSoC, so it should eventually become possible to write C library wrappers in Cython and have it generate a ctypes version to run on PyPy. That, together with the IronPython backend that is on its way, would give you a way to write fast wrappers for at least three of the major four Python implementations, without sacrificing readability or speed in one of them. Stefan
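For readers unfamiliar with the pattern being debated, here is a minimal sketch of the PEP 399 situation: a module prefers its compiled accelerator and falls back to a ctypes wrapper of the same underlying C library. The function name and fallback arrangement are illustrative assumptions, not code from any stdlib module.

```python
import ctypes
import ctypes.util

def crc32_ctypes(data, value=0):
    """CRC-32 via the system zlib, called through ctypes (fallback path)."""
    libname = ctypes.util.find_library("z") or "libz.so.1"
    libz = ctypes.CDLL(libname)
    # uLong crc32(uLong crc, const Bytef *buf, uInt len);
    libz.crc32.restype = ctypes.c_ulong
    libz.crc32.argtypes = [ctypes.c_ulong, ctypes.c_char_p, ctypes.c_uint]
    return libz.crc32(value, data, len(data)) & 0xFFFFFFFF

try:
    from zlib import crc32  # the compiled accelerator
except ImportError:          # fall back to the ctypes wrapper
    crc32 = crc32_ctypes
```

Both paths must agree on results, which is exactly the property PEP 399 asks the test suite to enforce.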
Re: [Python-Dev] Should we move to replace re with regex?
On Sun, Aug 28, 2011 at 2:28 PM, Guido van Rossum gu...@python.org wrote: On Sat, Aug 27, 2011 at 8:59 PM, Ezio Melotti ezio.melo...@gmail.com wrote: I think it would be good to: 1) have some document that explains the general design and main (internal) functions of the module (e.g. a PEP); I don't think that such a document needs to be a PEP; PEPs are usually intended where there is significant discussion expected, not just to explain things. A README file or a Wiki page would be fine, as long as it's sufficiently comprehensive. timsort.txt and dictnotes.txt may be useful precedents for the kind of thing that is useful on that front. IIRC, the pymalloc stuff has a massive embedded comment, which can also work. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)
On Sun, Aug 28, 2011 at 6:58 AM, Terry Reedy tjre...@udel.edu wrote: 2) It is not trivial to use it correctly. I think it needs a SWIG-like companion script that can write at least first-pass ctypes code from the .h header files. Or maybe it could/should use header info at runtime (with the .h bundled with a module). This is sort of already available: -- http://starship.python.net/crew/theller/ctypes/old/codegen.html -- http://svn.python.org/projects/ctypes/trunk/ctypeslib/ It just appears to have never made it into CPython. I've used it successfully on a small project. Schiavo Simon
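For context, the declaration blocks such a generator emits from a header look roughly like this hand-written sketch for two libm functions (the library name fallback is an assumption about a glibc host):

```python
import ctypes
import ctypes.util

# What a header-to-ctypes generator would produce for:
#   double cbrt(double x);
#   double hypot(double x, double y);
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

libm.cbrt.argtypes = [ctypes.c_double]
libm.cbrt.restype = ctypes.c_double

libm.hypot.argtypes = [ctypes.c_double, ctypes.c_double]
libm.hypot.restype = ctypes.c_double
```

Writing these declarations by hand is exactly the error-prone step the codegen tools above try to automate: a wrong argtypes entry compiles fine and fails at runtime.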
Re: [Python-Dev] Software Transactional Memory for Python
On Sat, Aug 27, 2011 at 6:08 AM, Armin Rigo ar...@tunes.org wrote: Hi Nick, On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan ncogh...@gmail.com wrote: 1. How does the patch interact with C code that explicitly releases the GIL? (e.g. IO commands inside a with atomic: block) As implemented, any code in a with atomic is prevented from explicitly releasing and reacquiring the GIL: the GIL remains acquired until the end of the with block. In other words, Py_BEGIN_ALLOW_THREADS has no effect in a with block. This gives semantics that, in a full multi-core STM world, would be implementable by saying that if, in the middle of a transaction, you need to do I/O, then from this point onwards the transaction is not allowed to abort any more. Such inevitable transactions are already supported e.g. by RSTM, the C++ framework I used to prototype a C version (https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c). 2. Whether or not Jython and IronPython could implement something like that, since they're free threaded with fine-grained locks. If they can't then I don't see how we could justify making it part of the standard library. Yes, I can imagine some solutions. I am no Jython or IronPython expert, but let us assume that they have a way to check synchronously for external events from time to time (i.e. if there is some equivalent to sys.setcheckinterval()). If they do, then all you need is the right synchronization: the thread that wants to start a with atomic has to wait until all other threads are paused in the external check code. (Again, like CPython's, this is not a properly multi-core STM-ish solution, but it would give the right semantics. (And if it turns out that STM is successful in the future, Java will grow more direct support for it wink)) A bientôt, Armin. This sounds like a very interesting idea to pursue, even if it's late, and even if it's experimental, and even if it's possible to cause deadlocks (no news there). 
I propose that we offer a C API in Python 3.3 as well as an extension module that offers the proposed decorator. The C API could then be used to implement alternative APIs purely as extension modules (e.g. would a deadlock-detecting API be possible?). I don't think this needs a PEP, it's not a very pervasive change. We can even document the API as experimental. But (if I may trust Armin's reasoning) it's important to add support directly to CPython, as currently it cannot be done as a pure extension module. -- --Guido van Rossum (python.org/~guido)
[Python-Dev] Should we move to replace re with regex?
Someone asked me off-line what I wanted besides talk. Here's the list I came up with: You could, for instance, volunteer to do a thorough code review of the regex code, trying to think of ways to break it (e.g. bad syntax or extreme use of nesting etc., or bad data). Or you could volunteer to maintain it in the future. Or you could try to port it to PEP 393. Or you could systematically go over the given list of differences between re and regex and decide whether they are likely to be backwards incompatibilities that will break existing code. Or you could try to add some of the functionality requested by Tom C in one of his several bugs. -- --Guido van Rossum (python.org/~guido)
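One concrete example of the re/regex differences to be catalogued: the third-party regex module accepts Unicode property escapes such as \p{Lu} (uppercase letter), which the stdlib re module rejects as a bad escape on modern CPython. A stdlib-only sketch of re's side of that gap:

```python
import re

# stdlib re has no Unicode property classes; "\p" is a bad escape.
try:
    re.compile(r"\p{Lu}")
    re_supports_properties = True
except re.error:
    re_supports_properties = False

# With the regex module installed, the equivalent would be (not run here):
#   regex.findall(r"\p{Lu}", "aBcD")   # matches the uppercase letters
```

Each such divergence needs exactly the classification Guido asks for: harmless extension, or a backwards incompatibility for code that relied on re's behavior.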
Re: [Python-Dev] LZMA compression support in 3.3
On Sat, Aug 27, 2011 at 10:36 PM, Dan Stromberg drsali...@gmail.com wrote: On Sat, Aug 27, 2011 at 8:57 PM, Guido van Rossum gu...@python.org wrote: On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg drsali...@gmail.com wrote: IMO, we really, really need some common way of accessing C libraries that works for all major Python variants. We have one. It's called writing an extension module. And yet C extensions are full of CPython-isms. I have to apologize, I somehow misread your all Python variants as a mixture of all CPython versions and all platforms where CPython runs. While I have no desire to continue this discussion, you are most welcome to do so. -- --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 393 review
On 26.08.2011 16:56, Guido van Rossum wrote: Also, please add the table (and the reasoning that led to it) to the PEP. Done! Martin
Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)
Hi, sorry for hooking in here with my usual Cython bias and promotion. When the question comes up what a good FFI for Python should look like, it's an obvious reaction on my part to throw Cython into the game. Terry Reedy, 28.08.2011 06:58: Dan, I once had more or less the same opinion/question as you with regard to ctypes, but I now see at least 3 problems. 1) It seems hard to write it correctly. There are currently 47 open ctypes issues, with 9 being feature requests, leaving 38 behavior-related issues. Tom Heller has not been able to work on it since the beginning of 2010 and has formally withdrawn as maintainer. No one else that I know of has taken his place. Cython has an active set of developers and a rather large and growing user base. It certainly has lots of open issues in its bug tracker, but most of them are there because we *know* where the development needs to go, not so much because we don't know how to get there. After all, the semantics of Python and C/C++, between which Cython sits, are pretty much established. Cython compiles to C code for CPython, (hopefully soon [1]) to Python+ctypes for PyPy and (mostly [2]) C++/CLI code for IronPython, which boils down to the same build time and runtime kind of dependencies that the supported Python runtimes have anyway. It does not add dependencies on any external libraries by itself, such as the libffi in CPython's ctypes implementation. For the CPython backend, the generated code is very portable and is self-contained when compiled against the CPython runtime (plus, obviously, libraries that the user code explicitly uses). It generates efficient code for all existing CPython versions starting with Python 2.4, with several optimisations also for recent CPython versions (including the upcoming 3.3). 2) It is not trivial to use it correctly. Cython is basically Python, so Python developers with some C or C++ knowledge tend to get along with it quickly. 
I can't say yet how easy it is (or will be) to write code that is portable across independent Python implementations, but given that that field is still young, there's certainly a lot that can be done to aid this. I think it needs a SWIG-like companion script that can write at least first-pass ctypes code from the .h header files. Or maybe it could/should use header info at runtime (with the .h bundled with a module). From my experience, this is a nice-to-have more than a requirement. It has been requested for Cython a couple of times, especially by new users, and there are a couple of scripts out there that do this to some extent. But the usual problem is that Cython users (and, similarly, ctypes users) do not want a 1:1 mapping of a library API to a Python API (there's SWIG for that), and you can't easily get more than a trivial mapping out of a script. But, yes, a one-shot generator for the necessary declarations would at least help in cases where the API to be wrapped is somewhat large. 3) It seems to be slower than compiled C extension wrappers. That, at least, was the discovery of someone who re-wrote pygame using ctypes. (The hope was that using ctypes would aid porting to 3.x, but the time penalty was apparently too much for time-critical code.) Cython code can be as fast as C code, and in some cases, especially when developer time is limited, even faster than hand-written C extensions. It allows for a straightforward optimisation path from regular Python code down to the speed of C, and trivial interaction with C code itself, if the need arises. Stefan [1] The PyPy port of Cython is currently being written as a GSoC project. [2] The IronPython port of Cython was written to facilitate a NumPy port to the .NET environment. It's currently not a complete port of all Cython features. 
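As a small illustration of the "Cython is basically Python" point above: the following is ordinary Python, runs unchanged under the interpreter, and is also valid Cython input; in a .pyx file one would add cdef type declarations to the same code to get C-speed loops. The function itself is just a toy rectangle-rule integrator, chosen because it is the shape of example the Cython tutorials commonly use.

```python
def integrate(f, a, b, n):
    """Left-endpoint rectangle-rule approximation of the integral of f
    over [a, b] using n rectangles. Valid Python and valid Cython."""
    dx = (b - a) / n
    total = 0.0
    x = a
    for _ in range(n):
        total += f(x) * dx
        x += dx
    return total
```

The optimisation path Stefan describes is then incremental: profile, add cdef types to the hot variables, and the same source compiles down to a tight C loop.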
Re: [Python-Dev] peps: Add memory consumption table.
On Sun, 28 Aug 2011 20:13:11 +0200 martin.v.loewis python-check...@python.org wrote:

+Performance
+-----------
+
+Performance of this patch must be considered for both memory
+consumption and runtime efficiency. For memory consumption, the
+expectation is that applications that have many large strings will see
+a reduction in memory usage. For small strings, the effects depend on
+the pointer size of the system, and the size of the Py_UNICODE/wchar_t
+type. The following table demonstrates this for various small string
+sizes and platforms.

The table is for ASCII-only strings, right? Perhaps that should be mentioned somewhere. Regards Antoine.
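The effect the quoted table quantifies can be observed directly with sys.getsizeof on a PEP 393 build (CPython 3.3+): ASCII-only strings store one byte per character, strings with code points at U+0100 or above need two, and astral strings need four. Exact byte counts vary by platform and CPython version, so this sketch only relies on the relative ordering:

```python
import sys

ascii_s = "a" * 1000              # 1 byte per character under PEP 393
bmp_s = "\u0100" * 1000           # 2 bytes per character
astral_s = "\U00010000" * 1000    # 4 bytes per character

for label, s in [("ascii", ascii_s), ("bmp", bmp_s), ("astral", astral_s)]:
    print(label, len(s), sys.getsizeof(s))
```

This also makes Antoine's caveat visible: a memory-consumption table is only meaningful once you state which of the three storage classes the measured strings fall into.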
Re: [Python-Dev] PEP 393 review
I would say no more than a 15% slowdown on each of the following benchmarks:

- stringbench.py -u (http://svn.python.org/view/sandbox/trunk/stringbench/)
- iobench.py -t (in Tools/iobench/)
- the json_dump, json_load and regex_v8 tests from http://hg.python.org/benchmarks/

I now have benchmark results for these; numbers are for revision c10bcab2aac7, comparing to 1ea72da11724 (wide unicode), on 64-bit Linux with gcc 4.6.1 running on Core i7 2.8GHz.

- stringbench gives a 10% slowdown on total time; individual tests take between 78% and 220% of the original time. The cost is typically not in performing the string operations themselves, but in the creation of the result strings. In PEP 393, a buffer must be scanned for the highest code point, which means that each byte must be inspected twice (a second time when the copying occurs).
- the iobench results are between 2% acceleration (seek operations), 16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and 37% for large sized reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the runs of ASCII optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of slowdown.
- the json and regex_v8 tests see a slowdown of below 1%.

The slowdown is larger when compared with a narrow Unicode build. Additionally, it would be nice if you could run at least some of the test_bigmem tests, according to your system's available RAM. Running only StrTest with 4.5G allows me to run 2 tests (test_encode_raw_unicode_escape and test_encode_utf7); this sees a slowdown of 37% in Linux user time. Regards, Martin
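The scan Martin describes, sketched in Python: before a PEP 393 string can be allocated, the builder must find the highest code point in the input to choose one of the three storage widths, which is why every byte gets inspected once before the copy even starts.

```python
def storage_width(code_points):
    """Bytes per character a PEP 393 string would use for this content."""
    max_cp = max(code_points, default=0)  # the extra full pass over the data
    if max_cp < 0x100:
        return 1      # Latin-1 storage
    elif max_cp < 0x10000:
        return 2      # UCS-2 storage
    else:
        return 4      # UCS-4 storage
```

The C implementation fuses this with decoding where it can, but the two-pass structure (scan, then copy) is inherent to picking the compact representation up front.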
Re: [Python-Dev] PEP 393 review
- the iobench results are between 2% acceleration (seek operations), 16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and 37% for large sized reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the runs of ASCII optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of slowdown. I don't think it's the UTF-8 decoder because I see an even larger slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le). Thanks Antoine.
Re: [Python-Dev] PEP 393 review
On 28.08.2011 22:01, Antoine Pitrou wrote: - the iobench results are between 2% acceleration (seek operations), 16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and 37% for large sized reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the runs of ASCII optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of slowdown. I don't think it's the UTF-8 decoder because I see an even larger slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le). But those aren't used in iobench, are they? Regards, Martin
Re: [Python-Dev] PEP 393 review
On Sunday 28 August 2011 at 22:23 +0200, Martin v. Löwis wrote: On 28.08.2011 22:01, Antoine Pitrou wrote: - the iobench results are between 2% acceleration (seek operations), 16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and 37% for large sized reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the runs of ASCII optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of slowdown. I don't think it's the UTF-8 decoder because I see an even larger slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le). But those aren't used in iobench, are they? I was not very clear, but you can change the encoding used in iobench by using the -E command-line option (while UTF-8 is the default if you don't specify anything). For example:

$ ./python Tools/iobench/iobench.py -t -E latin1
Preparing files...
Text unit = one character (latin1-decoded)
** Text input **
[ 400KB ] read one unit at a time...          5.17 MB/s
[ 400KB ] read 20 units at a time...          77.6 MB/s
[ 400KB ] read one line at a time...           209 MB/s
[ 400KB ] read 4096 units at a time...         509 MB/s
[  20KB ] read whole contents at once...       885 MB/s
[ 400KB ] read whole contents at once...       730 MB/s
[  10MB ] read whole contents at once...       726 MB/s
(etc.)

Regards Antoine.
Re: [Python-Dev] PEP 393 review
On 28.08.2011 22:01, Antoine Pitrou wrote: - the iobench results are between 2% acceleration (seek operations), 16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and 37% for large sized reads (154 MB/s vs. 235 MB/s). The speed difference is probably in the UTF-8 decoder; I have already restored the runs of ASCII optimization and am out of ideas for further speedups. Again, having to scan the UTF-8 string twice is probably one cause of slowdown. I don't think it's the UTF-8 decoder because I see an even larger slowdown with simpler encodings (e.g. -E latin1 or -E utf-16le). Those haven't been ported to the new API, yet. Consider, for example, d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test; with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this is a 25% speedup for PEP 393. Regards, Martin
Re: [Python-Dev] LZMA compression support in 3.3
Guido van Rossum wrote: On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg drsali...@gmail.com wrote: IMO, we really, really need some common way of accessing C libraries that works for all major Python variants. We have one. It's called writing an extension module. I think Dan means some way of doing this without having to hand-craft a different one for each Python implementation. If we're really serious about the idea that Python is not CPython, this seems like a reasonable thing to want. Currently the Python universe is very much centred around CPython, with the other implementations perpetually in catch-up mode. My suggestion on how to address this would be something akin to Pyrex or Cython. I gather that there has been some work recently on adding different back-ends to Cython to generate code for different Python implementations. -- Greg
Re: [Python-Dev] PEP 393 Summer of Code Project
Paul Moore writes: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of code points, not code units, if they are to conform. [...] They lose the O(1) guarantee, but that's easily defensible as a tradeoff to conform to underlying runtime semantics. Unfortunately, I don't think it's all that easy to defend. Absent PEP 393 or a restriction to the characters in the BMP, this is a very expensive change, easily visible to interactive users, let alone performance-hungry applications. I personally do advocate the array of code points definition, but I don't use IronPython or Jython so PEP 393 is as close to heaven as I expect to get. OTOH, I also use Emacsen with Mule, and I have to admit that there is a perceptible performance hit in any large (1 MB) buffer containing non-ASCII characters vs. pure ASCII (the code unit in Mule is 1 byte). I expect that if IronPython and Jython really want to retain native, code-unit-based representations, it's going to be painful to conform to an array of code points specification. There may need to be a compromise of the form Implementations SHOULD provide an implementation of str that is both O(1) in indexing and an array of code points. Code that is Unicode-ly correct in Python implementing PEP 393 will need to be ported with some effort to implementations that do not satisfy this requirement, perhaps using different algorithms or extra libraries. 
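The cost described above can be made concrete: on a UTF-16 (code unit) representation, counting code points means skipping one half of each surrogate pair, so len-by-code-points becomes O(n) instead of O(1). A sketch over raw UTF-16-LE bytes:

```python
def utf16_code_point_len(buf):
    """Count code points in a UTF-16-LE byte buffer.

    Each non-BMP character occupies two 16-bit code units (a surrogate
    pair); we count every unit except low surrogates, so the pair
    contributes exactly one code point. This is the linear scan that a
    code-unit-based str would need for a conforming len().
    """
    count = 0
    for i in range(0, len(buf), 2):
        unit = buf[i] | (buf[i + 1] << 8)
        if not 0xDC00 <= unit <= 0xDFFF:  # skip low surrogates
            count += 1
    return count
```

Indexing by code point has the same O(n) character unless the implementation maintains auxiliary offset tables, which is roughly the compromise the SHOULD wording above anticipates.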
Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)
On Sun, Aug 28, 2011 at 11:23 AM, Stefan Behnel stefan...@behnel.de wrote: Hi, sorry for hooking in here with my usual Cython bias and promotion. When the question comes up what a good FFI for Python should look like, it's an obvious reaction on my part to throw Cython into the game. Terry Reedy, 28.08.2011 06:58: Dan, I once had more or less the same opinion/question as you with regard to ctypes, but I now see at least 3 problems. 1) It seems hard to write it correctly. There are currently 47 open ctypes issues, with 9 being feature requests, leaving 38 behavior-related issues. Tom Heller has not been able to work on it since the beginning of 2010 and has formally withdrawn as maintainer. No one else that I know of has taken his place. Cython has an active set of developers and a rather large and growing user base. It certainly has lots of open issues in its bug tracker, but most of them are there because we *know* where the development needs to go, not so much because we don't know how to get there. After all, the semantics of Python and C/C++, between which Cython sits, are pretty much established. Cython compiles to C code for CPython, (hopefully soon [1]) to Python+ctypes for PyPy and (mostly [2]) C++/CLI code for IronPython, which boils down to the same build time and runtime kind of dependencies that the supported Python runtimes have anyway. It does not add dependencies on any external libraries by itself, such as the libffi in CPython's ctypes implementation. For the CPython backend, the generated code is very portable and is self-contained when compiled against the CPython runtime (plus, obviously, libraries that the user code explicitly uses). It generates efficient code for all existing CPython versions starting with Python 2.4, with several optimisations also for recent CPython versions (including the upcoming 3.3). 2) It is not trivial to use it correctly. 
Cython is basically Python, so Python developers with some C or C++ knowledge tend to get along with it quickly. I can't say yet how easy it is (or will be) to write code that is portable across independent Python implementations, but given that that field is still young, there's certainly a lot that can be done to aid this. Cython does sound attractive for cross-Python-implementation use. This is exciting. I think it needs a SWIG-like companion script that can write at least first-pass ctypes code from the .h header files. Or maybe it could/should use header info at runtime (with the .h bundled with a module). From my experience, this is a nice-to-have more than a requirement. It has been requested for Cython a couple of times, especially by new users, and there are a couple of scripts out there that do this to some extent. But the usual problem is that Cython users (and, similarly, ctypes users) do not want a 1:1 mapping of a library API to a Python API (there's SWIG for that), and you can't easily get more than a trivial mapping out of a script. But, yes, a one-shot generator for the necessary declarations would at least help in cases where the API to be wrapped is somewhat large. Hm, the main use that was proposed here for ctypes is to wrap existing libraries (not to create nicer APIs, that can be done in pure Python on top of this). In general, an existing library cannot be called without access to its .h files -- there are probably struct and constant definitions, platform-specific #ifdefs and #defines, and other things in there that affect the linker-level calling conventions for the functions in the library. (Just like Python's own .h files -- e.g. the extensive renaming of the Unicode APIs depending on narrow/wide build) How does Cython deal with these? I wonder if for this particular purpose SWIG isn't the better match. (If SWIG weren't universally hated, even by its original author. :-) 3) It seems to be slower than compiled C extension wrappers. 
That, at least, was the discovery of someone who re-wrote pygame using ctypes. (The hope was that using ctypes would aid porting to 3.x, but the time penalty was apparently too much for time-critical code.) Cython code can be as fast as C code, and in some cases, especially when developer time is limited, even faster than hand-written C extensions. It allows for a straightforward optimisation path from regular Python code down to the speed of C, and trivial interaction with C code itself, if the need arises. Stefan [1] The PyPy port of Cython is currently being written as a GSoC project. [2] The IronPython port of Cython was written to facilitate a NumPy port to the .NET environment. It's currently not a complete port of all Cython features. -- --Guido van Rossum (python.org/~guido)
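Guido's point about needing the .h files can be illustrated with ctypes directly: ctypes cannot parse headers, so struct layouts must be transcribed by hand. For a hypothetical header declaring "struct point { int x; int y; };", the transcription looks like this:

```python
import ctypes

# Hand-transcribed from a hypothetical header:
#   struct point { int x; int y; };
class Point(ctypes.Structure):
    _fields_ = [
        ("x", ctypes.c_int),
        ("y", ctypes.c_int),
    ]

# The transcription must match the C layout exactly; a missed #ifdef,
# a changed field order, or a wrong field type silently corrupts every
# call that passes the struct across the boundary.
```

This is the maintenance burden header-driven tools (SWIG, ctypeslib, Cython's cdef extern blocks) try to reduce: they read the header, so the layout cannot drift out of sync by hand-copying mistakes.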
Re: [Python-Dev] PEP 393 Summer of Code Project
Guido van Rossum writes: I don't think anyone else has that impression. Please cite chapter and verse if you really think this is important. IIUC, UCS-2 does not allow surrogate pairs, In the original definition of UCS-2 in draft ISO 10646 (1990), everything in the BMP except for 0xFFFF and 0xFFFE was a character, and there was no concept of surrogate at all. Later in ISO 10646 (1993)[1], the Surrogate Area was carved out of the Private Area, but UCS-2 implementations simply treat them as (single) characters with special properties. This was more or less backward compatible as all corporate uses of the private area used the lower code points and didn't conflict with the surrogates. Finally (in 2000 or 2003) the definition of UCS-2 in ISO 10646 was revised in a backward-incompatible way to exclude surrogates entirely, i.e., nowadays it is a range-restricted version of UTF-16. Footnotes: [1] IIRC, strictly speaking this was done slightly later (1993 or 1994) in an official Amendment to ISO 10646; the Amendment was incorporated into the standard in 2000.
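For reference, the surrogate mechanism under discussion, spelled out as code. This is the standard UTF-16 arithmetic, not specific to any Python implementation: a code point beyond the BMP is split into a high surrogate (0xD800-0xDBFF) and a low surrogate (0xDC00-0xDFFF).

```python
def to_surrogate_pair(cp):
    """Split a supplementary code point (> 0xFFFF) into UTF-16 surrogates."""
    assert cp > 0xFFFF
    cp -= 0x10000                      # 20 bits remain
    return 0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF)

def from_surrogate_pair(hi, lo):
    """Recombine a UTF-16 surrogate pair into a code point."""
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
```

A UCS-2 implementation in the 1993 sense simply stores the two surrogates as two "characters" and knows nothing of the pairing, which is exactly the ambiguity the thread is arguing about.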
Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)
On Mon, Aug 29, 2011 at 12:27 PM, Guido van Rossum gu...@python.org wrote: I wonder if for this particular purpose SWIG isn't the better match. (If SWIG weren't universally hated, even by its original author. :-) SWIG is nice when you control the C/C++ side of the API as well and can tweak it to be SWIG-friendly. I shudder at the idea of using it to wrap arbitrary C++ code, though. That said, the idea of using SWIG to emit Cython code rather than C/API code may be one well worth exploring. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 393 Summer of Code Project
Raymond Hettinger writes: The naming convention for codecs is that the UTF prefix is used for lossless encodings that cover the entire range of Unicode. Sure. The operative word here is codec, not str, though. The first amendment to the original edition of the UCS defined UTF-16, an extension of UCS-2, to represent code points outside the BMP. Since when can s[0] represent a code point outside the BMP, for s a Unicode string in a narrow build? Remember, the UCS-2/narrow vs. UCS-4/wide distinction is *not* about what Python supports vs. the outside world. It's about what the str/unicode type is an array of.
Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)
On Aug 28, 2011, at 7:27 PM, Guido van Rossum wrote: In general, an existing library cannot be called without access to its .h files -- there are probably struct and constant definitions, platform-specific #ifdefs and #defines, and other things in there that affect the linker-level calling conventions for the functions in the library. Unfortunately I don't know a lot about this, but I keep hearing about something called rffi that PyPy uses to call C from RPython: http://readthedocs.org/docs/pypy/en/latest/rffi.html. This has some shortcomings currently, most notably the fact that it needs those .h files (and therefore a C compiler) at runtime, so it's currently a non-starter for code distributed to users. Not to mention the fact that, as you can see, it's not terribly thoroughly documented. But, that ExternalCompilationInfo object looks very promising, since it has fields like includes, libraries, etc. Nevertheless it seems like it's a bit more type-safe than ctypes or cython, and it seems to me that it could cache some of that information that it extracts from header files and store it for later when a compiler might not be around. Perhaps someone with more PyPy knowledge than I could explain whether this is a realistic contender for other Python runtimes?