Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-23 Thread M.-A. Lemburg
On 23.04.2013 17:47, Guido van Rossum wrote: On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote: Just as reminder: we have the general purpose encode()/decode() functions in the codecs module: import codecs r13 = codecs.encode('hello world', 'rot-13') These interface
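
A minimal sketch of the general-purpose codecs functions mentioned in the message above, assuming Python 3. The ROT-13 call is the one quoted in the message; the base64 round trip is an added illustration, not part of the original post.

```python
import codecs

# Text-to-text transform quoted in the message: ROT-13 through the
# generic codecs.encode() entry point.
r13 = codecs.encode('hello world', 'rot-13')
print(r13)

# Assumption for illustration: the same generic functions also drive
# the bytes-to-bytes base64 codec, so no "import base64" is needed.
b64 = codecs.encode(b'hello world', 'base64')
roundtrip = codecs.decode(b64, 'base64')
print(b64, roundtrip)
```

The encoded base64 value ends with a newline, which is why the output is usually stripped before it is compared or embedded elsewhere.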

Re: [Python-Dev] XML DoS vulnerabilities and exploits in Python

2013-02-24 Thread M.-A. Lemburg
Reminds me of the encoding attacks that were possible in earlier versions of Python... you could have e.g. an email processing script run the Python test suite by simply sending a specially crafted email :-) On 21.02.2013 13:04, Christian Heimes wrote: On 21.02.2013 11:32, Antoine

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-20 Thread M.-A. Lemburg
On 20.02.2013 03:37, Paul Moore wrote: On 20 February 2013 00:54, Fred Drake f...@fdrake.net wrote: I'd posit that anything successful will no longer need to be added to the standard library, to boot. Packaging hasn't done well there. distlib may be the exception, though. Packaging tools

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-20 Thread M.-A. Lemburg
On 20.02.2013 00:16, Daniel Holth wrote: On Tue, Feb 19, 2013 at 5:10 PM, M.-A. Lemburg m...@egenix.com wrote: On 19.02.2013 23:01, Daniel Holth wrote: On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg m...@egenix.com wrote: On 19.02.2013 14:40, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 11

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread M.-A. Lemburg
On 17.02.2013 11:11, Nick Coghlan wrote: FYI -- Forwarded message -- From: Nick Coghlan ncogh...@gmail.com Date: Sun, Feb 17, 2013 at 8:10 PM Subject: PEP 426 is now the draft spec for distribution metadata 2.0 To: DistUtils mailing list distutils-...@python.org

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread M.-A. Lemburg
On 19.02.2013 11:28, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 7:37 PM, M.-A. Lemburg m...@egenix.com wrote: On 17.02.2013 11:11, Nick Coghlan wrote: I'm not against modernizing the format, but given that version 1.2 has been out for around 8 years now, without much following, I think we

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread M.-A. Lemburg
On 19.02.2013 14:40, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote: * PEP 426 doesn't include any mention of the egg distribution format, even though it's the most popular distribution format at the moment. It should at least include the location

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread M.-A. Lemburg
On 19.02.2013 14:40, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote: On 19.02.2013 11:28, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 7:37 PM, M.-A. Lemburg m...@egenix.com wrote: On 17.02.2013 11:11, Nick Coghlan wrote: I'm not against modernizing

Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread M.-A. Lemburg
On 19.02.2013 23:01, Daniel Holth wrote: On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg m...@egenix.com wrote: On 19.02.2013 14:40, Nick Coghlan wrote: On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg m...@egenix.com wrote: * PEP 426 doesn't include any mention of the egg distribution format

Re: [Python-Dev] BDFL delegation for PEP 426 + distutils freeze

2013-02-04 Thread M.-A. Lemburg
On 03.02.2013 19:33, Éric Araujo wrote: I vote for removing the distutils is frozen principle. I’ve also been thinking about that. There have been two exceptions to the freeze, for ABI flags in extension module names and for pycache directories. When the stable ABI was added and MvL wanted

Re: [Python-Dev] [Python-checkins] Cron docs@dinsdale /home/docs/build-devguide

2012-12-22 Thread M.-A. Lemburg
On 22.12.2012 21:36, Terry Reedy wrote: On 12/22/2012 1:30 PM, Cron Daemon wrote: abort: error: Connection timed out ___ Python-checkins mailing list python-check...@python.org http://mail.python.org/mailman/listinfo/python-checkins As a

Re: [Python-Dev] [Catalog-sig] accept the wheel PEPs 425, 426, 427

2012-11-13 Thread M.-A. Lemburg
On 13.11.2012 10:51, Martin v. Löwis wrote: On 13.11.12 03:04, Nick Coghlan wrote: On Mon, Oct 29, 2012 at 4:47 AM, Daniel Holth dho...@gmail.com wrote: I think Metadata 1.3 is done. Who would like to czar? (Apologies for the belated reply, it's been a busy few

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff, but in principle *any* codec could be needed. Is it just an heuristic that the codecs needed for 99% of the world are

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 08:42, Nick Coghlan wrote: unicodeobject.c is too big, and should be restructured to make any natural modularity explicit, and provide an easier path for users that want to understand how the unicode implementation works. You can also achieve that goal by structuring the code in

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-25 Thread M.-A. Lemburg
On 25.10.2012 11:18, Maciej Fijalkowski wrote: On Thu, Oct 25, 2012 at 8:57 AM, M.-A. Lemburg m...@egenix.com wrote: On 25.10.2012 08:42, Nick Coghlan wrote: Why are any of these codecs here in unicodeobjectland in the first place? Sure, they're needed so that Python can find its own stuff

Re: [Python-Dev] Split unicodeobject.c into subfiles

2012-10-23 Thread M.-A. Lemburg
On 23.10.2012 10:22, Benjamin Peterson wrote: 2012/10/22 Victor Stinner victor.stin...@gmail.com: Hi, I forked CPython repository to work on my split unicodeobject.c project: http://hg.python.org/sandbox/split-unicodeobject.c The result is 10 files (including the existing unicodeobject.c):

Re: [Python-Dev] Split unicodeobject.c into subfiles?

2012-10-05 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, I would like to split the huge unicodeobject.c file into smaller files. It's just the longest C file of CPython: 14,849 lines. I don't know exactly how to split it, but first I would like to know if you would agree with the idea. Example: -

Re: [Python-Dev] TZ-aware local time

2012-06-06 Thread M.-A. Lemburg
Just to add my 2 cents to this discussion as someone who's worked with mxDateTime for almost 15 years. I think we all agree that users of an application want to input date/time data using their local time (which may very well not be the timezone of the system running the application). On output

Re: [Python-Dev] [RFC] PEP 418: Add monotonic time, performance counter and process time functions

2012-04-15 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, Here is a simplified version of the first draft of the PEP 418. The full version can be read online. http://www.python.org/dev/peps/pep-0418/ The implementation of the PEP can be found in this issue: http://bugs.python.org/issue14428 I post a simplified
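
A short sketch of the three clocks named in the subject line, assuming Python 3.3 or later where PEP 418 was implemented. The workload being timed is made up for the example.

```python
import time

start_mono = time.monotonic()      # clock that cannot go backwards
start_perf = time.perf_counter()   # highest available resolution for short durations
start_cpu = time.process_time()    # system + user CPU time of this process

# Hypothetical workload, used only so there is something to measure.
total = sum(i * i for i in range(100_000))

elapsed_mono = time.monotonic() - start_mono
elapsed_perf = time.perf_counter() - start_perf
elapsed_cpu = time.process_time() - start_cpu

print(f"monotonic: {elapsed_mono:.6f}s")
print(f"perf_counter: {elapsed_perf:.6f}s")
print(f"process_time: {elapsed_cpu:.6f}s")
```

Only differences between two readings of the same clock are meaningful; the absolute values have no defined reference point.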

Re: [Python-Dev] Use QueryPerformanceCounter() for time.monotonic() and/or time.highres()?

2012-04-03 Thread M.-A. Lemburg
Victor Stinner wrote: You seem to have missed the episode where I explained that caching the last value in order to avoid going backwards doesn't work -- at least not if the cached value is internal to the API implementation. Yes, and I can't find it by briefly searching my mail. I haven't

Re: [Python-Dev] Python install layout and the PATH on win32 (Rationale part 1: Regularizing the layout)

2012-03-23 Thread M.-A. Lemburg
VanL wrote: As this has been brought up a couple times in this subthread, I figured that I would lay out the rationale here. There are two proposals on the table: 1) Regularize the install layout, and 2) move the python binary to the binaries directory. This email will deal with the

Re: [Python-Dev] Python install layout and the PATH on win32

2012-03-21 Thread M.-A. Lemburg
Lindberg, Van wrote: Mark, MAL, Martin, Tarek, Could you comment on this? This is in the context of changing the name of the 'Scripts' directory on windows to 'bin'. Éric brings up the point (explained more below) that if we make this change, packages made/installed the new packaging

Re: [Python-Dev] Add a frozendict builtin type

2012-02-28 Thread M.-A. Lemburg
Victor Stinner wrote: See also the PEP 351. I read the PEP and the email explaining why it was rejected. Just to be clear: the PEP 351 tries to freeze an object, trying to convert a mutable or immutable object to an immutable object. Whereas my frozendict proposition doesn't convert

Re: [Python-Dev] Add a frozendict builtin type

2012-02-28 Thread M.-A. Lemburg
Steven D'Aprano wrote: M.-A. Lemburg wrote: Victor Stinner wrote: See also the PEP 351. I read the PEP and the email explaining why it was rejected. Just to be clear: the PEP 351 tries to freeze an object, trying to convert a mutable or immutable object to an immutable object. Whereas my

Re: [Python-Dev] accept string in a2b and base64?

2012-02-21 Thread M.-A. Lemburg
Nick Coghlan wrote: The reason Python 2's implicit str-unicode conversions are so problematic isn't just because they're implicit: it's because they effectively assume *latin-1* as the encoding on the 8-bit str side. The implicit conversion in Python2 only works with ASCII content, pretty much

Re: [Python-Dev] PEP: New timestamp formats

2012-02-02 Thread M.-A. Lemburg
Nick Coghlan wrote: On Thu, Feb 2, 2012 at 10:16 PM, Victor Stinner Add an argument to change the result type - There should also be a description of the "set a boolean flag to request high precision output" approach. You mean something like:

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread M.-A. Lemburg
Frank Sievertsen wrote: Hello, I'd still prefer to see a randomized hash()-function (at least for 3.3). But to protect against the attacks it would be sufficient to use randomization for collision resolution in dicts (and sets). What if we use a second (randomized) hash-function in case

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread M.-A. Lemburg
Mark Shannon wrote: Michael Foord wrote: Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included:

Re: [Python-Dev] PEP 393 close to pronouncement

2011-10-11 Thread M.-A. Lemburg
Victor Stinner wrote: Given that I've been working on and maintaining the Python Unicode implementation actively or by providing assistance for almost 12 years now, I've also thought about whether it's still worth the effort. Thanks for your huge work on Unicode, Marc-Andre! Thanks. I

Re: [Python-Dev] PEP 393 close to pronouncement

2011-09-28 Thread M.-A. Lemburg
Guido van Rossum wrote: Given the feedback so far, I am happy to pronounce PEP 393 as accepted. Martin, congratulations! Go ahead and mark it as Accepted. (But please do fix up the small nits that Victor reported in his earlier message.) I've been working on feedback for the last few days,

Re: [Python-Dev] Not able to do unregister a code

2011-09-15 Thread M.-A. Lemburg
Jai Sharma wrote: Hi, I am facing a memory leak issue with codecs. I made my own ABC class and registered it with codecs. import codecs codecs.register(ABC) but I am not able to remove ABC from memory. Is there any alternative way to do that? The ABC codec search function gets added to
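
A hedged sketch of the registration step discussed above. The search function abc_search and the codec name abc_demo are hypothetical names invented for this example; the codec simply reuses UTF-8. codecs.unregister() did not exist when the message was written (it was added in Python 3.10), so the sketch guards its use.

```python
import codecs

def abc_search(name):
    """Hypothetical search function: serves one made-up codec name."""
    if name == "abc_demo":
        utf8 = codecs.lookup("utf-8")
        # Reuse the UTF-8 encode/decode functions under the new name.
        return codecs.CodecInfo(encode=utf8.encode, decode=utf8.decode,
                                name="abc_demo")
    return None  # tell the registry to try the next search function

# Registration appends the function to the registry's search path,
# which keeps a reference to it for the lifetime of the interpreter.
codecs.register(abc_search)
encoded = "héllo".encode("abc_demo")

# Python 3.10+ only: remove the search function again.
can_unregister = hasattr(codecs, "unregister")
if can_unregister:
    codecs.unregister(abc_search)
```

On interpreters without codecs.unregister() the search function stays registered, which matches the behaviour the original poster ran into.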

Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-29 Thread M.-A. Lemburg
Guido van Rossum wrote: On Sun, Aug 28, 2011 at 11:23 AM, Stefan Behnel stefan...@behnel.de wrote: Hi, sorry for hooking in here with my usual Cython bias and promotion. When the question comes up what a good FFI for Python should look like, it's an obvious reaction from my part to throw

Re: [Python-Dev] PEP 393 review

2011-08-29 Thread M.-A. Lemburg
Martin v. Löwis wrote: tl;dr: PEP-393 reduces the memory usage for strings of a very small Django app from 7.4MB to 4.4MB, all other objects taking about 1.9MB. On 26.08.2011 16:55, Guido van Rossum wrote: It would be nice if someone wrote a test to roughly verify these numbers, e.g. by

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread M.-A. Lemburg
Stefan Behnel wrote: Isaac Morland, 26.08.2011 04:28: On Thu, 25 Aug 2011, Guido van Rossum wrote: I'm not sure what should happen with UTF-8 when it (in flagrant violation of the standard, I presume) contains two separately-encoded surrogates forming a valid surrogate pair; probably whatever

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread M.-A. Lemburg
Guido van Rossum wrote: I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was this is fixed in the regex module [by Matthew Barnett]. I started replying that I thought that we should fix the bugs in the re module (i.e.,

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread M.-A. Lemburg
Guido van Rossum wrote: On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg m...@egenix.com wrote: Guido van Rossum wrote: I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was this is fixed in the regex module [by Matthew Barnett

Re: [Python-Dev] Status of the PEP 400? (deprecate codecs.StreamReader/StreamWriter)

2011-07-29 Thread M.-A. Lemburg
Victor Stinner wrote: On 28/07/2011 11:28, Victor Stinner wrote: Please do keep the original implementation around (e.g. renamed to codecs.open_stream()), though, so that it's still possible to get easy-to-use access to codec StreamReader/Writers. I will add your alternative to the PEP

Re: [Python-Dev] Status of the PEP 400? (deprecate codecs.StreamReader/StreamWriter)

2011-07-28 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, Three weeks ago, I posted a draft on my PEP on this mailing list. I tried to include all remarks you made, and the PEP is now online: http://www.python.org/dev/peps/pep-0400/ It's now unclear to me if the PEP will be accepted or rejected. I don't know what

Re: [Python-Dev] Draft PEP: Deprecate codecs.StreamReader and codecs.StreamWriter

2011-07-07 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, Last May, I proposed to deprecate open() function, StreamWriter and StreamReader classes of the codecs module. I agreed to keep open() after the discussion on python-dev. Here is a more complete proposal as a PEP. It is a draft and I expect a lot of comments

Re: [Python-Dev] open(): set the default encoding to 'utf-8' in Python 3.3?

2011-06-29 Thread M.-A. Lemburg
Victor Stinner wrote: On Tuesday, 28 June 2011 at 16:02 +0200, M.-A. Lemburg wrote: How about a more radical change: have open() in Py3 default to opening the file in binary mode, if no encoding is given (even if the mode doesn't include 'b') ? I tried your suggested change: Python doesn't

Re: [Python-Dev] open(): set the default encoding to 'utf-8' in Python 3.3?

2011-06-29 Thread M.-A. Lemburg
Victor Stinner wrote: On Wednesday, 29 June 2011 at 10:18 +0200, M.-A. Lemburg wrote: Victor Stinner wrote: On Tuesday, 28 June 2011 at 16:02 +0200, M.-A. Lemburg wrote: How about a more radical change: have open() in Py3 default to opening the file in binary mode, if no encoding is given

Re: [Python-Dev] open(): set the default encoding to 'utf-8' in Python 3.3?

2011-06-28 Thread M.-A. Lemburg
Victor Stinner wrote: In Python 2, open() opens the file in binary mode (e.g. file.readline() returns a byte string). codecs.open() opens the file in binary mode by default, you have to specify an encoding name to open it in text mode. In Python 3, open() opens the file in text mode by
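
To make the Python 3 side of that comparison concrete, here is a small sketch assuming Python 3. The file name demo.txt is a placeholder. Binary mode returns bytes, while text mode decodes to str using the given encoding.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Write known UTF-8 bytes so both reads see the same data.
    with open(path, "wb") as f:
        f.write("naïve\n".encode("utf-8"))

    # Binary mode: readline() returns a bytes object, no decoding.
    with open(path, "rb") as f:
        raw = f.readline()

    # Text mode with an explicit encoding: readline() returns str.
    with open(path, "r", encoding="utf-8") as f:
        text = f.readline()

print(type(raw).__name__, type(text).__name__)
```

Passing encoding explicitly sidesteps the locale-dependent default that the thread is debating.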

[Python-Dev] Python language summit on ustream.tv

2011-06-16 Thread M.-A. Lemburg
Dear Python Developers, for the upcoming language summit at EuroPython, I'd like to try out whether streaming such meetings would work. I'll setup a webcam and stream the event live to a private channel on ustream.tv. These are the details in case you want to watch: URL:

Re: [Python-Dev] cpython: Remove some extraneous parentheses and swap the comparison order to

2011-06-07 Thread M.-A. Lemburg
Georg Brandl wrote: On 06/07/11 05:20, brett.cannon wrote: http://hg.python.org/cpython/rev/fc282e375703 changeset: 70695:fc282e375703 user: Brett Cannon br...@python.org date: Mon Jun 06 20:20:36 2011 -0700 summary: Remove some extraneous parentheses and swap the

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-27 Thread M.-A. Lemburg
Victor Stinner wrote: On Wednesday, 25 May 2011 at 15:43 +0200, M.-A. Lemburg wrote: For UTF-16 it would e.g. make sense to always read data in blocks with even sizes, removing the trial-and-error decoding and extra buffering currently done by the base classes. For UTF-32, the blocks should

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-27 Thread M.-A. Lemburg
Victor Stinner wrote: On Friday, 27 May 2011 10:17:29, M.-A. Lemburg wrote: I am still -1 on deprecating the StreamReader/Writer parts of the codec APIs. I've given numerous reasons on why these are useful, what their intention is, why they were added to Python 1.6. codecs.open() now

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-27 Thread M.-A. Lemburg
Victor Stinner wrote: On Friday, 27 May 2011 15:42:10, M.-A. Lemburg wrote: If we'd go by your reasoning for deprecating and eventually removing parts of the stdlib or Python's subsystems, we'll end up with a barebone version of Python. That's not what we want and it's not what our users

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-25 Thread M.-A. Lemburg
Walter Dörwald wrote: On 24.05.11 12:58, Victor Stinner wrote: On Tuesday, 24 May 2011 at 12:42 +0200, Łukasz Langa wrote: Message written by Walter Dörwald on 2011-05-24, at 12:16: I don't see which usecase is not covered by TextIOWrapper. But I know some cases which are not

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-25 Thread M.-A. Lemburg
Victor Stinner wrote: On Wednesday, 25 May 2011 at 11:38 +0200, M.-A. Lemburg wrote: You are missing the point: we have StreamReader and StreamWriter APIs on codecs to allow each codecs to implement more efficient ways of encoding and decoding streams. Examples of such optimizations

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-24 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, In Python 2, codecs.open() is the best way to read and/or write files using Unicode. But in Python 3, open() is preferred with its fast io module. I would like to deprecate codecs.open() because it can be replaced by open() and io.TextIOWrapper. I would like your

Re: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader

2011-05-24 Thread M.-A. Lemburg
Victor Stinner wrote: On Tuesday, 24 May 2011 at 10:03 +0200, M.-A. Lemburg wrote: Please read PEP 100 regarding StreamReader and StreamWriter. Those codecs parts were explicitly designed to be stateful, unlike the stateless encoder/decoder methods. Yes, it is possible to implement stateful
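
The stateful-versus-stateless distinction can be sketched as follows. This is an illustration only, and it uses the incremental decoder API rather than StreamReader itself, because it shows the buffering of a split multi-byte sequence in the fewest lines.

```python
import codecs

data = "é".encode("utf-8")            # two bytes: b'\xc3\xa9'
first, second = data[:1], data[1:]    # split in the middle of the character

# Stateless decoding sees an incomplete sequence and fails.
try:
    first.decode("utf-8")
    stateless_ok = True
except UnicodeDecodeError:
    stateless_ok = False

# A stateful decoder keeps the dangling byte until the rest arrives.
decoder = codecs.getincrementaldecoder("utf-8")()
part1 = decoder.decode(first)               # nothing complete yet
part2 = decoder.decode(second, final=True)  # sequence completed

print(stateless_ok, repr(part1), repr(part2))
```

A stream reader has to solve the same problem whenever a read() call ends in the middle of a character, which is why the stream classes carry state between calls.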

Re: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec spelling issues by just using the utf-8 default.

2011-05-05 Thread M.-A. Lemburg
Raymond Hettinger wrote: On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote: 2011/5/5 raymond.hettinger python-check...@python.org: http://hg.python.org/cpython/rev/1a56775c6e54 changeset: 69857:1a56775c6e54 branch: 3.2 parent: 69855:97a4855202b8 user: Raymond

Re: [Python-Dev] Convert Py_Buffer to Py_UNICODE

2011-05-02 Thread M.-A. Lemburg
Sijin Joseph wrote: Hi - I am working on a patch where I have an argument that can either be a unicode string or binary data, I parse the argument using the PyArg_ParseTuple method using the s* format specification and get a Py_Buffer. I now need to convert this Py_Buffer object to a

Re: [Python-Dev] Proposal for a common benchmark suite

2011-04-29 Thread M.-A. Lemburg
Mark Shannon wrote: Maciej Fijalkowski wrote: On Thu, Apr 28, 2011 at 11:10 PM, Stefan Behnel stefan...@behnel.de wrote: M.-A. Lemburg, 28.04.2011 22:23: Stefan Behnel wrote: DasIch, 28.04.2011 20:55: the CPython benchmarks have an extensive set of microbenchmarks in the pybench package

Re: [Python-Dev] Proposal for a common benchmark suite

2011-04-29 Thread M.-A. Lemburg
DasIch wrote: Given those facts I think including pybench is a mistake. It does not allow for a fair or meaningful comparison between implementations which is one of the things the suite is supposed to be used for in the future. This easily leads to misinterpretation of the results from

Re: [Python-Dev] Proposal for a common benchmark suite

2011-04-28 Thread M.-A. Lemburg
Stefan Behnel wrote: DasIch, 28.04.2011 20:55: the CPython benchmarks have an extensive set of microbenchmarks in the pybench package Try not to care too much about pybench. There is some value in it, but some of its microbenchmarks are also tied to CPython's interpreter behaviour. For

Re: [Python-Dev] Drop OS/2 and VMS support?

2011-04-19 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, I asked one year ago if we should drop OS/2 support: Andrew MacIntyre, our OS/2 maintainer, answered: http://mail.python.org/pipermail/python-dev/2010-April/099477.html Extract: The 3.x branch needs quite a bit of work on OS/2 to deal with Unicode, as OS/2 was

Re: [Python-Dev] Drop OS/2 and VMS support?

2011-04-19 Thread M.-A. Lemburg
Doug Hellmann wrote: On Apr 19, 2011, at 10:36 AM, M.-A. Lemburg wrote: Victor Stinner wrote: Hi, I asked one year ago if we should drop OS/2 support: Andrew MacIntyre, our OS/2 maintainer, answered: http://mail.python.org/pipermail/python-dev/2010-April/099477.html Extract: The 3.x

Re: [Python-Dev] Replace useless %.100s by %s in PyErr_Format()

2011-03-30 Thread M.-A. Lemburg
Victor Stinner wrote: On Thursday, 24 March 2011 at 13:22 +0100, M.-A. Lemburg wrote: BTW: Why do you think that %.100s is not supported in PyErr_Format() in Python 2.x ? PyString_FromFormatV() does support this. The change to use Unicode error strings introduced the problem, since

Re: [Python-Dev] Copyright notices

2011-03-21 Thread M.-A. Lemburg
Nadeem Vawda wrote: I was wondering what the policy is regarding copyright notices and license boilerplate text at the top of source files. I am currently rewriting the bz2 module (see http://bugs.python.org/issue5863), splitting the existing Modules/bz2module.c into Modules/_bz2module.c

Re: [Python-Dev] Improvements for Porting C Extension from 2 to 3

2011-03-03 Thread M.-A. Lemburg
Sümer Cip wrote: Hi, While porting a C extension from 2 to 3, I realized that there are some general cases which can be automated. For example, for my specific application (yappi - http://code.google.com/p/yappi/), all I need to do is following things: 1) define PyModuleDef 2) change

Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-24 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Feb 23, 2011 at 6:32 PM, M.-A. Lemburg m...@egenix.com wrote: Alexander Belopolsky wrote: .. In what sense is Latin-1 the official name? The IANA charset registry has the following listing Name: ISO_8859-1:1987

Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Feb 23, 2011 at 4:07 PM, Guido van Rossum gu...@python.org wrote: I'm guessing that one of these encoding names is recognized by the C code while the other one takes the slow path via the aliasing code. This is absolutely right. In fact I am going to
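
A small sketch, assuming CPython 3, showing that the different spellings discussed in this thread end up at the same codec. It does not show which spelling takes the C fast path; it only demonstrates the alias resolution done by the codec registry.

```python
import codecs

spellings = ["latin-1", "latin1", "iso-8859-1"]

# Each spelling is normalized and looked up through the alias table;
# the resulting CodecInfo objects report one canonical name.
resolved = [codecs.lookup(name).name for name in spellings]

# All spellings decode the same byte to the same character.
decoded = [b"\xe9".decode(name) for name in spellings]

print(resolved)
print(decoded)
```

Because every spelling resolves to one codec, the choice between them only affects lookup cost, not the result.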

Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote: .. Latin-1 is the official name and the one used internally by Python, so it would be good to have the test suite and Python code in general to use that variant of the name (just as utf-8

Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Feb 23, 2011 at 4:54 PM, M.-A. Lemburg m...@egenix.com wrote: .. Yet 108 for the correct name, so I can't follow your statement that the wrong variant is used more often. Hmm, your grepping skills are probably better than mine. I get $ grep -iw latin

Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote: .. Latin-1 is the official name and the one used internally by Python, In what sense is Latin-1 the official name? The IANA charset registry has the following listing Name: ISO_8859-1

Re: [Python-Dev] API bloat

2011-02-10 Thread M.-A. Lemburg
Mark Shannon wrote: Nick Coghlan wrote: On Thu, Feb 10, 2011 at 8:16 PM, Mark Shannon ma...@dcs.gla.ac.uk wrote: Doing a search for the regex: PyAPI_FUNC\([^)]*\) *Py in .h files, which should match API functions (functions starting _Py are excluded) gives the following result: Version
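
The search described in the message can be sketched in Python as below. The regular expression is the one quoted in the thread; the header text is invented sample input (the PyExample_* names are hypothetical), not the contents of any real CPython header.

```python
import re

# Regex quoted in the message: a PyAPI_FUNC(...) declaration whose
# function name starts with "Py" (names starting "_Py" do not match).
API_FUNC = re.compile(r"PyAPI_FUNC\([^)]*\) *Py")

# Made-up header text for illustration only.
sample_header = """\
PyAPI_FUNC(PyObject *) PyList_New(Py_ssize_t size);
PyAPI_FUNC(int) _PyExample_Private(void);
PyAPI_FUNC(int) PyExample_Public(void);
static int helper(void);
"""

public = [line for line in sample_header.splitlines()
          if API_FUNC.search(line)]
print(len(public))
for line in public:
    print(line)
```

Running the same pattern over the .h files of each release and counting the matches gives the kind of per-version totals the message goes on to list.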

Re: [Python-Dev] API bloat

2011-02-10 Thread M.-A. Lemburg
Mark Shannon wrote: M.-A. Lemburg wrote: Mark Shannon wrote: Nick Coghlan wrote: On Thu, Feb 10, 2011 at 8:16 PM, Mark Shannon ma...@dcs.gla.ac.uk wrote: Doing a search for the regex: PyAPI_FUNC\([^)]*\) *Py in .h files, which should match API functions (functions starting _Py

Re: [Python-Dev] API bloat

2011-02-09 Thread M.-A. Lemburg
Mark Shannon wrote: The Unicode Exception Objects section is new and seemingly redundant: http://docs.python.org/py3k/c-api/exceptions.html#unicode-exception-objects Should this be in the public API? Those functions have been in the public API since we introduced Unicode callback error handlers.

Re: [Python-Dev] Python Unit Tests

2011-02-08 Thread M.-A. Lemburg
Wesley Mesquita wrote: Hi all, I am starting to explore the python 3k core development environment. So, sorry in advance for any mistakes, but I really don't know what is the best list to post this, since it is not a use of python issue, and probably is not a dev issue, it is more like a dev env

Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-25 Thread M.-A. Lemburg
I'll comment more on this later this week... From my first impression, I'm not too thrilled by the prospect of making the Unicode implementation more complicated by having three different representations on each object. I also don't see how this could save a lot of memory. As an example take a

Re: [Python-Dev] [Python-checkins] r88127 - in python/branches/py3k/Misc: README.AIX README.OpenBSD cheatsheet

2011-01-20 Thread M.-A. Lemburg
brett.cannon wrote: Author: brett.cannon Date: Thu Jan 20 20:34:35 2011 New Revision: 88127 Log: Remove some outdated files from Misc. Removed: python/branches/py3k/Misc/README.AIX Are you sure that the AIX README is outdated ? It explains some of the details of why there are

Re: [Python-Dev] Tools/unicode

2011-01-03 Thread M.-A. Lemburg
Michael Foord wrote: On 03/01/2011 15:39, Alexander Belopolsky wrote: On Mon, Jan 3, 2011 at 10:33 AM, Michael Foord mich...@voidspace.org.uk wrote: .. If someone knows if this tool is still used/useful then please let us know how the description should best be updated. If there are no

Re: [Python-Dev] The fate of transform() and untransform() methods

2010-12-09 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Fri, Dec 3, 2010 at 1:05 PM, Guido van Rossum gu...@python.org wrote: On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com wrote: .. I believe MAL's thought was that the addition of these methods had been approved pre-moratorium, but I don't

Re: [Python-Dev] The fate of transform() and untransform() methods

2010-12-09 Thread M.-A. Lemburg
Michael Foord wrote: On 09/12/2010 15:03, M.-A. Lemburg wrote: Alexander Belopolsky wrote: On Fri, Dec 3, 2010 at 1:05 PM, Guido van Rossum gu...@python.org wrote: On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com wrote: .. I believe MAL's thought

Re: [Python-Dev] The fate of transform() and untransform() methods

2010-12-09 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Thu, Dec 9, 2010 at 10:03 AM, M.-A. Lemburg m...@egenix.com wrote: Alexander Belopolsky wrote: .. The ticket that introduced the change is currently closed [3] even though the last message suggests that at least part of the change needs to be reverted

Re: [Python-Dev] transform() and untransform() methods, and the codec registry

2010-12-06 Thread M.-A. Lemburg
Guido van Rossum wrote: On Fri, Dec 3, 2010 at 9:58 AM, R. David Murray rdmur...@bitdance.com wrote: On Fri, 03 Dec 2010 11:14:56 -0500, Alexander Belopolsky alexander.belopol...@gmail.com wrote: On Fri, Dec 3, 2010 at 10:11 AM, R. David Murray rdmur...@bitdance.com wrote: .. Please also

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-03 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Thu, Dec 2, 2010 at 5:58 PM, M.-A. Lemburg m...@egenix.com wrote: .. I will change my mind on this issue when you present a machine-readable file with Arabic-Indic numerals and a program capable of reading it and show that this program uses the same number

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Martin v. Löwis wrote: Now, one may wonder what precisely a possibly signed floating point number is, but most likely, this refers to floatnumber ::= pointfloat | exponentfloat pointfloat ::= [intpart] fraction | intpart . exponentfloat ::= (intpart | pointfloat) exponent intpart

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Martin v. Löwis wrote: [...] For direct entry by an interactive user, yes. Why are some people in this discussion thinking only of direct entry by an interactive user? Ultimately, somebody will have entered the data. I don't think you really believe that all data processed by a computer was

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think the basic premise is: you won't find strings in the wild that

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Thu, Dec 2, 2010 at 4:14 PM, M.-A. Lemburg m...@egenix.com wrote: .. Have you tried Google ? I tried Google and I could not find any plain text or HTML file that would use Arabic-Indic numerals. What was interesting, though that a search for quran unicode

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Terry Reedy wrote: On 11/29/2010 10:19 AM, M.-A. Lemburg wrote: Nick Coghlan wrote: On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg m...@egenix.com wrote: If we would go down that road, we would also have to disable other Unicode features based on locale, e.g. whether to apply non-ASCII case

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-02 Thread M.-A. Lemburg
Eric Smith wrote: On 12/2/2010 5:43 PM, M.-A. Lemburg wrote: Eric Smith wrote: The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module. I agree with everything Martin says here. I think

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-01 Thread M.-A. Lemburg
Terry Reedy wrote: On 11/30/2010 10:05 AM, Alexander Belopolsky wrote: My general answers to the questions you have raised are as follows: 1. Each new feature release should use the latest version of the UCD as of the first beta release (or perhaps a week or so before). New chars are new

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-01 Thread M.-A. Lemburg
Martin v. Löwis wrote: Am 30.11.2010 21:24, schrieb Ben Finney: haiyang kang corn...@gmail.com writes: I think it is a little ugly to have code like this: num = float(一.一), expected result is: num = 1.1 That's a straw man, though. The string need not be a literal in the program; it can
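
The example quoted in this message can be checked directly. The following is a minimal sketch, assuming a recent CPython 3 (behaviour at the time of the thread is not shown in the snippet): the Han character for "one" has a numeric value in the Unicode Character Database but is not a decimal digit, so float() is expected to reject it. The variable names are illustrative only.

```python
import unicodedata

han_one = "\u4e00"  # CJK ideograph meaning "one"

# The UCD gives this character a numeric value ...
numeric_value = unicodedata.numeric(han_one)

# ... but it is not classified as a decimal digit.
is_decimal_digit = han_one.isdecimal()

# float() only understands decimal digits, so this string is rejected.
try:
    float(han_one + "." + han_one)
    parsed = True
except ValueError:
    parsed = False

print(numeric_value, is_decimal_digit, parsed)
```

The distinction between "numeric" and "decimal digit" characters is what separates this case from the Arabic-Indic digits discussed elsewhere in the thread.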

Re: [Python-Dev] Python and the Unicode Character Database

2010-12-01 Thread M.-A. Lemburg
Terry Reedy wrote: On 11/30/2010 3:23 AM, Stephen J. Turnbull wrote: I see no reason not to make a similar promise for numeric literals. I see no good reason to allow compatibility full-width Japanese ASCII numerals or Arabic cursive numerals in for i in range(...) for example. I do not

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-29 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg m...@egenix.com wrote: .. I don't see why the language spec should limit the wealth of number formats supported by float(). The Language Spec (whatever it is) should not, but hopefully the Library Reference should

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-29 Thread M.-A. Lemburg
Nick Coghlan wrote: On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull step...@xemacs.org wrote: I agree that Python should make it easy for the programmer to get numerical values of native numeric strings, but it's not at all clear to me that there is any point to having float() recognize

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-29 Thread M.-A. Lemburg
Nick Coghlan wrote: On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg m...@egenix.com wrote: If we would go down that road, we would also have to disable other Unicode features based on locale, e.g. whether to apply non-ASCII case mappings, what to consider whitespace, etc. We don't do

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-29 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Mon, Nov 29, 2010 at 2:22 AM, Martin v. Löwis mar...@v.loewis.de wrote: The former ensures that literals in code are always readable; the latter allows users to enter numbers in their own number system. How could that be a bad thing? It's YAGNI, feature bloat.

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-28 Thread M.-A. Lemburg
Martin v. Löwis wrote: float('١٢٣٤.٥٦') 1234.56 I think it's a bug that this works. The definition of the float builtin says Convert a string or a number to floating point. If the argument is a string, it must contain a possibly signed decimal or floating point number, possibly
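
The behaviour under discussion can be reproduced with a short sketch. It assumes a CPython 3 interpreter and that the separator in the quoted string is an ASCII full stop; the digits are written as escapes so they display the same in any editor. The names `arabic_indic`, `digit_values` and `value` are illustrative, not from the thread.

```python
import unicodedata

# ARABIC-INDIC DIGIT ONE through SIX with an ASCII "." as the decimal point.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# Each digit carries a decimal value in the Unicode Character Database.
digit_values = [unicodedata.decimal(ch) for ch in arabic_indic if ch != "."]

# float() accepts these decimal digits and parses the string as a number.
value = float(arabic_indic)

print(digit_values, value)
```

Whether this acceptance is a feature or a bug is exactly what the thread debates; the sketch only shows what the interpreter does.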

Re: [Python-Dev] Python and the Unicode Character Database

2010-11-28 Thread M.-A. Lemburg
Alexander Belopolsky wrote: Two recently reported issues brought into light the fact that Python language definition is closely tied to character properties maintained by the Unicode Consortium. [1,2] For example, when Python switches to Unicode 6.0.0 (planned for the upcoming 3.2 release),

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-25 Thread M.-A. Lemburg
Terry Reedy wrote: On 11/24/2010 3:06 PM, Alexander Belopolsky wrote: Any non-trivial text processing is likely to be broken in presence of surrogates. Producing them on input is just trading known issue for an unknown one. Processing surrogate pairs in python code is hard. Software that

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-25 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull step...@xemacs.org wrote: .. I note that an opinion has been raised on this thread that if we want compressed internal representation for strings, we should use UTF-8. I tend to agree, but UTF-8 has been

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-24 Thread M.-A. Lemburg
Alexander Belopolsky wrote: To conclude, I feel that rather than trying to fully support non-BMP characters as surrogate pairs in narrow builds, we should make it easier for application developers to avoid them. I don't understand what you're after here. Programmers can easily avoid them by

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote: On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger raymond.hettin...@gmail.com wrote: .. Any explanation we give users needs to let them know two things: * that we cover the entire range of unicode not just BMP * that sometimes len(chr(i)) is one and sometimes two

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-22 Thread M.-A. Lemburg
Martin, it is really irrelevant whether the standards have decided to no longer use the terms UCS-2 and UCS-4 in their latest standard documents. The definitions still stand (just like Unicode 2.0 is still a valid standard, even if it's ten years old): * UCS-2 is defined as Universal Character

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-22 Thread M.-A. Lemburg
Raymond Hettinger wrote: Any explanation we give users needs to let them know two things: * that we cover the entire range of unicode not just BMP * that sometimes len(chr(i)) is one and sometimes two The term UCS-2 is a complete communications failure in that regard. If someone looks up

Re: [Python-Dev] len(chr(i)) = 2?

2010-11-19 Thread M.-A. Lemburg
Victor Stinner wrote: Hi, On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote: I was recently surprised to learn that chr(i) can produce a string of length 2 in python 3.x. Yes, but only on narrow build. Eg. Debian and Ubuntu compile Python 3.1 in wide mode (sys.maxunicode ==
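
A rough way to see the narrow/wide distinction is to compare len(chr(i)) with the number of UTF-16 code units the character needs. This is a sketch under the assumption that it is run on some CPython 3; on Python 3.3 and later sys.maxunicode is always 0x10FFFF, so the narrow-build branch is shown only for comparison with the interpreters discussed in the thread.

```python
import sys

cp = 0x10000  # first code point outside the Basic Multilingual Plane
ch = chr(cp)

# UTF-16 needs a surrogate pair (two 16-bit code units) for this code point.
utf16_units = len(ch.encode("utf-16-le")) // 2

# A narrow build (sys.maxunicode == 0xFFFF) stored the pair as two items;
# a wide build, or any Python 3.3+, stores one item per code point.
expected_len = 1 if sys.maxunicode > 0xFFFF else 2

print(hex(sys.maxunicode), len(ch), utf16_units)
```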
