Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Barry Warsaw
I was away for the weekend and am struggling to catch up on my email. Since I haven't digested this entire thread, I'll refrain for the moment from giving my opinion; however, this comment jumped out at me. On Aug 22, 2008, at 9:42 AM, Facundo

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread İsmail Dönmez
Hi, On Thu, Aug 21, 2008 at 23:35, Guido van Rossum wrote: I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode Consortium. He would like to see improved Unicode support for Python. (Well duh. :-) On

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread M.-A. Lemburg
On 2008-08-22 03:25, Guido van Rossum wrote: On Thu, Aug 21, 2008 at 2:26 PM, M.-A. Lemburg wrote: On 2008-08-21 22:35, Guido van Rossum wrote: I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Guido van Rossum
2008/8/25 M.-A. Lemburg: I would really like to see more Unicode support in Python, e.g. for collation, compression, indexing based on graphemes and code points, better support for special casing situations (to cover e.g. the dotted vs. non-dotted i in the Turkish scripts),
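The Turkish special-casing problem mentioned here can be illustrated with a small sketch. Python's built-in `str.upper()` and `str.lower()` are locale-independent, so they map `i` to `I` and back, whereas Turkish pairs dotted `i` with `İ` (U+0130) and dotless `ı` (U+0131) with `I`. The helper names `turkish_upper` and `turkish_lower` below are hypothetical and not part of any Python module; they only show one possible workaround, not what the thread proposed.

```python
# Hypothetical helpers (not a standard-library API): pre-map the four
# Turkish "i" letters before falling back to the default case mapping.

DOTTED_CAPITAL_I = "\u0130"   # İ  LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # ı  LATIN SMALL LETTER DOTLESS I


def turkish_upper(text):
    """Uppercase text using the Turkish i/İ and ı/I pairings."""
    # Map dotted i to dotted capital İ and dotless ı to plain I first,
    # then let the default mapping handle every other character.
    return text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()


def turkish_lower(text):
    """Lowercase text using the Turkish i/İ and ı/I pairings."""
    # Map dotted capital İ to plain i and plain I to dotless ı first.
    return text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()


if __name__ == "__main__":
    print("istanbul".upper())          # default, locale-independent mapping
    print(turkish_upper("istanbul"))   # Turkish-aware mapping
    print(turkish_lower("ISPARTA"))
```

A real solution would need locale-aware (tailored) case mapping from the Unicode special-casing data; the string replacement above only covers this one language-specific rule.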

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Terry Reedy
Guido van Rossum wrote: 2008/8/25 M.-A. Lemburg: I would really like to see more Unicode support in Python, e.g. for collation, compression, indexing based on graphemes and code points, better support for special casing situations (to cover

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Barry Warsaw
On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote: http://www.unicode.org/versions/Unicode5.1.0/ Unicode 5.1.0 contains over 100,000 characters, and provides significant additions and improvements... to existing features, including new files and

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Benjamin Peterson
On Mon, Aug 25, 2008 at 12:34 PM, Barry Warsaw wrote: On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote: http://www.unicode.org/versions/Unicode5.1.0/ Unicode 5.1.0 contains over 100,000 characters, and provides significant

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Fredrik Lundh
Barry Warsaw wrote: I agree. This seriously feels like new, potentially high risk code to be adding this late in the game. The BDFL can always override, but unless someone is really convincing that this is low risk high benefit, I'd vote no for 2.6/3.0. at least two Unicode experts have

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread M.-A. Lemburg
On 2008-08-25 19:34, Barry Warsaw wrote: On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote: http://www.unicode.org/versions/Unicode5.1.0/ Unicode 5.1.0 contains over 100,000 characters, and provides significant additions and improvements... to existing features, including new files and

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Brett Cannon
On Mon, Aug 25, 2008 at 10:56 AM, Guido van Rossum wrote: On Mon, Aug 25, 2008 at 10:52 AM, Benjamin Peterson wrote: On Mon, Aug 25, 2008 at 12:34 PM, Barry Warsaw wrote: On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote:

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Barry Warsaw
On Aug 25, 2008, at 1:53 PM, Fredrik Lundh wrote: Barry Warsaw wrote: I agree. This seriously feels like new, potentially high risk code to be adding this late in the game. The BDFL can always override, but unless someone is really

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Barry Warsaw
On Aug 25, 2008, at 2:15 PM, M.-A. Lemburg wrote: Guido's request was just for updating the Unicode database with the data from 5.1 - without adding new support for properties or changing the interfaces. See this page for a list of changes to the

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Fredrik Lundh
Barry Warsaw wrote: You don't mean the experts claimed they weren't important, right? Unimportant changes definitely don't need to go in now <wink>. Well, at least Guido managed to figure out what I was trying to say ;-) /F

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Barry Warsaw
On Aug 25, 2008, at 3:17 PM, Fredrik Lundh wrote: Barry Warsaw wrote: You don't mean the experts claimed they weren't important, right? Unimportant changes definitely don't need to go in now <wink>. Well, at least Guido managed to figure out

Re: [Python-Dev] Unicode 5.1.0

2008-08-25 Thread Steve Holden
Barry Warsaw wrote: On Aug 21, 2008, at 6:30 PM, Terry Reedy wrote: http://www.unicode.org/versions/Unicode5.1.0/ Unicode 5.1.0 contains over 100,000 characters, and provides significant additions and improvements... to existing features, including new files and upgrades to existing files.

Re: [Python-Dev] Unicode 5.1.0

2008-08-24 Thread Martin v. Löwis
is the suggestion to *replace* the 4.1.0 database with a 5.1.0 database, or to add yet another database in that module? I would replace it. (how's the 3.2/4.1 dual support implemented? The compiler needs data files for all supported versions, with old_versions listing the, well, old

Re: [Python-Dev] Unicode 5.1.0

2008-08-24 Thread Martin v. Löwis
That's up to us. I don't know what the reason was for keeping the 3.2.0 database around -- does anyone here recall ever using it? For what? It's needed for IDNA. The IDNA RFC requires that Unicode 3.2 is used for performing IDNA (in particular, for determining what a valid domain name is).
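As a rough illustration of the point about IDNA, the sketch below uses present-day Python 3, where the `unicodedata` module exposes the current database at module level and the frozen 3.2.0 database as `unicodedata.ucd_3_2_0`. The function names `database_versions`, `mirrored_in_both` and `idna_encode` are made up for this example, and the current database version depends on the Python release you run it on.

```python
import unicodedata


def database_versions():
    """Return (current, legacy) versions of the bundled Unicode databases."""
    return unicodedata.unidata_version, unicodedata.ucd_3_2_0.unidata_version


def mirrored_in_both(char):
    """Return the 'mirrored' property of char in the current and 3.2.0 data."""
    return unicodedata.mirrored(char), unicodedata.ucd_3_2_0.mirrored(char)


def idna_encode(name):
    """Encode a domain name with the built-in 'idna' codec (IDNA 2003)."""
    return name.encode("idna")


if __name__ == "__main__":
    print(database_versions())
    # A code point whose properties may differ between the two databases.
    print(mirrored_in_both("\u0f3a"))
    print(idna_encode("b\u00fccher.example"))
```

Keeping the 3.2.0 data as a separate object lets the IDNA code keep following the RFC while the main database moves to newer Unicode versions.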

Re: [Python-Dev] Unicode 5.1.0

2008-08-24 Thread Martin v. Löwis
I can tinker a little with this over the weekend, unless Martin tells me not to ;-) Go ahead; I can't work on this at the moment, anyway. I would also be confident that a mere replacement of 4.1 with 5.1 should be easy, and I see no reason to keep the 4.1 version. Perhaps makeunicodedata

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Fredrik Lundh
On Fri, Aug 22, 2008 at 3:25 AM, Guido van Rossum wrote: So while we could say: we provide access to the Unicode 5.1.0 database, we cannot say: we support Unicode 5.1.0, simply because we have not reviewed all the necessary changes and implications. Mark's response to

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Facundo Batista
2008/8/21 Guido van Rossum: The two, quite separate, questions, then, are (a) how much work would it be to upgrade to version 5.1.0 of the database; and (b) would it be acceptable to do this post-beta3 (but before rc1). If the answer to (b) is positive, Google can help with

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Antoine Pitrou
Facundo Batista facundobatista at gmail.com writes: Two thoughts: - In view of jumping to a new standard at *this* point, what I'd like to have is a comprehensive test suite for unicodedata in a similar sense to what happens with Decimal... It would be great to have from the Unicode
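One way to picture the kind of data-driven check being asked for is a test that verifies invariants the database must satisfy regardless of version. The sketch below is only an assumption about what such a suite could look like, not the test suite discussed in the thread: it checks a few normalization invariants on a handful of hand-picked samples, and the names `FORMS`, `SAMPLES` and `normalization_failures` are invented for the example.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

# Hand-picked samples (an assumption for this sketch): precomposed e-acute,
# e plus combining acute, ANGSTROM SIGN, the "fi" ligature, a Hangul syllable.
SAMPLES = ["\u00e9", "e\u0301", "\u212b", "\ufb01", "\uac00"]


def normalization_failures(samples):
    """Return (sample, description) pairs for every violated invariant."""
    failures = []
    for s in samples:
        for form in FORMS:
            once = unicodedata.normalize(form, s)
            # Normalizing an already normalized string must change nothing.
            if unicodedata.normalize(form, once) != once:
                failures.append((s, form + " is not idempotent"))
        nfc = unicodedata.normalize("NFC", s)
        nfd = unicodedata.normalize("NFD", s)
        # Composing then decomposing must agree with decomposing directly,
        # and the other way around.
        if unicodedata.normalize("NFD", nfc) != nfd:
            failures.append((s, "NFD(NFC(s)) != NFD(s)"))
        if unicodedata.normalize("NFC", nfd) != nfc:
            failures.append((s, "NFC(NFD(s)) != NFC(s)"))
    return failures


if __name__ == "__main__":
    for sample, problem in normalization_failures(SAMPLES):
        print(repr(sample), problem)
```

A comprehensive suite would read its cases from the data files published with each Unicode version instead of a hard-coded list.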

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Guido van Rossum
On Fri, Aug 22, 2008 at 3:47 AM, Fredrik Lundh wrote: On Fri, Aug 22, 2008 at 3:25 AM, Guido van Rossum wrote: [MAL] So while we could say: we provide access to the Unicode 5.1.0 database, we cannot say: we support Unicode 5.1.0, simply because we have not

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Guido van Rossum
On Fri, Aug 22, 2008 at 6:42 AM, Facundo Batista wrote: - In view of jumping to a new standard at *this* point, what I'd like to have is a comprehensive test suite for unicodedata in a similar sense to what happens with Decimal... It would be great to have from the Unicode

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Fredrik Lundh
On Fri, Aug 22, 2008 at 4:59 PM, Guido van Rossum wrote: (how's the 3.2/4.1 dual support implemented? do we have two distinct datasets, or are the differences encoded in some clever way? would it make sense to split the unicodedata module into three separate modules, one

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Guido van Rossum
2008/8/22 Fredrik Lundh: On Fri, Aug 22, 2008 at 4:59 PM, Guido van Rossum wrote: (how's the 3.2/4.1 dual support implemented? do we have two distinct datasets, or are the differences encoded in some clever way? would it make sense to split the

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Fredrik Lundh
when did Python-Dev turn into a members only list, btw? --- Your mail to 'Python-Dev' with the subject Re: Unicode 5.1.0 Is being held until the list moderator can review it for approval. The reason it is being held: Post by non-member to a members-only list ---

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread Guido van Rossum
I think it's an anti-spam measure. Anybody can be a member though. :-) On Fri, Aug 22, 2008 at 8:15 AM, Fredrik Lundh wrote: when did Python-Dev turn into a members only list, btw? --- Your mail to 'Python-Dev' with the subject Re: Unicode 5.1.0 Is being held until

Re: [Python-Dev] Unicode 5.1.0

2008-08-22 Thread A.M. Kuchling
On Fri, Aug 22, 2008 at 07:59:46AM -0700, Guido van Rossum wrote: That's up to us. I don't know what the reason was for keeping the 3.2.0 database around -- does anyone here recall ever using it? For what? RFC 3491, one of the internationalized domain name RFCs, explicitly requires Unicode

[Python-Dev] Unicode 5.1.0

2008-08-21 Thread Guido van Rossum
I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode Consortium. He would like to see improved Unicode support for Python. (Well duh. :-) On his list of top priorities are: 1. Upgrade the unicodedata module to the Unicode

Re: [Python-Dev] Unicode 5.1.0

2008-08-21 Thread M.-A. Lemburg
On 2008-08-21 22:35, Guido van Rossum wrote: I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode Consortium. He would like to see improved Unicode support for Python. (Well duh. :-) On his list of top priorities are: 1.

Re: [Python-Dev] Unicode 5.1.0

2008-08-21 Thread Terry Reedy
Guido van Rossum wrote: I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode Consortium. He would like to see improved Unicode support for Python. (Well duh. :-) On his list of top priorities are: 1. Upgrade the

Re: [Python-Dev] Unicode 5.1.0

2008-08-21 Thread Guido van Rossum
On Thu, Aug 21, 2008 at 2:26 PM, M.-A. Lemburg wrote: On 2008-08-21 22:35, Guido van Rossum wrote: I was just paid a visit by my Google colleague Mark Davis, co-founder of the Unicode project and the president of the Unicode Consortium. He would like to see improved Unicode