Re: Python Unicode handling wins again -- mostly

2013-12-04 Thread wxjmfauth
On Tuesday, 3 December 2013 15:26:45 UTC+1, Ethan Furman wrote: On 12/02/2013 12:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native

Re: Python Unicode handling wins again -- mostly

2013-12-04 Thread Mark Lawrence
On 04/12/2013 13:52, wxjmfa...@gmail.com wrote: [snip all the double spaced stuff] You intuitively pointed a very important feature of unicode. However, it is not necessary, this is exactly what unicode does (when used properly). jmf Presumably using unicode correctly prevents messages

Re: Python Unicode handling wins again -- mostly

2013-12-04 Thread Neil Cerutti
On 2013-12-04, wxjmfa...@gmail.com wxjmfa...@gmail.com wrote: You intuitively pointed a very important feature of unicode. However, it is not necessary, this is exactly what unicode does (when used properly). Unicode only provides character sets. It's not a natural language parsing facility.

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-03 Thread Mark Lawrence
On 03/12/2013 04:32, Grant Edwards wrote: On 2013-12-03, Roy Smith r...@panix.com wrote: I believe that Pythonistas should commit themselves to achieving the goal, before this decade is out, of making Python 3 the default version and having everybody be cool with unicode. I'm cool with

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-03 Thread Mark Lawrence
On 03/12/2013 01:38, Roy Smith wrote: In article mailman.3485.1386021891.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. I believe that Pythonistas should

Re: Python Unicode handling wins again -- mostly

2013-12-03 Thread Neil Cerutti
On 2013-12-02, Ethan Furman et...@stoneleaf.us wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should operate on code-points, then you'll think that

Re: Python Unicode handling wins again -- mostly

2013-12-03 Thread Ethan Furman
On 12/02/2013 12:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should operate on code-points, then you'll think that Python does the

Re: Python Unicode handling wins again -- mostly

2013-12-03 Thread wxjmfauth
On Tuesday, 3 December 2013 06:06:26 UTC+1, Steven D'Aprano wrote: On Mon, 02 Dec 2013 16:14:13 -0500, Ned Batchelder wrote: On 12/2/13 3:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread wxjmfauth
On Sunday, 1 December 2013 21:54:48 UTC+1, Tim Delaney wrote: On 2 December 2013 07:15, wxjm...@gmail.com wrote: 0.11.13 02:44, Steven D'Aprano wrote: (2) If you reverse that string, does it give lëon? The implication of this question is that strings should operate on

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Mark Lawrence
On 02/12/2013 12:39, wxjmfa...@gmail.com wrote: My English is far too be perfect, I think I understood it correctly. PS I did not even speak about the FSR. 1) Your English is far from perfect as you clearly do not understand the repeated requests *NOT* to send us double spaced crap via

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 9:46 AM, Mark Lawrence wrote: On 02/12/2013 12:39, wxjmfa...@gmail.com wrote: My English is far too be perfect, I think I understood it correctly. PS I did not even speak about the FSR. 1) Your English is far from perfect as you clearly do not understand the repeated requests

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Mark Lawrence
On 02/12/2013 15:22, Ned Batchelder wrote: On 12/2/13 9:46 AM, Mark Lawrence wrote: On 02/12/2013 12:39, wxjmfa...@gmail.com wrote: My English is far too be perfect, I think I understood it correctly. PS I did not even speak about the FSR. 1) Your English is far from perfect as you

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Chris Angelico
On Tue, Dec 3, 2013 at 2:45 AM, Mark Lawrence breamore...@yahoo.co.uk wrote: He's quite deliberately dragged it up by using p.s. Without doubt he's the worst loser in the world and I'm *NOT* stopping getting at him. I find his behaviour, continuously and groundlessly insulting the Python core

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 10:45 AM, Mark Lawrence wrote: On 02/12/2013 15:22, Ned Batchelder wrote: On 12/2/13 9:46 AM, Mark Lawrence wrote: On 02/12/2013 12:39, wxjmfa...@gmail.com wrote: My English is far too be perfect, I think I understood it correctly. PS I did not even speak about the FSR. 1)

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Terry Reedy
On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation of the PSF Code of Conduct, which *does* apply to python-list. Please stop. -- Terry Jan Reedy, one of multiple list

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Mark Lawrence
On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation of the PSF Code of Conduct, which *does* apply to python-list. Please stop. The

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ethan Furman
On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should operate on code-points, then you'll think that Python does the right thing. I think Python is doing it

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 3:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should operate on code-points, then you'll think that Python does the

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Chris Angelico
On Tue, Dec 3, 2013 at 8:14 AM, Ned Batchelder n...@nedbatchelder.com wrote: This is where my knowledge about Unicode gets fuzzy. Isn't it the case that some grapheme clusters (or whatever the right word is) can't be normalized down to a single code point? Characters can accept many accents,

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread MRAB
On 02/12/2013 21:14, Ned Batchelder wrote: On 12/2/13 3:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should operate on code-points,

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 3:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation of the PSF Code of Conduct, which *does* apply

Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Ethan Furman
On 12/02/2013 12:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation of the PSF Code of Conduct, which *does*

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ethan Furman
On 12/02/2013 01:23 PM, Chris Angelico wrote: On Tue, Dec 3, 2013 at 8:14 AM, Ned Batchelder n...@nedbatchelder.com wrote: This is where my knowledge about Unicode gets fuzzy. Isn't it the case that some grapheme clusters (or whatever the right word is) can't be normalized down to a single

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Mark Lawrence
On 02/12/2013 21:25, Ethan Furman wrote: On 12/02/2013 12:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Ned Batchelder
On 12/2/13 4:25 PM, Ethan Furman wrote: On 12/02/2013 12:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 4:44 PM, Ned Batchelder wrote: On 12/2/13 3:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal attacks on other posters to be a violation of

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Mark Lawrence
On 02/12/2013 22:24, Ned Batchelder wrote: On 12/2/13 4:44 PM, Ned Batchelder wrote: On 12/2/13 3:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I consider your continual direct personal

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ned Batchelder
On 12/2/13 5:32 PM, Mark Lawrence wrote: On 02/12/2013 22:24, Ned Batchelder wrote: On 12/2/13 4:44 PM, Ned Batchelder wrote: On 12/2/13 3:45 PM, Mark Lawrence wrote: On 02/12/2013 20:26, Terry Reedy wrote: On 12/2/2013 10:45 AM, Mark Lawrence wrote: the worst loser in the world Mark, I

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ben Finney
Ned Batchelder n...@nedbatchelder.com writes: This is where my knowledge about Unicode gets fuzzy. Isn't it the case that some grapheme clusters (or whatever the right word is) can't be normalized down to a single code point? Characters can accept many accents, for example. That's true,
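For readers following the normalization question, here is a minimal sketch using the standard `unicodedata` module. The sample strings are my own choice, not taken from the thread: `e` plus COMBINING DIAERESIS has a precomposed form, while `q` plus the same mark (as far as I know) does not, so NFC leaves it as two code points.

```python
import unicodedata

# 'e' + COMBINING DIAERESIS has a precomposed equivalent (U+00EB).
composable = "e\u0308"
# 'q' + COMBINING DIAERESIS: assumed to have no precomposed form in Unicode.
not_composable = "q\u0308"

nfc_e = unicodedata.normalize("NFC", composable)
nfc_q = unicodedata.normalize("NFC", not_composable)

# NFC composes where it can and leaves the rest as base + combining mark.
print(len(composable), "->", len(nfc_e))
print(len(not_composable), "->", len(nfc_q))
```

So normalizing first helps with the common accented letters, but it cannot be relied on to turn every user-perceived character into one code point.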

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Ethan Furman
On 12/02/2013 02:32 PM, Mark Lawrence wrote: ... the other being a pot smoking hippy who ... Please trim your posts. You comment a lot on people sending double-spaced google posts -- not trimming is nearly as bad. The above is a good example of unnecessary name calling. I value your good

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Roy Smith
In article mailman.3485.1386021891.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. I believe that Pythonistas should commit themselves to achieving the goal,

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Terry Reedy
On 12/2/2013 4:25 PM, Ethan Furman wrote: jmf is certainly a troll No, he is a person who discovered a minor performance regression in the FSR, which we fixed. Unfortunately, he then continued for a year with a strange troll-like anti-FSR crusade. But his posts in the Unicode handling

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Grant Edwards
On 2013-12-03, Roy Smith r...@panix.com wrote: I believe that Pythonistas should commit themselves to achieving the goal, before this decade is out, of making Python 3 the default version and having everybody be cool with unicode. I'm cool with Unicode as long as it just works without me

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Ethan Furman
On 12/02/2013 07:22 PM, Terry Reedy wrote: On 12/2/2013 4:25 PM, Ethan Furman wrote: jmf is certainly a troll No, he is a person who discovered a minor performance regression in the FSR, which we fixed. Unfortunately, he then continued for a year with a strange troll-like anti-FSR crusade.

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread Steven D'Aprano
On Mon, 02 Dec 2013 16:14:13 -0500, Ned Batchelder wrote: On 12/2/13 3:38 PM, Ethan Furman wrote: On 11/29/2013 04:44 PM, Steven D'Aprano wrote: Out of the nine tests, Python 3.3 passes six, with three tests being failures or dubious. If you believe that the native string type should

Re: Code of Conduct, Trolls, and Thankless Jobs [was Re: Python Unicode handling wins again -- mostly]

2013-12-02 Thread Steven D'Aprano
On Tue, 03 Dec 2013 04:32:13 +, Grant Edwards wrote: On 2013-12-03, Roy Smith r...@panix.com wrote: I believe that Pythonistas should commit themselves to achieving the goal, before this decade is out, of making Python 3 the default version and having everybody be cool with unicode.

Re: Python Unicode handling wins again -- mostly

2013-12-02 Thread joe
How would a grapheme library work? Basic cluster combination, or would implementing other algorithms (line break, normalizing to a canonical form) be necessary? How do people use grapheme clusters in non-rendering situations? Or perhaps here's a better question: does anyone know any

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread wxjmfauth
On Sunday, 1 December 2013 00:07:36 UTC+1, Ned Batchelder wrote: On 11/30/13 5:37 PM, Gregory Ewing wrote: wxjmfa...@gmail.com wrote: And do you know the origin of this typographical feature? Because, mechanically, the dot of the i broke too often. In my opinion, a very

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Serhiy Storchaka
30.11.13 02:44, Steven D'Aprano wrote: (2) If you reverse that string, does it give lëon? The implication of this question is that strings should operate on grapheme clusters rather than code points. Python fails this test: py> print("noe\u0308l"[::-1]) l̈eon
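The reversal test can be reproduced with a rough sketch like the one below. The helper `reverse_clusters` is hypothetical and deliberately simplified: it treats a "cluster" as a base code point plus any following code points with a non-zero canonical combining class, which is not the full grapheme cluster algorithm from the Unicode standard.

```python
import unicodedata

s = "noe\u0308l"  # 'e' followed by COMBINING DIAERESIS

# Slicing reverses code points, so the diaeresis ends up after the 'l'.
naive = s[::-1]

def reverse_clusters(text):
    """Reverse text, keeping combining marks attached to their base.

    Simplified assumption: a cluster is a base code point followed by
    zero or more code points whose combining class is non-zero.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # attach the mark to the previous base
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))

print([hex(ord(c)) for c in naive])
print([hex(ord(c)) for c in reverse_clusters(s)])
```

This is only meant to show where the diaeresis goes in each case; anything involving emoji sequences, Hangul jamo, or other cluster rules would need a proper segmentation library.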

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread wxjmfauth
0.11.13 02:44, Steven D'Aprano wrote: (2) If you reverse that string, does it give lëon? The implication of this question is that strings should operate on grapheme clusters rather than code points. ... BTW, a grapheme cluster *is* a code points cluster. jmf --

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Tim Delaney
On 2 December 2013 07:15, wxjmfa...@gmail.com wrote: 0.11.13 02:44, Steven D'Aprano wrote: (2) If you reverse that string, does it give lëon? The implication of this question is that strings should operate on grapheme clusters rather than code points. ... BTW, a grapheme cluster

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Mark Lawrence
On 01/12/2013 20:54, Tim Delaney wrote: On 2 December 2013 07:15, wxjmfa...@gmail.com wrote: 0.11.13 02:44, Steven D'Aprano wrote: (2) If you reverse that string, does it give lëon? The implication of this question is that strings should

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Tim Delaney
On 2 December 2013 09:06, Mark Lawrence breamore...@yahoo.co.uk wrote: I don't remember him ever having a valid point, so FTR can we have a reference please. I do remember Steven D'Aprano showing that there was a regression which I flagged up here http://bugs.python.org/issue16061. It was

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Mark Lawrence
On 01/12/2013 22:29, Tim Delaney wrote: On 2 December 2013 09:06, Mark Lawrence breamore...@yahoo.co.uk wrote: I don't remember him ever having a valid point, so FTR can we have a reference please. I do remember Steven D'Aprano showing that there was

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Ethan Furman
On 12/01/2013 02:06 PM, Mark Lawrence wrote: I don't remember him [jmf] ever having a valid point, so FTR can we have a reference please. I do remember Steven D'Aprano showing that there was a regression which I flagged up here http://bugs.python.org/issue16061. It was fixed by Serhiy

Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Mark Lawrence
On 01/12/2013 22:50, Ethan Furman wrote: On 12/01/2013 02:06 PM, Mark Lawrence wrote: I don't remember him [jmf] ever having a valid point, so FTR can we have a reference please. I do remember Steven D'Aprano showing that there was a regression which I flagged up here

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Mark Lawrence
On 30/11/2013 02:08, Roy Smith wrote: In article 529934dc$0$29993$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: (8) What's the uppercase of baffle spelled with an ffl ligature? Like most other languages, Python 3.2 fails: py

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread wxjmfauth
On Saturday, 30 November 2013 03:08:49 UTC+1, Roy Smith wrote: The whole idea of ligatures like fi is purely typographic. The crossbar on the f (at least in some fonts) runs into the dot on the i. Likewise, the top curl on an f runs into the serif on top of the l (and similarly

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Gregory Ewing
wxjmfa...@gmail.com wrote: And do you know the origin of this typographical feature? Because, mechanically, the dot of the i broke too often. In my opinion, a very plausible explanation. It doesn't sound very plausible to me, because there are a lot more stand-alone 'i's in English text than

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Gregory Ewing
Steven D'Aprano wrote: On Sat, 30 Nov 2013 00:37:17 -0500, Roy Smith wrote: So, who am I to argue with the people who decided that I needed to be able to type a PILE OF POO character. Blame the Japanese for that. Apparently some of the biggest users of Unicode are the various Japanese

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Ned Batchelder
On 11/30/13 5:37 PM, Gregory Ewing wrote: wxjmfa...@gmail.com wrote: And do you know the origin of this typographical feature? Because, mechanically, the dot of the i broke too often. In my opinion, a very plausible explanation. It doesn't sound very plausible to me, because there are a lot

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Steven D'Aprano
On Sun, 01 Dec 2013 11:37:30 +1300, Gregory Ewing wrote: Which makes it even sillier to have an 'ffi' character in this day and age, when you can simply space the characters so that they overlap. It's in Unicode to support legacy character sets that included it[1]. There are a bunch of
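A small sketch of how such legacy compatibility characters behave under normalization, using the standard `unicodedata` module. The two sample characters (the ffi ligature U+FB03 and KELVIN SIGN U+212A) are picked for illustration: the ligature only folds to plain letters under the compatibility forms (NFKC/NFKD), while KELVIN SIGN is canonically equivalent to the ordinary capital K.

```python
import unicodedata

ffi = "\ufb03"     # LATIN SMALL LIGATURE FFI
kelvin = "\u212a"  # KELVIN SIGN

# Canonical normalization (NFC) leaves the ligature alone...
ffi_nfc = unicodedata.normalize("NFC", ffi)
# ...while compatibility normalization (NFKC) expands it to three letters.
ffi_nfkc = unicodedata.normalize("NFKC", ffi)

# KELVIN SIGN has a canonical mapping, so even NFC replaces it with 'K'.
kelvin_nfc = unicodedata.normalize("NFC", kelvin)

print(unicodedata.name(ffi), repr(ffi_nfc), repr(ffi_nfkc))
print(unicodedata.name(kelvin), repr(kelvin_nfc))
```

In other words, the legacy characters round-trip for old encodings, and code that wants to treat them as ordinary letters can normalize explicitly.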

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Tim Chase
On 2013-12-01 00:22, Steven D'Aprano wrote: * KELVIN SIGN versus LATIN CAPITAL LETTER A I should hope so ;-) -tkc -- https://mail.python.org/mailman/listinfo/python-list

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Steven D'Aprano
On Sat, 30 Nov 2013 18:52:48 -0600, Tim Chase wrote: On 2013-12-01 00:22, Steven D'Aprano wrote: * KELVIN SIGN versus LATIN CAPITAL LETTER A I should hope so ;-) I blame my keyboard, where letters A and K are practically right next to each other, only seven letters apart. An easy typo to

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Tim Chase
On 2013-12-01 00:54, Steven D'Aprano wrote: On Sat, 30 Nov 2013 18:52:48 -0600, Tim Chase wrote: On 2013-12-01 00:22, Steven D'Aprano wrote: * KELVIN SIGN versus LATIN CAPITAL LETTER A I should hope so ;-) I blame my keyboard, where letters A and K are practically right

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Chris Angelico
On Sun, Dec 1, 2013 at 11:54 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Sat, 30 Nov 2013 18:52:48 -0600, Tim Chase wrote: On 2013-12-01 00:22, Steven D'Aprano wrote: * KELVIN SIGN versus LATIN CAPITAL LETTER A I should hope so ;-) I blame my keyboard, where letters

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Roy Smith
In article mailman.3431.1385860444.18130.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sun, Dec 1, 2013 at 11:54 AM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: On Sat, 30 Nov 2013 18:52:48 -0600, Tim Chase wrote: On 2013-12-01 00:22, Steven D'Aprano

Re: Python Unicode handling wins again -- mostly

2013-11-30 Thread Chris Angelico
On Sun, Dec 1, 2013 at 12:27 PM, Roy Smith r...@panix.com wrote: http://www.theregister.co.uk/2010/11/26/bofh_2010_episode_18/ ChrisA What means PFY? The only thing I can think of is Poor F---ing Yankee :-) In the context of the BOFH, it stands for Pimply-Faced Youth and means BOFH's

Python Unicode handling wins again -- mostly

2013-11-29 Thread Steven D'Aprano
There's a recent blog post complaining about the lousy support for Unicode text in most programming languages: http://mortoray.com/2013/11/27/the-string-type-is-broken/ The author, Mortoray, gives nine basic tests to understand how well the string type in a language works. The first four

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Mark Lawrence
On 30/11/2013 00:44, Steven D'Aprano wrote: (5) What is the length of "😸😾"? Both characters U+1F638 (GRINNING CAT FACE WITH SMILING EYES) and U+1F63E (POUTING CAT FACE) are outside the Basic Multilingual Plane, which means they require more than two bytes each. Most programming languages using
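To make the length test concrete, here is a short sketch assuming Python 3.3 or later (where `str` always counts code points). It compares the code-point count with the number of UTF-16 code units and UTF-8 bytes for the same two characters; the variable names are mine.

```python
s = "\U0001F638\U0001F63E"  # two characters outside the BMP

code_points = len(s)                            # what Python 3.3+ reports
utf16_units = len(s.encode("utf-16-le")) // 2   # each needs a surrogate pair
utf8_bytes = len(s.encode("utf-8"))             # four bytes per character

print(code_points, utf16_units, utf8_bytes)
```

A language whose string length counts UTF-16 code units would report the middle number rather than the first.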

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Roy Smith
In article 529934dc$0$29993$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: (8) What's the uppercase of baffle spelled with an ffl ligature? Like most other languages, Python 3.2 fails: py 'baffle'.upper() 'BAfflE' but Python 3.3 passes:

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Chris Angelico
On Sat, Nov 30, 2013 at 1:08 PM, Roy Smith r...@panix.com wrote: I would certainly expect, x.lower() == x.upper().lower(), to be True for all values of x over the set of valid unicode codepoints. Having u\uFB04.upper() == FFL breaks that. I would also expect len(x) == len(x.upper()) to be
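The two expectations discussed here can be checked directly. This is a sketch assuming Python 3.3 or later, where `str.upper()` applies the full Unicode case mappings and `str.casefold()` is available; the variable names are mine.

```python
lig = "\ufb04"  # LATIN SMALL LIGATURE FFL

upper = lig.upper()  # full case mapping expands to three letters

# Expectation 1: x.lower() == x.upper().lower()
round_trip_holds = lig.lower() == upper.lower()

# Expectation 2: len(x) == len(x.upper())
length_preserved = len(lig) == len(upper)

# casefold() is intended for caseless comparison and does line up.
casefold_matches = lig.casefold() == upper.casefold()

print(repr(upper), round_trip_holds, length_preserved, casefold_matches)
```

So neither invariant holds in general; for case-insensitive matching, comparing `casefold()` results is the safer route.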

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Roy Smith
In article mailman.3417.1385777557.18130.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sat, Nov 30, 2013 at 1:08 PM, Roy Smith r...@panix.com wrote: I would certainly expect, x.lower() == x.upper().lower(), to be True for all values of x over the set of valid unicode

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Dave Angel
On Fri, 29 Nov 2013 21:28:47 -0500, Roy Smith r...@panix.com wrote: In article mailman.3417.1385777557.18130.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sat, Nov 30, 2013 at 1:08 PM, Roy Smith r...@panix.com wrote: I would certainly expect, x.lower() ==

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Steven D'Aprano
On Fri, 29 Nov 2013 21:08:49 -0500, Roy Smith wrote: In article 529934dc$0$29993$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: (8) What's the uppercase of baffle spelled with an ffl ligature? Like most other languages, Python 3.2 fails:

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Roy Smith
In article 529967dc$0$29993$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: You edited my text to remove the ligature? That's... unfortunate. It was un-ligated by the time it reached me.

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Zero Piraeus
: On Sat, Nov 30, 2013 at 04:21:49AM +, Steven D'Aprano wrote: On Fri, 29 Nov 2013 21:08:49 -0500, Roy Smith wrote: The whole idea of ligatures like fi is purely typographic. In English, that's correct. I'm not sure if we can generalise that to all languages that have ligatures. It

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Gene Heskett
On Saturday 30 November 2013 00:23:22 Zero Piraeus did opine: On Sat, Nov 30, 2013 at 04:21:49AM +, Steven D'Aprano wrote: On Fri, 29 Nov 2013 21:08:49 -0500, Roy Smith wrote: The whole idea of ligatures like fi is purely typographic. In English, that's correct. I'm not sure if we

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Roy Smith
In article 529967dc$0$29993$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: The whole idea of ligatures like fi is purely typographic. In English, that's correct. I'm not sure if we can generalise that to all languages that have ligatures.

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Ian Kelly
On Fri, Nov 29, 2013 at 10:37 PM, Roy Smith r...@panix.com wrote: I was speaking specifically of ligatures like fi (or, if you prefer, ligatures like ﬁ). By which I mean those things printers invented because some letter combinations look funny when typeset as two distinct letters. I think

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Steven D'Aprano
On Sat, 30 Nov 2013 02:05:59 -0300, Zero Piraeus wrote: (I happen to think the presence of ligatures in Unicode is insane, but my dictator-of-the-world certificate appears to have gotten lost in the post, so fixing that will have to wait). You're probably right, but we live in an insane world

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Steven D'Aprano
On Fri, 29 Nov 2013 23:00:27 -0700, Ian Kelly wrote: On Fri, Nov 29, 2013 at 10:37 PM, Roy Smith r...@panix.com wrote: I was speaking specifically of ligatures like fi (or, if you prefer, ligatures like ﬁ). By which I mean those things printers invented because some letter combinations look

Re: Python Unicode handling wins again -- mostly

2013-11-29 Thread Steven D'Aprano
On Sat, 30 Nov 2013 00:37:17 -0500, Roy Smith wrote: So, who am I to argue with the people who decided that I needed to be able to type a PILE OF POO character. Blame the Japanese for that. Apparently some of the biggest users of Unicode are the various Japanese mobile phone manufacturers, TV