Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Steven D'Aprano
On Thu, 19 Jul 2018 20:34:26 +0200, Christian Gollwitzer wrote: > On 19.07.2018 at 14:50, Gregory Ewing wrote: >> Chris Angelico wrote: >>> On Thu, Jul 19, 2018 at 4:41 PM, Gregory Ewing >>> wrote: >>> (Google doesn't seem to think so -- it asks me whether I meant "assist shop".

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Christian Gollwitzer
Am 19.07.2018 um 14:50 schrieb Gregory Ewing: Chris Angelico wrote: On Thu, Jul 19, 2018 at 4:41 PM, Gregory Ewing wrote: (Google doesn't seem to think so -- it asks me whether I meant "assist shop". Although it does offer to translateč it into Czech...) Into or from?? I'm thoroughly

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Gregory Ewing
Chris Angelico wrote: On Thu, Jul 19, 2018 at 4:41 PM, Gregory Ewing wrote: (Google doesn't seem to think so -- it asks me whether I meant "assist shop". Although it does offer to translate it into Czech...) Into or from?? I'm thoroughly confused now! Hard to tell. This is what the link

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Abdur-Rahmaan Janhangeer
It's also thoroughly time to give this thread a well-deserved rest. RIP. Abdur-Rahmaan Janhangeer https://github.com/Abdur-rahmaanJ > Into or from?? I'm thoroughly confused now! > > ChrisA

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Chris Angelico
On Thu, Jul 19, 2018 at 4:41 PM, Gregory Ewing wrote: > Stefan Ram wrote: >> >> »assistshop«, > > > Is that a word? > > (Google doesn't seem to think so -- it asks me whether > I meant "assist shop". Although it does offer to translate > it into Czech...) > Into or from?? I'm thoroughly

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Gregory Ewing
Stefan Ram wrote: »assistshop«, Is that a word? (Google doesn't seem to think so -- it asks me whether I meant "assist shop". Although it does offer to translate it into Czech...) -- Greg

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-19 Thread Gregory Ewing
Stefan Ram wrote: Gregory Ewing writes: That's debatable. I've never thought of it that way and I'm fairly certain I don't pronounce it that way. My tongue does not do the same thing when I say "ch" as it does when I say "tsh". archives ˈɑɚ kɑɪvz (n) bachelor ˈbæʧ lɚ (n) machine

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-18 Thread Gregory Ewing
MRAB wrote: "ch" usually represents 2 phonemes, basically the sounds of "t" followed by "sh"; That's debatable. I've never thought of it that way and I'm fairly certain I don't pronounce it that way. My tongue does not do the same thing when I say "ch" as it does when I say "tsh". -- Greg

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-18 Thread Antoon Pardon
On 18-07-18 10:07, Marko Rauhamaa wrote: >> Sure there were some surprises or gotcha's, but the result was still >> better than doing it in python2 and they were easier to deal with than >> in python2. > BTW, in those needs, even Python2 has Unicode strings and unicodedata at > your disposal.

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-18 Thread Marko Rauhamaa
Antoon Pardon : > On 17-07-18 14:22, Marko Rauhamaa wrote: >> If you assume that NFC normalizes every letter to a single codepoint >> (and carefully use NFC everywhere), you are right. But equally likely >> you may inadvertently be setting yourself up for a surprise. > > You are moving the goal
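The NFC point quoted above can be checked directly. The following is only a small sketch using the standard `unicodedata` module; the helper name `nfc_length` and the sample strings are illustrative choices, not taken from the thread. It shows one letter-plus-mark pair that NFC composes into a single code point and one that it cannot, because no precomposed form exists.

```python
import unicodedata

def nfc_length(text):
    """Return the number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))

# "a" + COMBINING MACRON composes to the single code point U+0101.
composed = unicodedata.normalize("NFC", "a\u0304")

# "q" + COMBINING MACRON has no precomposed form, so NFC keeps both code points.
uncomposed = unicodedata.normalize("NFC", "q\u0304")

print(len(composed), len(uncomposed))
```

So normalizing to NFC reduces the number of multi-code-point letters but does not eliminate them.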

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-18 Thread Antoon Pardon
On 17-07-18 14:22, Marko Rauhamaa wrote: > Antoon Pardon : > >> On 17-07-18 10:27, Marko Rauhamaa wrote: >>> Also, Python2's strings do as good a job at delivering codepoints as >>> Python3. >> No they don't. The programs that I work on, need to be able to treat >> at least german, french, dutch

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Mark Lawrence
On 17/07/18 19:16, Marko Rauhamaa wrote: MRAB : "ch" usually represents 2 phonemes, basically the sounds of "t" followed by "sh"; Traditionally, that sound is considered a single phoneme: <https://en.wikipedia.org/wiki/Affricate_consonant> Can you hear the difference in these

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Rhodri James
On 17/07/18 19:16, Marko Rauhamaa wrote: MRAB : "ch" usually represents 2 phonemes, basically the sounds of "t" followed by "sh"; Traditionally, that sound is considered a single phoneme: <https://en.wikipedia.org/wiki/Affricate_consonant> To quote the introduction of that article, "It

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
MRAB : > "ch" usually represents 2 phonemes, basically the sounds of "t" > followed by "sh"; Traditionally, that sound is considered a single phoneme: <https://en.wikipedia.org/wiki/Affricate_consonant> Can you hear the difference in these expressions: high chairs height shares

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread MRAB
On 2018-07-17 03:25, Tim Chase wrote: On 2018-07-17 01:08, Steven D'Aprano wrote: In English, I think most people would prefer to use a different term for whatever "sh" and "ch" represent than "character". The term you may be reaching for is "consonant cluster"?

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Antoon Pardon : > On 17-07-18 10:27, Marko Rauhamaa wrote: >> Also, Python2's strings do as good a job at delivering codepoints as >> Python3. > > No they don't. The programs that I work on, need to be able to treat > at least german, french, dutch and english text. My experience is that > in

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Antoon Pardon
On 17-07-18 10:27, Marko Rauhamaa wrote: > Steven D'Aprano : >> On Mon, 16 Jul 2018 21:48:42 -0400, Richard Damon wrote: >>> Who says there needs to be one. A good engineer will use the >>> definition that is most appropriate to the task at hand. Some things >>> need very solid definitions, and

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Richard Damon
> On Jul 17, 2018, at 3:44 AM, Steven D'Aprano > wrote: > > On Mon, 16 Jul 2018 21:48:42 -0400, Richard Damon wrote: > >>> On Jul 16, 2018, at 9:21 PM, Steven D'Aprano >>> wrote: >>> On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote: You are defining a variable/fixed

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico : > On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa wrote: >> Of course, UTF-8 doesn't relieve you from Unicode problems. But it has >> one big advantage: it can usually deal with non-Unicode data without any >> extra considerations while Python3's strings make you have to take >>
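For readers wondering what the "extra considerations" for non-Unicode data look like in Python 3, one option is the `surrogateescape` error handler. This is a minimal sketch; the byte string is an invented example and not something from the thread. It shows arbitrary bytes surviving a round trip through `str`.

```python
# Latin-1 encoded bytes: 0xE9 is not valid UTF-8 in this position.
raw = b"caf\xe9.txt"

# A strict decode refuses the data outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate (here U+DCE9)...
text = raw.decode("utf-8", errors="surrogateescape")

# ...and encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")

print(strict_ok, ascii(text), restored == raw)
```

The resulting `str` contains a lone surrogate, so it cannot be encoded strictly or printed everywhere; the handler has to be used consistently on the way out as well.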

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico : > On Tue, Jul 17, 2018 at 7:03 PM, Marko Rauhamaa wrote: >> What I'd need is for the tty to tell me what column the cursor is >> visually. Or better yet, the tty would have to tell me where the column >> would be *after* I emit the next grapheme cluster. > > Are you prepared for

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Chris Angelico
On Tue, Jul 17, 2018 at 7:03 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa wrote: >>> For me, the issue is where do I produce a line break in my text output? >>> Currently, I'm just counting codepoints to estimate the width of the >>> output.

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico : > On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa wrote: >> For me, the issue is where do I produce a line break in my text output? >> Currently, I'm just counting codepoints to estimate the width of the >> output. > > Well, that's just flat out wrong, then. Counting graphemes
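As a rough illustration of why counting code points misestimates output width, here is a sketch that skips combining marks and counts East Asian wide characters as two columns. The function name `display_width` is made up for this example, and the heuristic is only an approximation: it ignores control characters, zero-width joiners, emoji sequences and differences between terminals.

```python
import unicodedata

def display_width(text):
    """Estimate terminal columns: 0 for combining marks, 2 for wide/fullwidth, else 1."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        # "W" (wide) and "F" (fullwidth) characters usually occupy two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

print(len("a\u0304e"), display_width("a\u0304e"))
print(len("\u65e5\u672c\u8a9e"), display_width("\u65e5\u672c\u8a9e"))
```

In the first string the code point count overestimates the width; in the second it underestimates it.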

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa wrote: > It is essential for people to understand that the very same issues that > plague UTF-8 plague UTF-32 as well. Using UTF in both highlights that > fact. What a wonderful nonsense. I suppose that the same issues plague Elon Musk as plague

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa wrote: >> But of course other people's experience may vary. I'm interested in >> learning about the library you use to process graphemes in your software. > > For me, the issue is where do I produce a line break in my text output? > Currently, I'm

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Steven D'Aprano : > On Tue, 17 Jul 2018 09:52:13 +0300, Marko Rauhamaa wrote: > >> Both Python2 and Python3 provide two forms of string, one containing >> 8-bit integers and another one containing 21-bit integers. > > Why do you insist on making counter-factual statements as facts? Don't > you

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Steven D'Aprano : > On Mon, 16 Jul 2018 21:48:42 -0400, Richard Damon wrote: >> Who says there needs to be one. A good engineer will use the >> definition that is most appropriate to the task at hand. Some things >> need very solid definitions, and some things don’t. > > Then the problem is solved:

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Tue, 17 Jul 2018 10:51:38 +0300, Marko Rauhamaa wrote: > in which Python3's honor is defended in a good many of the discussions > in this newsgroup: anger, condescension, ridicule, name-calling. You call it defending Python 3's honour. I call it responding to people who insist on spreading

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Tue, 17 Jul 2018 15:20:16 +0900, INADA Naoki wrote (replying to Marko): > I still don't understand what's your original point. I think UTF-8 vs > UTF-32 is totally different from Python 2 vs 3. > > For example, string in Rust and Swift (2010s languages!) are *valid* > UTF-8. There are strong

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Tue, 17 Jul 2018 09:52:13 +0300, Marko Rauhamaa wrote: > Both Python2 and Python3 provide two forms of string, one containing > 8-bit integers and another one containing 21-bit integers. Why do you insist on making counter-factual statements as facts? Don't you have a Python REPL you can try

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Tue, 17 Jul 2018 08:26:45 +0300, Marko Rauhamaa wrote: > Steven D'Aprano : >> On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote: >>> UTF-8 bytes can only represent the first 128 code points of Unicode. >> >> This is DailyWTF material. Perhaps you want to rethink your wording and >>
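The disputed claim is easy to test in a REPL. A sketch, with an arbitrary set of sample characters chosen for this example: a single UTF-8 byte on its own covers only the 128 ASCII code points, while multi-byte sequences cover everything above that.

```python
def utf8_byte_counts(text):
    """Map each code point in text to the number of bytes UTF-8 uses for it."""
    return {f"U+{ord(ch):04X}": len(ch.encode("utf-8")) for ch in text}

# LATIN CAPITAL A, e with acute, EURO SIGN, GRINNING FACE
sample = "A\u00e9\u20ac\U0001F600"
counts = utf8_byte_counts(sample)
print(counts)

# Decoding the bytes gives back the original code points, including those above 127.
roundtrip = sample.encode("utf-8").decode("utf-8")
print(roundtrip == sample)
```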

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
INADA Naoki : >> I won't comment on Rust and Swift because I don't know them. > ... >> I won't comment on Go, either. > > Hmm, do you say Python 3 is "cult-like" without surveying other popular > programming languages? You can talk about Python3 independently of other programming languages.

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Mon, 16 Jul 2018 21:25:20 -0500, Tim Chase wrote: > On 2018-07-17 01:08, Steven D'Aprano wrote: >> In English, I think most people would prefer to use a different term >> for whatever "sh" and "ch" represent than "character". > > The term you may be reaching for is "consonant cluster"? > >

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Steven D'Aprano
On Mon, 16 Jul 2018 21:48:42 -0400, Richard Damon wrote: >> On Jul 16, 2018, at 9:21 PM, Steven D'Aprano >> wrote: >> >>> On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote: >>> >>> You are defining a variable/fixed width codepoint set. Many others >>> want to deal with CHARACTER sets. >>

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread INADA Naoki
> I won't comment on Rust and Swift because I don't know them. ... > I won't comment on Go, either. Hmm, do you say Python 3 is "cult-like" without surveying other popular programming languages? There are many popular languages which separate bytes and unicode string explicitly and string is not

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Terry Reedy
On 7/16/2018 10:25 PM, Tim Chase wrote: On 2018-07-17 01:08, Steven D'Aprano wrote: In English, I think most people would prefer to use a different term for whatever "sh" and "ch" represent than "character". The term you may be reaching for is "consonant cluster"?

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
INADA Naoki : > On Tue, Jul 17, 2018 at 2:31 PM Marko Rauhamaa wrote: >> So I hope that by now you have understood my point and been able to >> decide if you agree with it or not. > > I still don't understand what's your original point. > I think UTF-8 vs UTF-32 is totally different from Python

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Terry Reedy
On 7/16/2018 7:02 PM, Richard Damon wrote: On Jul 16, 2018, at 3:28 PM, Terry Reedy wrote: If one is using a broader definition than usual, it is clearer to say so. This is the core of what I wrote. Do you disagree? You are defining a variable/fixed width codepoint set. No, I did

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread INADA Naoki
On Tue, Jul 17, 2018 at 2:31 PM Marko Rauhamaa wrote: > > Steven D'Aprano : > > On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote: > >> UTF-8 bytes can only represent the first 128 code points of Unicode. > > > > This is DailyWTF material. Perhaps you want to rethink your wording > > and

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Steven D'Aprano : > On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote: >> UTF-8 bytes can only represent the first 128 code points of Unicode. > > This is DailyWTF material. Perhaps you want to rethink your wording > and maybe even learn a bit more about Unicode and the UTF encodings >

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:21, Steven D'Aprano wrote: > > This doesn’t mean that UTF-32 is an awful system, just that it > > isn’t the magical cure that some were hoping for. > > Nobody ever claimed it was, except for the people railing that > since it isn't a magical system we ought to go back to the

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:08, Steven D'Aprano wrote: > In English, I think most people would prefer to use a different > term for whatever "sh" and "ch" represent than "character". The term you may be reaching for is "consonant cluster"? https://en.wikipedia.org/wiki/Consonant_cluster -tkc

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon
> On Jul 16, 2018, at 9:21 PM, Steven D'Aprano > wrote: > >> On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote: >> >> You are defining a variable/fixed width codepoint set. Many others want >> to deal with CHARACTER sets. > > Good luck coming up with a universal, objective,

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote: > All UTF-8. No unicode strings. That just means you are re-implementing the bits of Unicode you care about (which may be "nothing at all") as UTF-8. If your application is nothing but middleware squirting bytes from one layer to

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 15:28:51 -0400, Terry Reedy wrote: > On 7/16/2018 1:11 PM, Richard Damon wrote: > >> Many consider that UTF-32 is a variable-width encoding because of the >> combining characters. It can take multiple ‘codepoints’ to define what >> should be a single ‘character’ for display.

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote: > You are defining a variable/fixed width codepoint set. Many others want > to deal with CHARACTER sets. Good luck coming up with a universal, objective, language-neutral, consistent definition for a character. > This doesn’t mean that

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Tue, 17 Jul 2018 06:15:25 +1000, Chris Angelico wrote: > On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano > wrote: >> There is nothing special about diacritics such that we ought to treat >> some combinations like "Ch" (two code points = one character) as "fixed >> width" while others like

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon
> On Jul 16, 2018, at 3:28 PM, Terry Reedy wrote: > >> On 7/16/2018 1:11 PM, Richard Damon wrote: >> >> Many consider that UTF-32 is a variable-width encoding because of the >> combining characters. It can take multiple ‘codepoints’ to define what >> should be a single ‘character’ for

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 7:02 AM, Ethan Furman wrote: > On 07/16/2018 01:15 PM, Chris Angelico wrote: >> >> On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano wrote: > > >>> There is nothing special about diacritics such that we ought to treat >>> some combinations like "Ch" (two code points = one

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Ethan Furman : > Depends on the language: in Spanish, "ch" is its own letter (at least > it was when I grew up), so any word containing it should still contain > it when reversed: "chica" would be "acich". The Royal Academy broke "ch" and "ll" up into separate letters a decade or so back. It had

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:54 AM, Marko Rauhamaa wrote: > Chris Angelico : >> Challenge: Reverse a string in UTF-8. > > Counter-challenge: Reverse a Unicode string: > >>>> s = "a\u0304e" >>>> s >'āe' >>>> L = list(s) >>>> L.reverse() >>>> "".join(L) >'ēa' > >>

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Ethan Furman
On 07/16/2018 01:15 PM, Chris Angelico wrote: On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano wrote: There is nothing special about diacritics such that we ought to treat some combinations like "Ch" (two code points = one character) as "fixed width" while others like "â" (two code points =

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Chris Angelico :
> Challenge: Reverse a string in UTF-8.

Counter-challenge: Reverse a Unicode string:

    >>> s = "a\u0304e"
    >>> s
    'āe'
    >>> L = list(s)
    >>> L.reverse()
    >>> "".join(L)
    'ēa'

> Challenge: Center text in UTF-8.

Counter-challenge: Center a Unicode string: >>>
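One way to get the reversal in the counter-challenge right, at least for simple base-plus-combining-mark sequences, is to group each combining mark with the character before it and then reverse the groups. The helper names below are hypothetical, and this is deliberately not full grapheme cluster segmentation as defined in UAX #29 (the standard library does not provide that); it is a sketch of the idea only.

```python
import unicodedata

def split_clusters(text):
    """Group each combining mark with the preceding character (approximation)."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            # Attach the mark to the cluster it modifies.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def reverse_clusters(text):
    """Reverse text cluster by cluster, so marks stay on their base characters."""
    return "".join(reversed(split_clusters(text)))

s = "a\u0304e"
naive = "".join(reversed(s))   # the macron ends up on the "e"
careful = reverse_clusters(s)  # the macron stays on the "a"
print(ascii(naive), ascii(careful))
```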

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano wrote: > There is nothing special about diacritics such that we ought to treat > some combinations like "Ch" (two code points = one character) as "fixed > width" while others like "â" (two code points = one character) as > "variable width". When

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 5:51 AM, Marko Rauhamaa wrote: > Steven D'Aprano : >> Under that standard definition, UTF-8 and UTF-16 are variable-width, >> and UTF-32 is fixed-width. >> >> But I'll accept that UTF-32 is variable-width if Marko accepts that >> ASCII is too. > > If that makes you happy,

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James
On 16/07/18 20:51, Marko Rauhamaa wrote: I use UTF-8 in my C programs and sense no disadvantage. I have never felt a need for wchar_t. That's not a good comparison, though, because wchar_t in C really doesn't give you much (if any) advantage over rolling your own UTF-8 support, even when

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Steven D'Aprano : > Under that standard definition, UTF-8 and UTF-16 are variable-width, > and UTF-32 is fixed-width. > > But I'll accept that UTF-32 is variable-width if Marko accepts that > ASCII is too. If that makes you happy, fine. The point is, UTF-32 has no advantages over UTF-8. And I'm
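The "standard definition" of fixed versus variable width refers to bytes (code units) per code point, which a short sketch can make concrete. The function name `encoded_sizes` and the sample characters are illustrative assumptions; the `-le` codec variants are used so that no byte order mark is included in the counts.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of text in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

# One code point each: ASCII letter, Latin-1 letter, BMP symbol, astral emoji.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

UTF-32 uses four bytes for every code point, while UTF-8 and UTF-16 vary. None of the three says anything about how many code points make up what a reader perceives as one character.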

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Terry Reedy
On 7/16/2018 1:11 PM, Richard Damon wrote: Many consider that UTF-32 is a variable-width encoding because of the combining characters. It can take multiple ‘codepoints’ to define what should be a single ‘character’ for display. I hope you realize that this is not the standard meaning of

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 14:22:27 -0400, Richard Damon wrote: [...] > But I am not talking about those sort of characters or ligatures, So what? I am. You don't get to say "only non-standard definitions I approve of count". There is the industry standard definition of what it means to be a fixed-

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 4:22 AM, Richard Damon wrote: > > But I am not talking about those sort of characters or ligatures, but > ‘characters’ that are built up of a combining diacritical marks (like > accents) and a base character. Unicode define many code points for the more > common of

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon
> On Jul 16, 2018, at 1:36 PM, Steven D'Aprano > wrote: > > On Mon, 16 Jul 2018 13:11:23 -0400, Richard Damon wrote: > >>> On Jul 16, 2018, at 12:51 PM, Steven D'Aprano >>> wrote: >>> On Mon, 16 Jul 2018 00:28:39 +0300, Marko Rauhamaa wrote: if your new system used Python3's

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 13:11:23 -0400, Richard Damon wrote: >> On Jul 16, 2018, at 12:51 PM, Steven D'Aprano >> wrote: >> >>> On Mon, 16 Jul 2018 00:28:39 +0300, Marko Rauhamaa wrote: >>> >>> if your new system used Python3's UTF-32 strings as a foundation, that >>> would be an equally naïve

Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon
> On Jul 16, 2018, at 12:51 PM, Steven D'Aprano > wrote: > >> On Mon, 16 Jul 2018 00:28:39 +0300, Marko Rauhamaa wrote: >> >> if your new system used Python3's UTF-32 strings as a foundation, that >> would be an equally naïve misstep. You'd need to reach a notch higher >> and use glyphs or